Explain the Data set
Visualize the Data
Visualize and analyze the results
State which algorithms you are going to use in this research
Explain the code using R Studio
The paper should be 12 pages long
15 PPT slides explaining the project
APA format with scholarly references and citations
- Marks secured by the students
- This data set consists of the marks secured by the students in various subjects.
- Understand the influence of the parents' background, test preparation, etc. on student performance.
- Using the students' test results, create a fictitious pass/fail variable.
- Predict whether a student passes or fails using any classification method.
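The pass/fail variable described above could be derived along the following lines. This is only a minimal sketch in base R: the data frame is a made-up stand-in, and the column names (`math_score`, `reading_score`, `writing_score`, `test_prep`), the averaging rule, and the cutoff of 50 are all assumptions, since the brief does not define what counts as a pass.

```r
# Toy stand-in for the student marks data. Every column name and value here
# is made up for illustration and is not taken from the real data set.
students <- data.frame(
  test_prep     = c("none", "completed", "none", "completed", "none", "completed"),
  math_score    = c(72, 69, 35, 88, 47, 51),
  reading_score = c(70, 90, 42, 95, 38, 60),
  writing_score = c(74, 88, 39, 92, 41, 58),
  stringsAsFactors = FALSE
)

# Assumed rule: a student passes when the mean of the three scores reaches
# the cutoff. The value 50 is a hypothetical threshold, not from the brief.
pass_mark  <- 50
score_cols <- c("math_score", "reading_score", "writing_score")

students$avg_score <- rowMeans(students[, score_cols])
students$result <- factor(
  ifelse(students$avg_score >= pass_mark, "pass", "fail"),
  levels = c("fail", "pass")
)

# Count fails and passes, then cross-tabulate against test preparation
print(table(students$result))
print(table(students$test_prep, students$result))
```

A different rule (for example, requiring every subject to clear the cutoff) would change the class balance, so the chosen rule is worth stating and justifying in the paper.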
Check under the Kernels tab
Using the link above, follow the same steps:
Reading the dataset
Visualizing the dataset: please use visualization 1, 2, 3, or 4 as an example.
You can do just one visualization and explain why you chose only that visualization, based on factors such as ethnicity, parental education, test prep course, etc.
Then summarize the visualization you did
Plot graphs for each visualization
You may choose a bar plot, box plot, histogram, or scatter plot, whichever is most easily understood by the user
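As one illustration of the visualization steps above, the sketch below draws a box plot and a bar plot of a score grouped by one factor, using base R graphics only. The data are simulated, and the column names `test_prep` and `math_score`, the group labels, and the score distributions are assumptions made so the example runs on its own.

```r
set.seed(1)  # make the simulated data reproducible
n <- 200

# Simulated stand-in for the real data; names and distributions are assumptions.
students <- data.frame(
  test_prep = sample(c("none", "completed"), n, replace = TRUE)
)
group_mean <- ifelse(students$test_prep == "completed", 70, 63)
raw_score  <- rnorm(n, mean = group_mean, sd = 14)
students$math_score <- round(pmin(100, pmax(0, raw_score)))  # clamp to 0-100

# Box plot: shows median, spread, and outliers for each group
boxplot(math_score ~ test_prep, data = students,
        main = "Math score by test preparation course",
        xlab = "Test preparation course", ylab = "Math score",
        col  = c("steelblue", "grey80"))

# Bar plot: compares the mean score of each group
group_means <- tapply(students$math_score, students$test_prep, mean)
barplot(group_means,
        main = "Mean math score by test preparation course",
        xlab = "Test preparation course", ylab = "Mean math score",
        ylim = c(0, 100))

print(round(group_means, 1))
```

A box plot tends to suit a question such as "does the test prep course shift scores?" because it shows the whole distribution per group, while the bar plot only shows the averages.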
The next step is to build a model using the visualization
You can use a decision tree/random forest, linear regression, or the k-means algorithm, whichever gives the best accuracy for the model
Explain why you would choose that particular algorithm and not the others
Explain the confusion matrix for each of those
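The modelling and confusion matrix steps could look roughly like the sketch below, which fits a decision tree with the `rpart` package (a recommended package bundled with most R installations) and evaluates it on a held-out split. The data are simulated: the column names `test_prep` and `parent_edu`, the rule that generates the labels, and the 70/30 split are assumptions for illustration, not the method used in the referenced kernels.

```r
library(rpart)  # decision trees; assumed to be installed

set.seed(42)
n <- 300

# Simulated stand-in for the real data; all names and effects are made up.
students <- data.frame(
  test_prep  = factor(sample(c("none", "completed"), n, replace = TRUE)),
  parent_edu = factor(sample(c("high school", "college", "masters"), n,
                             replace = TRUE))
)
latent <- 50 +
  12 * (students$test_prep == "completed") +
  6  * (students$parent_edu == "college") +
  12 * (students$parent_edu == "masters") +
  rnorm(n, sd = 10)
students$result <- factor(ifelse(latent >= 60, "pass", "fail"),
                          levels = c("fail", "pass"))

# Hold out 30% of the rows so the model is scored on data it has not seen
train_idx <- sample(seq_len(n), size = round(0.7 * n))
train <- students[train_idx, ]
test  <- students[-train_idx, ]

# Fit a classification tree and predict the class of each held-out student
tree <- rpart(result ~ test_prep + parent_edu, data = train, method = "class")
pred <- predict(tree, newdata = test, type = "class")

# Confusion matrix: rows are predictions, columns are the true labels
conf_mat <- table(Predicted = pred, Actual = test$result)
print(conf_mat)

# Accuracy is the share of predictions on the diagonal
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(round(accuracy, 3))
```

In the matrix, the diagonal cells count correct predictions and the off-diagonal cells count students predicted to pass who failed, and the reverse. A confusion matrix applies directly to classifiers such as a decision tree or random forest; the output of linear regression or k-means would first have to be mapped to pass/fail labels before one could be built.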
The final aim is that this dataset should be easily understood by anyone who sees the plots and be useful for business
Make sure you go through the link below and refer to the kernels section for examples, then build the visualization and model on your own
Please perform each and every step including reading the dataset
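The reading step can be sketched as follows. Because the brief does not name the data file, the example writes a small made-up CSV to a temporary file and reads it back; the header names and rows are hypothetical, and in practice `csv_path` would point at the downloaded file instead.

```r
# Write a three-row stand-in CSV so the sketch runs on its own.
# The header names and values are made up for illustration.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "gender,test preparation course,math score",
  "female,none,72",
  "male,completed,69",
  "female,none,"
), csv_path)

# read.csv replaces spaces in header names with dots by default,
# so "math score" becomes math.score
students <- read.csv(csv_path, stringsAsFactors = TRUE)

# Inspect the structure, a summary of each column, and missing values
str(students)
print(summary(students))
print(colSums(is.na(students)))
```

Checking column types and missing values right after reading helps catch problems, such as a blank score, before any plotting or modelling.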