Data Analysis Project Part 2

Part 2 (I have attached the Part 1 paper I turned in, along with the Excel sheet, to help.)
— Remember to comment on the findings of every analysis; do not just paste in charts.
– Code each categorical variable that has a natural order as a numeric variable with values 0, 1, 2, and so on.
— Randomly select 700 of the 1000 records to use as your training data for fitting predictive models, and use the other 300 records as test data to check how well the models fit to the training data generalize.
— Check for relationships between the text values (also known as factor levels) of the remaining categorical variables and this week's response variable of interest, price. If any of these factor levels seem to be related to price, add a new 0/1 column, as shown in the video, to indicate whether each record has that characteristic.
— Having done that, create a correlation matrix of all of these numeric variables.
— Select the variables that are at least somewhat correlated with price as the input X variables for your linear regression model. If any of the input X variables are very highly correlated with each other, consider using only one of them in the model. If a variable has only a modest correlation with price, say in the 0.1 to 0.2 range, but is not correlated with any other variable, it represents unique information and might be worth including in the model.
— After you decide which X variables to include, rearrange your columns so that the chosen X variables are all next to each other. Then run the regression in the Data Analysis add-in, selecting the ‘Residuals’ option so that the residuals are printed along with the regression results.
— Compute the mean absolute error (MAE) to see how accurate your model is on the training data. Then, as shown in the video, apply the same model equation to the test data and determine how well the model fits new data that was not used to fit it.
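
The assignment itself is done in Excel, but the steps above can be sketched in code to make the order of operations concrete. The Python below is only an illustration under stated assumptions: the data are synthetic, and the column names (condition, color, size), the level ordering, and the choice of "red" as the factor level related to price are all invented for the example and do not come from the project sheet.

```python
import random

random.seed(0)  # fixed seed so the random split is repeatable

# Hypothetical ordered categorical variable and its 0, 1, 2 coding.
CONDITION_ORDER = {"poor": 0, "fair": 1, "good": 2}

def make_row():
    """Build one synthetic record; stands in for a row of the real sheet."""
    condition = random.choice(list(CONDITION_ORDER))
    color = random.choice(["red", "blue", "white"])
    size = random.uniform(50, 150)
    price = (20 + 2.0 * size + 15 * CONDITION_ORDER[condition]
             + (30 if color == "red" else 0) + random.gauss(0, 10))
    return {"condition": condition, "color": color, "size": size, "price": price}

rows = [make_row() for _ in range(1000)]

# Code the ordered categorical as 0, 1, 2 and add a 0/1 indicator column
# for the one unordered factor level assumed to be related to price.
for row in rows:
    row["condition_code"] = CONDITION_ORDER[row["condition"]]
    row["is_red"] = 1 if row["color"] == "red" else 0

# Random 700/300 split into training and test data.
random.shuffle(rows)
train, test = rows[:700], rows[700:]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Correlation matrix of all numeric columns, computed on the training data.
features = ["size", "condition_code", "is_red"]
columns = features + ["price"]
corr = {a: {b: pearson([r[a] for r in train], [r[b] for r in train])
            for b in columns} for a in columns}

def fit_ols(data, features):
    """Least-squares fit via the normal equations (X'X)b = X'y,
    solved by Gauss-Jordan elimination. Returns [intercept, slopes...]."""
    X = [[1.0] + [r[f] for f in features] for r in data]
    y = [r["price"] for r in data]
    k = len(features) + 1
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda i: abs(A[i][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for i in range(k):
            if i != col:
                factor = A[i][col] / A[col][col]
                A[i] = [a - factor * b for a, b in zip(A[i], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict(coefs, row, features):
    """Apply the fitted model equation to one record."""
    return coefs[0] + sum(c * row[f] for c, f in zip(coefs[1:], features))

def mae(coefs, data, features):
    """Mean absolute error: the average of |actual price - predicted price|."""
    errors = [abs(r["price"] - predict(coefs, r, features)) for r in data]
    return sum(errors) / len(errors)

coefs = fit_ols(train, features)
train_mae = mae(coefs, train, features)
test_mae = mae(coefs, test, features)  # same equation, unseen records

for name in features:
    print(f"corr({name}, price) = {corr[name]['price']:.3f}")
print(f"training MAE = {train_mae:.2f}, test MAE = {test_mae:.2f}")
```

If the test MAE is close to the training MAE, the model generalizes reasonably well to new records; a much larger test MAE suggests the model is fit too closely to the training data.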

Criteria
— The paper meets basic writing standards, including grammar, usage, spelling, punctuation, and organization.
— The document includes all required analyses.
– The paper includes a discussion of the findings from the analyses performed, presented in a well-organized manner.

Notes from Part 1 (attached)
Comments
– I don’t see pivot tables, and your scatter plots shouldn’t have lines. It’s also difficult to understand the charts without reading the paragraphs; in future reports, please include clear titles and axis labels so that the reader can easily understand what they’re looking at.
