MG 315 PU Unit 8 Statistics Regression Analysis Project Report

Select any city in the US and choose 40-45 houses that are for sale in that city. Any listing website can be used, for example zillow.com or realtor.com. Create an Excel spreadsheet of homes within a chosen price range (for example, $100K to $500K), with the dependent variable = price of the home and the independent variables = square footage (SQFT), number of bedrooms, number of bathrooms, and age.

1) In the introductory paragraph, state why the dependent variable has been chosen for analysis. Then make a general statement about the model:

“The dependent variable _Price of the home__ is determined by variables _SQFT__, _number of bathrooms__, _number of bedrooms__, and _age__.”

2) In the second paragraph, identify the primary independent variable and defend why it is important.

“The most important variable in this analysis is ________ because _________.” In this paragraph, cite and discuss the two research sources that support the thesis, i.e., the model.

3) Write the general form of the regression model (less intercept and coefficients), with the variables named appropriately so the reader can identify each variable at a glance:

Dep_Var = Ind_Var_1 + Ind_Var_2 + Ind_Var_3

For instance, a typical model would be written:

Price_of_Home = Square_Footage + Number_Bedrooms + Age

Where

Price_of_Home: brief definition of dependent variable

Square_Footage: brief definition of first independent variable

Number_Bedrooms: brief definition of second independent variable

Age: brief definition of third independent variable

4) Define and defend all variables, including the dependent variable, in a single paragraph for each variable. Also, state the expectations for each independent variable. These paragraphs should be in numerical order, i.e., dependent variable, X1, then X2, etc.

In each paragraph, the following should be addressed:

- How is the variable defined in the data source?
- Which unit of measurement is used?
- For the independent variables: why does the variable determine Y?
- What sign is expected for the independent variable’s coefficient, positive or negative? Why?

5) In one paragraph, describe the data and identify the data sources.

- From which general sources and from which specific tables are the data taken? (Citing a website is not acceptable.)
- In which year or years were the data collected?
- Are there any data limitations?

6) Write the regression (prediction) equation:

Dep_Var = Intercept + c1 * Ind_Var_1 + c2 * Ind_Var_2 + c3 * Ind_Var_3
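
As an illustration of how the intercept and coefficients in this equation can be estimated outside Excel, here is a minimal Python sketch using NumPy least squares. The eight listings are made up for the example (the prices are generated from a known equation so the fit can be checked), and the names fit_ols and Number_Bathrooms are assumptions, not part of the assignment.

```python
import numpy as np

# Hypothetical listings (NOT real data). Columns:
# Square_Footage, Number_Bedrooms, Number_Bathrooms, Age (years).
X = np.array([
    [1200, 2, 1, 30],
    [1500, 3, 2, 20],
    [1800, 3, 2, 15],
    [2000, 4, 3, 10],
    [2400, 4, 2, 5],
    [2800, 5, 3, 2],
    [1000, 2, 1, 40],
    [1600, 3, 1, 25],
], dtype=float)

# Prices built from a known equation (no noise) so the estimates can be verified.
true_coefs = np.array([50000.0, 100.0, 5000.0, 10000.0, -500.0])
y = true_coefs[0] + X @ true_coefs[1:]


def fit_ols(X, y):
    """Return [Intercept, c1, c2, ...] estimated by ordinary least squares."""
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(X)), X])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs


coefs = fit_ols(X, y)
names = ["Square_Footage", "Number_Bedrooms", "Number_Bathrooms", "Age"]
terms = " + ".join(f"{c:.2f} * {n}" for c, n in zip(coefs[1:], names))
print(f"Price_of_Home = {coefs[0]:.2f} + {terms}")
```

With real listing data the fit will not be exact, and the printed coefficients should agree with the Coefficients column of the spreadsheet's regression output.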

7) Identify and interpret the adjusted R² (one paragraph):

- Define “adjusted R².”
- What does the value of the adjusted R² reveal about the model?
- If the adjusted R² is low, how has the choice of independent variables created this result?
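
For reference, adjusted R-squared is usually computed from the ordinary R-squared, the number of observations n, and the number of independent variables k, so that adding a variable only raises it when the variable improves the fit enough to offset the lost degree of freedom. The sketch below shows that calculation; the function name adjusted_r2 and the input values are hypothetical and chosen only for illustration.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k independent variables."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than estimated coefficients")
    # Penalize R-squared for each independent variable in the model.
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical example: R-squared of 0.80 from 40 houses and 4 independent variables.
print(round(adjusted_r2(0.80, 40, 4), 4))  # 0.7771
```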

8) Identify and interpret the F test (one paragraph):

- Using the p-value approach, is the null hypothesis for the F test rejected or not rejected? Why or why not?
- Interpret the implications of these findings for the model.
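
The null hypothesis of the overall F test is that all slope coefficients are zero, meaning the independent variables jointly explain nothing. A minimal sketch of the calculation and the p-value decision rule follows. The function names, the 0.05 significance level, and the example numbers are assumptions for illustration; the p-value itself would be read from the regression output (Excel labels it "Significance F").

```python
def f_statistic(r2, n, k):
    """Overall F statistic from R-squared, n observations, k independent variables."""
    explained_per_variable = r2 / k
    unexplained_per_dof = (1 - r2) / (n - k - 1)
    return explained_per_variable / unexplained_per_dof


def reject_null(p_value, alpha=0.05):
    """p-value approach: reject the null hypothesis when p-value < alpha."""
    return p_value < alpha


# Hypothetical example: R-squared of 0.80, 40 houses, 4 independent variables.
print(f_statistic(0.80, 40, 4))

# Hypothetical p-value copied from a regression output.
print(reject_null(0.0001))
```

Rejecting the null hypothesis here supports the claim that the model as a whole has explanatory power; it says nothing about which individual variable is responsible.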

9) Identify and interpret the t tests for each of the coefficients (one separate paragraph for each variable, in numerical order):

- Are the signs of the coefficients as expected? If not, why not?
- For each of the coefficients, interpret the numerical value.
- Using the p-value approach, is the null hypothesis for the t test rejected or not rejected for each coefficient? Why or why not?
- Interpret the implications of these findings for the variable.
- Identify the variable with the greatest significance.
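
To show where each t statistic comes from, the sketch below computes the coefficient estimates, their standard errors, and the t statistics (coefficient divided by standard error) with NumPy. The function name t_statistics is an assumption, and the data are a tiny made-up set with one independent variable so the result can be checked by hand; the same function accepts several columns. The p-values are not computed here and would be read from the regression output.

```python
import numpy as np


def t_statistics(X, y):
    """Return (coefficients, standard errors, t statistics), intercept first."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coefs
    # Residual variance uses n - k - 1 degrees of freedom.
    sigma2 = residuals @ residuals / (n - k - 1)
    covariance = sigma2 * np.linalg.inv(design.T @ design)
    std_errors = np.sqrt(np.diag(covariance))
    return coefs, std_errors, coefs / std_errors


# Made-up data (NOT real listings): one independent variable, five observations.
X = [[1], [2], [3], [4], [5]]
y = [2, 4, 5, 4, 5]

coefs, std_errors, t_stats = t_statistics(X, y)
for label, c, se, t in zip(["Intercept", "Ind_Var_1"], coefs, std_errors, t_stats):
    print(f"{label}: coefficient={c:.3f}, std error={se:.3f}, t={t:.3f}")
```

For a given number of degrees of freedom, the coefficient with the largest absolute t statistic has the smallest p-value, which is one way to identify the variable with the greatest significance.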

10) Analyze multicollinearity of the independent variables (one paragraph):

- Generate the correlation matrix.
- Define multicollinearity.
- Are any of the independent variables highly correlated with each other? If so, identify the variables and explain why they are correlated.
- State the implications of multicollinearity (if found) for the model.
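
Multicollinearity means that two or more independent variables are strongly correlated with each other, which makes their individual coefficients hard to estimate precisely. A correlation matrix of the independent variables can be sketched as below. The listings are made up, the helper name high_correlations and the variable name Number_Bathrooms are assumptions, and the 0.8 cutoff is a common rule of thumb rather than a requirement of the assignment.

```python
import numpy as np

names = ["Square_Footage", "Number_Bedrooms", "Number_Bathrooms", "Age"]

# Hypothetical listings (NOT real data), one row per house.
X = np.array([
    [1200, 2, 1, 30],
    [1500, 3, 2, 20],
    [1800, 3, 2, 15],
    [2000, 4, 3, 10],
    [2400, 4, 2, 5],
    [2800, 5, 3, 2],
    [1000, 2, 1, 40],
    [1600, 3, 1, 25],
], dtype=float)

# rowvar=False treats each column as a variable and each row as an observation.
corr = np.corrcoef(X, rowvar=False)


def high_correlations(corr, names, threshold=0.8):
    """List variable pairs whose absolute correlation meets the threshold."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) >= threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs


flagged = high_correlations(corr, names)
print(np.round(corr, 2))
for first, second, r in flagged:
    print(f"{first} and {second}: r = {r:.2f}")
```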

