Part I
Regression Analysis of Crime Rate and Free Lunches (fig. 1)
Part II
Introduction
For this next exercise we look at data from the UW System to find which variables best explain why students attend the school they do, based on the county they come from. The available variables include the number of people in each county with some college, two years of college, a college degree, or a graduate/professional degree, along with total population, population aged 18-24, and median household income. I focus on two schools that I picked, UW-Eau Claire and UW-Madison, and on three variables: percentage of BS degrees, median household income, and population normalized by distance from the school. Out-of-state students are excluded from this analysis.
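As a rough illustration of the third variable, the sketch below assumes that "population normalized by distance" means county population divided by the distance from the county to the campus. That formula, the function name, and the two example counties are assumptions for illustration, not values from the UW data.

```python
def population_by_distance(population, distance_km):
    """Assumed normalization: county population divided by distance to campus."""
    if distance_km <= 0:
        # The school's home county would need a nonzero distance
        # (for example, a distance to the county centroid).
        raise ValueError("distance must be positive")
    return population / distance_km


# Hypothetical counties: (name, population, distance to campus in km).
counties = [("County A", 100_000, 50.0), ("County B", 100_000, 200.0)]

normalized = {
    name: population_by_distance(pop, dist) for name, pop, dist in counties
}
```

Under this assumption, two counties with the same population get different scores, and the closer one scores higher.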
Methods
The first step is to run a regression analysis on each of the three variables for both schools. In each regression output, a variable is significant if its significance value (p-value) is below .05. First I state the null and alternative hypotheses for each of the three variables at both schools.
Eau Claire student attendance and Population normalized by distance for Eau Claire.
- The null hypothesis is that there is no significant relationship between Eau Claire student attendance and Population normalized by distance for Eau Claire. The alternative hypothesis is that there is a significant relationship between Eau Claire student attendance and Population normalized by distance for Eau Claire.
Eau Claire student attendance and percentage of BS degrees.
- The null hypothesis is that there is no significant relationship between Eau Claire student attendance and percentage of BS degrees. The alternative hypothesis is that there is a significant relationship between Eau Claire student attendance and percentage of BS degrees.
Eau Claire student attendance and median household income.
- The null hypothesis is that there is no significant relationship between Eau Claire student attendance and median household income. The alternative hypothesis is that there is a significant relationship between Eau Claire student attendance and median household income.
Madison student attendance and Population normalized by distance for Madison.
- The null hypothesis is that there is no significant relationship between Madison student attendance and Population normalized by distance for Madison. The alternative hypothesis is that there is a significant relationship between Madison student attendance and Population normalized by distance for Madison.
Madison student attendance and percentage of BS degrees.
- The null hypothesis is that there is no significant relationship between Madison student attendance and percentage of BS degrees. The alternative hypothesis is that there is a significant relationship between Madison student attendance and percentage of BS degrees.
Madison student attendance and median household income.
- The null hypothesis is that there is no significant relationship between Madison student attendance and median household income. The alternative hypothesis is that there is a significant relationship between Madison student attendance and median household income.
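Each hypothesis pair above is tested the same way: fit a simple linear regression, then reject the null hypothesis if the significance value of the slope is below .05. The sketch below shows that decision in plain Python. It is not the software used for the analysis above, and the `x` and `y` values are made-up numbers, not county data. The p-value comes from numerically integrating the t density, which is one of several ways to obtain it.

```python
import math


def t_two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value for a t statistic, via Simpson's rule on the t density."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3  # integral of the density from 0 to |t|
    return max(0.0, 1.0 - 2.0 * area)


def simple_regression(x, y):
    """Least-squares fit of y = intercept + slope * x with R^2 and slope p-value."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": 1 - sse / syy,
        "p_value": t_two_sided_p(slope / se_slope, n - 2),
    }


# Made-up illustration data (not from the UW dataset).
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

result = simple_regression(x, y)
reject_null = result["p_value"] < 0.05  # significance threshold from the Methods
```

Here `reject_null` is true when the data support the alternative hypothesis at the .05 level. The `r_squared` value describes how much of the variation in attendance the variable explains, which is a separate question from whether the relationship is significant.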
Results
I map the residuals only for the variables that were found to be significant; the regression analyses above show that all but one of the variables were significant. To get the residuals, I set each regression to save the standardized residuals before running it and then exported the tables of residuals to ArcGIS.
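For readers who want to see what a standardized residual is, the sketch below computes one per observation, assuming the common definition of the raw residual divided by the standard error of the estimate. The data are made-up numbers, and the function name is only illustrative.

```python
import math


def standardized_residuals(x, y):
    """Residuals from a simple linear fit, divided by the standard error of the estimate."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    std_error = math.sqrt(sum(e * e for e in residuals) / (n - 2))
    return [e / std_error for e in residuals]


# Made-up illustration data; the third observation sits well above the trend.
x = [1, 2, 3, 4, 5]
y = [2, 4, 9, 8, 10]

z = standardized_residuals(x, y)
largest = max(range(len(z)), key=lambda i: z[i])  # index of the largest positive residual
```

On a map, a county with a large positive standardized residual sends more students than the regression predicts, and a large negative value means fewer than predicted.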
Eau Claire student attendance and Population normalized by distance for Eau Claire. R2=.753
This variable had a significance value of .000 (p < .001) and shows a pattern of counties with large population centers having large attendance at UW-Eau Claire. (fig. 8)
Eau Claire student attendance and percentage of BS degrees. R2=.121
This variable had a significance value of .003 and shows a pattern of counties with high percentages of BS degrees having large attendance at UW-Eau Claire. (fig. 9)
For Madison, population normalized by distance had a significance value of .000 (p < .001) and shows a pattern of the heavily populated Milwaukee area having large attendance at UW-Madison. (fig. 10)
For Madison, percentage of BS degrees had a significance value of .000 (p < .001) and shows a pattern of counties with high percentages of BS degrees having large attendance at UW-Madison. (fig. 11)
Discussion & Conclusion
The results for Eau Claire show that two variables, population normalized by distance to Eau Claire and percentage of BS degrees, are significantly related to the number of students from each county who attend UW-Eau Claire. Students who go to UW-Eau Claire are therefore more likely to come from populated areas and from areas where many people hold BS degrees. Median household income does not have a significant relationship with attendance, so county income does not appear to influence whether students attend UW-Eau Claire. The R2 for percentage of BS degrees is low (.121), however, so that variable has only a weak relationship with attendance. Population normalized by distance to Eau Claire has a high R2 (.753) and a strong relationship with attendance, making it the most influential of the predictors I looked at for why students choose UW-Eau Claire.
The results for Madison show that all three variables, population normalized by distance to Madison, percentage of BS degrees, and median household income, are significantly related to the number of students from each county who attend UW-Madison. Students who go to UW-Madison are therefore more likely to come from populated areas, from areas where many people hold BS degrees, and from areas with high median household income. The R2 values for percentage of BS degrees and median household income are low, so both variables have weak relationships with attendance. Population normalized by distance to Madison has a very high R2 and a very strong relationship with attendance, making it the most influential of the predictors I looked at for why students choose UW-Madison.