Tuesday, April 7, 2015

Assignment 4 - Correlation & Spatial Autocorrelation

Part 1: Correlation

Scatter-plot Results with Trend-line
Pearson Correlation Results in SPSS
1. The null hypothesis is that there is no relationship between distance and sound level. The alternative hypothesis is that there is a relationship between distance and sound level. The Pearson correlation shows a strong negative correlation between the two: as distance increases, sound level decreases. Therefore we reject the null hypothesis.
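As a rough illustration of what the Pearson statistic measures, here is a minimal Python sketch. The distance and sound values are hypothetical numbers made up for the example, not the assignment data, and the function name pearson_r is my own; SPSS did the actual calculation.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical readings: sound level drops as distance grows
distance = [10, 20, 30, 40, 50]
sound_level = [95, 88, 80, 74, 65]

r = pearson_r(distance, sound_level)
print(round(r, 3))
```

A value close to -1 means the points fall near a downward-sloping line, which is what the scatter-plot trend-line shows.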

Results of Correlation Matrix in SPSS

2. This matrix shows several strong correlations between variables. For example, Percent Black has a very strong negative correlation with Bachelors Degree and a strong positive correlation with Below Poverty, while Percent Hispanic has a very strong positive correlation with No High School Diploma. There seems to be a pattern: areas with higher non-white percentages tend to have more people without high school diplomas or bachelors degrees, more people who walk to work, and more people below the poverty level. Meanwhile, areas with higher white percentages tend to have fewer people who walk to work, fewer below the poverty level, and fewer without a high school diploma or bachelors degree.

Part 2: Spatial Autocorrelation

Introduction

This exercise looks at the 1980 and 2008 Presidential Elections in Texas and searches for patterns in the voting for the Texas Election Commission (TEC). We will also compare the results to Hispanic population to see if there is a relationship between it and voter turnout. The TEC wants to see if there is clustering in voting patterns in the state, as well as whether there has been a change over the past 30 years.

Methods

The first step is to download the information I need to analyze. I downloaded the 2010 Hispanic population percentage for the counties of Texas from the U.S. Census website. The voting data was provided by the TEC. I also downloaded a shapefile of the counties in Texas from the U.S. Census website. Next, in ArcGIS I joined the 2010 Census data and the TEC voting data to the Texas counties shapefile so that all three datasets were in one layer. After joining these tables to the counties, I simply exported the data to a new shapefile.

Once I have my shapefile I can open it in GeoDa to perform some spatial autocorrelation analysis. First, I need to create a .gal file from the shapefile by going to Tools--Weights--Create, clicking on Add ID Variable, selecting Poly-ID, and choosing Rook Contiguity under contiguity weight. Make sure to save the file as a .gal file. This will be used in the analysis later.
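To show what a rook contiguity weight means, here is a small Python sketch under simplifying assumptions: instead of county polygons it uses a hypothetical 3x3 grid of square cells, where rook neighbors are the cells that share an edge (not just a corner). The function name rook_neighbors is made up for this example; GeoDa builds the real neighbor lists from the shapefile geometry.

```python
def rook_neighbors(n_rows, n_cols):
    """Map each grid cell ID to the sorted IDs of cells sharing an edge with it."""
    neighbors = {}
    for r in range(n_rows):
        for c in range(n_cols):
            cell = r * n_cols + c
            adjacent = []
            # Rook moves: up, down, left, right (no diagonals)
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    adjacent.append(rr * n_cols + cc)
            neighbors[cell] = sorted(adjacent)
    return neighbors


grid = rook_neighbors(3, 3)
for cell, adjacent in grid.items():
    print(cell, adjacent)
```

The center cell ends up with four neighbors and each corner cell with two. Conceptually, the weights file is this kind of lookup: for each polygon ID, the list of polygons it touches.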

The first analysis is a Moran's I. To run it, click on the Univariate Moran button, select the variable you want to analyze, and select the .gal file created before. I repeated this for all five variables. Next, to create a LISA Cluster Map, click on Univariate LISA, select the variable, select the .gal file from earlier, and check The Cluster Map. Below are the Moran's I and LISA Cluster Map results for each of the five variables.
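For anyone curious what these two tools compute, here is a minimal Python sketch. It assumes row-standardized weights, uses a hypothetical chain of four areas (0-1-2-3) with made-up values, and the function names morans_i and moran_quadrants are my own. It only labels the Moran scatterplot quadrants; the actual LISA Cluster Map also tests each location for significance with permutations, which is left out here.

```python
def morans_i(values, neighbors):
    """Global Moran's I using row-standardized contiguity weights."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]  # deviations from the mean
    cross = 0.0  # weighted sum of neighbor cross-products
    s0 = 0.0     # sum of all weights
    for i, nbrs in neighbors.items():
        if not nbrs:
            continue  # areas with no neighbors contribute nothing
        w = 1.0 / len(nbrs)  # each row of weights sums to 1
        for j in nbrs:
            cross += w * z[i] * z[j]
            s0 += w
    return (n / s0) * cross / sum(d * d for d in z)


def moran_quadrants(values, neighbors):
    """Label each area HH, LL, HL or LH from its value and its neighbors' average."""
    mean = sum(values) / len(values)
    z = [v - mean for v in values]
    labels = {}
    for i, nbrs in neighbors.items():
        if not nbrs:
            labels[i] = None  # no neighbors, so no spatial lag to compare
            continue
        lag = sum(z[j] for j in nbrs) / len(nbrs)  # spatial lag
        labels[i] = ("H" if z[i] >= 0 else "L") + ("H" if lag >= 0 else "L")
    return labels


# Hypothetical chain of four areas: 0-1-2-3
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
clustered = [1, 2, 3, 4]    # similar values sit next to each other
alternating = [1, 4, 1, 4]  # neighbors always differ

print(morans_i(clustered, chain))
print(morans_i(alternating, chain))
print(moran_quadrants(clustered, chain))
```

The clustered values give a positive Moran's I and the alternating values give a negative one. HH and LL areas are the ones that can show up as high and low clusters on the map.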

Results


Moran's I and LISA Cluster Map for the Voter Turnout 1980 variable
Voter turnout in 1980 shows some large clusters. The northern counties form a cluster of very high voter turnout, while the southern and eastern counties form clusters of low voter turnout. Moran's I shows moderately strong positive spatial autocorrelation for this variable.


Moran's I and LISA Cluster Map for the Percent Democratic Vote 1980 variable
The percentage of Democratic votes in 1980 shows even larger clusters. The northwestern counties have clusters of low Democratic vote share, while the southern and eastern counties have clusters of high Democratic vote share. Moran's I shows moderately strong positive spatial autocorrelation for this variable.

Moran's I and LISA Cluster Map for the Voter Turnout 2008 variable
Voter turnout in 2008 shows only small clusters across the counties of Texas. There are clusters of high voter turnout in the northern, northeastern and central counties, as well as a cluster of low voter turnout in the southern counties. Moran's I shows moderately strong positive spatial autocorrelation for this variable.

Moran's I and LISA Cluster Map for the Percent Democratic Vote 2008 variable
The percentage of Democratic votes in Texas in 2008 shows large clusters. There are clusters of low Democratic vote share in the northern, northwestern and central counties, while there are clusters of high Democratic vote share in the southern and southwestern counties. Moran's I shows very strong positive spatial autocorrelation for this variable.

Moran's I and LISA Cluster Map for the Percent Hispanic Population variable
The percentage of Hispanic population in Texas in 2010 shows very large clusters. There is a large cluster of low Hispanic population percentage in the northeastern counties and a large cluster of high Hispanic population percentage in the southwestern counties. Moran's I shows very strong positive spatial autocorrelation for this variable.

Conclusion

There are patterns in the voting in Texas counties in both 1980 and 2008, as well as in the Hispanic population in 2010. In 1980, high voter turnout clusters in the north, where the Democratic vote share is low (Republican-leaning), while low voter turnout clusters in the south and east, where the Democratic vote share is high. In 2008 there is a similar pattern; however, the clusters of high Democratic vote share have moved from the east to the west, with low voter turnout still occurring in the southern counties. One variable that could be behind the east-to-west shift in the Democratic vote is the high Hispanic population in the western counties and the low Hispanic population in the eastern counties.

For the TEC, my answer is that there is clustering in the state's voting patterns, with Democrats in the south and Republicans in the north. There has also been a change in voting patterns over the last 30 years: the Democratic vote has moved from the east to the west, and this seems to be related to the Hispanic population in the western counties.
