Friday, March 13, 2015

Assignment 3 - Z-tests and T-tests

Part I

1. Fill in the chart:

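| Row | Interval Type (tails) | Confidence Level (%) | N | α | Z or t? | Z or t value |
|-----|-----------------------|----------------------|-----|-------|---------|--------------|
| A | 2 | 90 | 45 | 0.05 | Z | +/- 1.64 |
| B | 2 | 95 | 12 | 0.025 | T | +/- 2.179 |
| C | 1 | 95 | 36 | 0.05 | Z | 1.64 |
| D | 2 | 99 | 180 | 0.005 | Z | +/- 2.57 |
| E | 1 | 80 | 60 | 0.2 | Z | 0.84 |
| F | 1 | 99 | 23 | 0.01 | T | 2.5 |
| G | 2 | 99 | 15 | 0.005 | T | +/- 2.947 |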
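
The Z rows of the chart can be checked with a short sketch. This is only an illustration, not how the chart was originally filled in: it assumes Python's standard library `statistics.NormalDist`, and the helper name `z_critical` is made up for this example. The area placed in each tail matches the α column above. The T rows are left out because the standard library has no t-distribution.

```python
from statistics import NormalDist

def z_critical(confidence, tails):
    """Positive critical z value for a confidence level (0-1) and 1 or 2 tails.

    Hypothetical helper for illustration only.
    """
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    alpha = 1.0 - confidence
    tail_area = alpha / tails  # area in each rejection tail
    return NormalDist().inv_cdf(1.0 - tail_area)

# Rows A, C, D and E of the chart (the rows that use Z).
for label, tails, confidence in [("A", 2, 0.90), ("C", 1, 0.95),
                                 ("D", 2, 0.99), ("E", 1, 0.80)]:
    print(label, round(z_critical(confidence, tails), 3))
```

The printed values should agree with the chart up to rounding (for example, 1.645 against the chart's 1.64).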

2. The null hypothesis is that there is no significant difference between the sample mean and the estimated mean. The alternative hypothesis is that there is a significant difference between them. I performed a z-test for each species because the sample size was larger than 30. I used a 95% confidence level and a two-tailed test, giving a significance level of 0.05 (0.025 in each tail). For each of the invasive species the conclusion was the same: we reject the null hypothesis. To fail to reject it, the z-score had to fall between -1.96 and +1.96, and all three scores fell outside that range. From these results I conclude that the Asian-Long Horned Beetle is less common in this county than in the rest of the state, while the Emerald Ash Borer Beetle and Golden Nematode are more common in Buck County than in the rest of the state.


| Species | z-test score |
|---------|--------------|
| Asian-Long Horned Beetle | -7.749 |
| Emerald Ash Borer Beetle | 9.247 |
| Golden Nematode | 2.477 |
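
The decision rule for these three scores can be sketched as follows. This is a minimal illustration using only the z-scores reported above and Python's standard library. The function name `two_tailed_decision` is an assumption of this sketch, and the two-tailed p-values it prints were not part of the original assignment.

```python
from statistics import NormalDist

def two_tailed_decision(z, confidence=0.95):
    """Return (reject_null, p_value) for a two-tailed z-test.

    Hypothetical helper for illustration only.
    """
    standard_normal = NormalDist()
    # Critical value with half of alpha in each tail (about 1.96 at 95%).
    critical = standard_normal.inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    # Probability of a score at least this far from zero in either direction.
    p_value = 2.0 * (1.0 - standard_normal.cdf(abs(z)))
    return abs(z) > critical, p_value

# z-scores copied from the table above.
scores = {
    "Asian-Long Horned Beetle": -7.749,
    "Emerald Ash Borer Beetle": 9.247,
    "Golden Nematode": 2.477,
}

for species, z in scores.items():
    reject, p_value = two_tailed_decision(z)
    side = "above" if z > 0 else "below"
    print(f"{species}: z = {z:+.3f} ({side} the estimated mean), "
          f"reject null = {reject}, p = {p_value:.4f}")
```

The sign of each score gives the direction of the difference, which is what supports the "less common" and "more common" conclusions.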

3. The null hypothesis is that the number of people per party did not change between 1960 and 1985. The alternative hypothesis is that the number of people per party did change over that period. I used a t-test because the sample size was smaller than 30. I used a 95% confidence level and a one-tailed test, giving a significance level of 0.05. The critical value was 1.708 and the t-test score was 4.924, which exceeds it, so we reject the null hypothesis. The corresponding p-value, the probability of seeing a result at least this extreme if the null hypothesis were true, is 0.000%.
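
For reference, a one-sample t-score of this kind can be computed as sketched below. The sample values and the hypothesized mean are invented placeholders, since the original party-size data is not reproduced here. Only the critical value 1.708 and the score 4.924 come from the write-up, and the function names are assumptions of this sketch.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """t = (sample mean - hypothesized mean) / (s / sqrt(n))."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev uses the n - 1 denominator
    return (mean(sample) - mu0) / standard_error

T_CRITICAL = 1.708  # one-tailed critical value quoted in the write-up

def reject_null_one_tailed(t, critical=T_CRITICAL):
    """Upper-tail test: reject only when t exceeds the critical value."""
    return t > critical

# Hypothetical party sizes and hypothesized mean, for illustration only.
hypothetical_sample = [3, 4, 4, 5, 3, 6, 4, 5]
hypothetical_mu0 = 3.5
print(round(one_sample_t(hypothetical_sample, hypothetical_mu0), 3))

# The score reported in the write-up, checked against the critical value.
print(reject_null_one_tailed(4.924))
```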

Part II

Introduction

The second part of this assignment was a write-up based on Wisconsin research into the concept of "Up-North". The Tourism Board of Wisconsin wants me to find the variables that separate "Up-North" from the bottom half of Wisconsin. To do this, I performed a chi-squared test on each of three variables, comparing the northern counties to the southern counties to see whether those variables are what separate the north from the south.

Methods 

To achieve this I first downloaded a shapefile of the counties in the state of Wisconsin from the U.S. Census website. To determine which counties were in the north, I used Hwy. 29 as the dividing line. In a newly made field, I gave the counties above Hwy. 29 a value of 1 and the counties below it a value of 2 (see figure 1).

Map showing the "Up North" counties separate from the "Down South" counties. (fig. 1)

I then joined the counties feature class to a table of information on Wisconsin counties, including attributes that might be more common in "Up-North" counties. The fields that I chose to look at were deer gun licences, population and wilderness. For wilderness and deer gun licences, larger numbers were expected in the north; for population, lower numbers were expected in the north. For each variable I ranked the counties from 1 to 4, with the higher ranks assigned to the hypothesized "Up-North" values. The following maps show the three variables, each with four rankings (see figures 2-4).

Map of the number of deer gun licences in Wisconsin counties. (fig. 2)

Map of the population in Wisconsin counties. (fig. 3)
  
Map of the number of acres of wilderness in Wisconsin counties. (fig. 4)

I then exported the table with the population, wilderness and deer gun licence fields in order to run chi-squared tests and see whether these variables are associated with being "Up-North". I used IBM SPSS Statistics 19 to run the chi-squared tests on the data. The chi-squared test for each of the variables is shown below (see figures 5-7). I discuss the meaning of these variables and the chi-squared test outcomes in the Results & Discussion section.

Chi-squared test for the variable of deer gun licences. (fig. 5)
Chi-squared test for the variable of wilderness acreage. (fig. 6)
Chi-squared test for the variable of population. (fig. 7)

Results & Discussion


I chose the variable of population because I thought that fewer people would be living in the north. I chose the variable of deer gun licences because I thought that more people would have them in the north. Lastly, I chose the variable of wilderness because I thought that the most wilderness acreage would be in the north.

The null hypothesis is that there is no difference between the expected and observed frequencies of each variable in the north, meaning that each variable is just as likely to occur in the north as in the south. The alternative hypothesis is that there is a difference between the expected and observed frequencies, meaning that the variables are more likely to occur in the north than in the south. For wilderness and deer gun licences we fail to reject the null hypothesis because the chi-squared test value is smaller than the critical value (CV) for both of them. For population we reject the null hypothesis because the chi-squared test value is larger than the CV.
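
To make the comparison against the CV concrete, here is a minimal sketch of how a chi-squared statistic is computed from a table of region by rank. The tests in this write-up were run in SPSS, so this is not the original procedure. The counts are invented placeholders, not the values in figures 5-7, and the function name is made up. The sketch assumes a significance level of 0.05, for which the critical value at 3 degrees of freedom is 7.815.

```python
def chi_squared(observed):
    """Return (statistic, degrees_of_freedom) for a table of observed counts.

    Assumes every row and column total is positive, so that no expected
    count is zero.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if region and rank were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (count - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df

# Hypothetical counts: rows are north and south, columns are ranks 1-4.
observed = [
    [10, 9, 6, 4],   # north (invented numbers)
    [8, 9, 12, 14],  # south (invented numbers)
]

CRITICAL_VALUE = 7.815  # chi-squared CV for df = 3 at alpha = 0.05 (assumed level)

statistic, df = chi_squared(observed)
print(f"chi-squared = {statistic:.3f}, df = {df}")
print("reject null" if statistic > CRITICAL_VALUE else "fail to reject null")
```

With two regions and four ranks the table has (2 - 1) x (4 - 1) = 3 degrees of freedom, which is why a single critical value applies to all three variables.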

Conclusion

The maps and chi-squared tests show no sign that deer gun licences and wilderness are more frequent in the north. They do, however, show an association between low population and being in the north. I do not think that the first two variables are good at explaining what makes the north what it is, but low population could be a factor in the definition of what "Up-North" really means.

Sources:
All data provided by Prof. Weichelt.