AppleOrAndroid | College |
---|---|
Apple | Business |
Apple | CFAC |
Apple | Business |
Apple | CMS |
Apple | Engineering |
Research Question: Is there a relationship between which college someone is in and whether they use an Apple or Android phone?
Population: All BYU students.
Parameter of Interest:
Sample: A convenience sample of 1727 BYU students who are in my class and completed the student survey.
Are there any issues with this study setup?
Response Variable (y):
Explanatory Variable (x):
Main goal: Examine the RELATIONSHIP between College and Phone.
College | Apple | Android | Sum |
---|---|---|---|
Business | 557 | 51 | 608 |
CFAC | 119 | 19 | 138 |
CMS | 140 | 37 | 177 |
Education | 95 | 16 | 111 |
Engineering | 84 | 19 | 103 |
FHSS | 91 | 16 | 107 |
Humanities | 23 | 8 | 31 |
Life | 329 | 39 | 368 |
Nursing | 77 | 7 | 84 |
Sum | 1515 | 212 | 1727 |
Main Idea: Convert counts to proportions to account for differences in count sizes
Conditional Distribution of Row Variable given Column Variable:
Conditional Distribution of Column Variable given Row Variable:
Marginal Distribution of Column (or Row)
A relationship between the variables is probably present if the conditional distributions differ from the marginal distribution.
College | Apple | Android |
---|---|---|
Business | 0.916 | 0.084 |
CFAC | 0.862 | 0.138 |
CMS | 0.791 | 0.209 |
Education | 0.856 | 0.144 |
Engineering | 0.816 | 0.184 |
FHSS | 0.850 | 0.150 |
Humanities | 0.742 | 0.258 |
Life | 0.894 | 0.106 |
Nursing | 0.917 | 0.083 |
Margin (Overall) | 0.877 | 0.123 |
College | Apple | Android | Margin (Overall) |
---|---|---|---|
Business | 0.368 | 0.241 | 0.352 |
CFAC | 0.079 | 0.090 | 0.080 |
CMS | 0.092 | 0.175 | 0.102 |
Education | 0.063 | 0.075 | 0.064 |
Engineering | 0.055 | 0.090 | 0.060 |
FHSS | 0.060 | 0.075 | 0.062 |
Humanities | 0.015 | 0.038 | 0.018 |
Life | 0.217 | 0.184 | 0.213 |
Nursing | 0.051 | 0.033 | 0.049 |
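The two proportion tables above can be reproduced from the counts with a few lines of code. The following is a minimal sketch in plain Python; the variable names (`counts`, `phone_given_college`, and so on) are illustrative choices, not part of the original material.

```python
# Observed counts from the College-by-phone table: (Apple, Android) per college.
counts = {
    "Business": (557, 51),
    "CFAC": (119, 19),
    "CMS": (140, 37),
    "Education": (95, 16),
    "Engineering": (84, 19),
    "FHSS": (91, 16),
    "Humanities": (23, 8),
    "Life": (329, 39),
    "Nursing": (77, 7),
}

n = sum(a + b for a, b in counts.values())  # grand total
col_totals = [sum(row[j] for row in counts.values()) for j in range(2)]

# Conditional distribution of phone given college: divide each row by its row total.
phone_given_college = {
    college: tuple(x / sum(row) for x in row) for college, row in counts.items()
}

# Conditional distribution of college given phone: divide each column by its column total.
college_given_phone = {
    college: tuple(row[j] / col_totals[j] for j in range(2))
    for college, row in counts.items()
}

# Marginal distributions: row or column totals divided by the grand total.
phone_marginal = tuple(t / n for t in col_totals)
college_marginal = {college: sum(row) / n for college, row in counts.items()}

for college in counts:
    apple, android = phone_given_college[college]
    print(f"{college:<12} {apple:.3f} {android:.3f}")
print(f"{'Overall':<12} {phone_marginal[0]:.3f} {phone_marginal[1]:.3f}")
```

Comparing each row of `phone_given_college` with `phone_marginal` is the informal check described above: rows that sit far from the overall split hint at a relationship.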
Distraction | 15-19 | 20-29 | 30-39 | 40-49 | 50-59 | 60-69 | 70+ | Sum |
---|---|---|---|---|---|---|---|---|
Cell Phone Distracted | 58 | 159 | 96 | 69 | 48 | 21 | 5 | 456 |
Not Distracted | 2962 | 11278 | 8382 | 7328 | 7482 | 5282 | 4341 | 47055 |
Other Distracted | 303 | 898 | 586 | 400 | 415 | 288 | 282 | 3172 |
Sum | 3323 | 12335 | 9064 | 7797 | 7945 | 5591 | 4628 | 50683 |
Of those cell phone distracted drivers, what proportion are 15-19?
Is this a conditional or marginal proportion?
What is the conditional distribution of age for those who are cell phone distracted?
Distraction | 15-19 | 20-29 | 30-39 | 40-49 | 50-59 | 60-69 | 70+ | Sum |
---|---|---|---|---|---|---|---|---|
Cell Phone Distracted | 0.127 | 0.349 | 0.211 | 0.151 | 0.105 | 0.046 | 0.011 | 1 |
Not Distracted | 0.063 | 0.240 | 0.178 | 0.156 | 0.159 | 0.112 | 0.092 | 1 |
Other Distracted | 0.096 | 0.283 | 0.185 | 0.126 | 0.131 | 0.091 | 0.089 | 1 |
Margin (Overall) | 0.066 | 0.243 | 0.179 | 0.154 | 0.157 | 0.110 | 0.091 | 1 |
What is the conditional distribution of distracted for those aged 20-29?
Distraction | 20-29 |
---|---|
Cell Phone Distracted | 0.013 |
Not Distracted | 0.914 |
Other Distracted | 0.073 |
Sum | 1.000 |
Distraction | 15-19 | 20-29 | 30-39 | 40-49 | 50-59 | 60-69 | 70+ | Margin (Overall) |
---|---|---|---|---|---|---|---|---|
Cell Phone Distracted | 0.017 | 0.013 | 0.011 | 0.009 | 0.006 | 0.004 | 0.001 | 0.009 |
Not Distracted | 0.891 | 0.914 | 0.925 | 0.940 | 0.942 | 0.945 | 0.938 | 0.928 |
Other Distracted | 0.091 | 0.073 | 0.065 | 0.051 | 0.052 | 0.052 | 0.061 | 0.063 |
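The same idea applies to the distracted-driving counts: conditioning on a row divides by that row's total, and conditioning on a column divides by that column's total. A small sketch, assuming hypothetical helper names `age_given_distraction` and `distraction_given_age`:

```python
ages = ["15-19", "20-29", "30-39", "40-49", "50-59", "60-69", "70+"]

# Counts from the distracted-driving table, one list per distraction status.
counts = {
    "Cell Phone Distracted": [58, 159, 96, 69, 48, 21, 5],
    "Not Distracted": [2962, 11278, 8382, 7328, 7482, 5282, 4341],
    "Other Distracted": [303, 898, 586, 400, 415, 288, 282],
}


def age_given_distraction(status):
    """Conditional distribution of age for one distraction status (row / row total)."""
    row = counts[status]
    total = sum(row)
    return {age: x / total for age, x in zip(ages, row)}


def distraction_given_age(age):
    """Conditional distribution of distraction for one age group (column / column total)."""
    j = ages.index(age)
    total = sum(row[j] for row in counts.values())
    return {status: row[j] / total for status, row in counts.items()}


cell = age_given_distraction("Cell Phone Distracted")
twenties = distraction_given_age("20-29")

print({age: round(p, 3) for age, p in cell.items()})
print({status: round(p, 3) for status, p in twenties.items()})
```

Each returned dictionary sums to 1, which is a quick sanity check that the right total was used as the denominator.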
Of those drivers aged 30-39, what proportion are not distracted?
Of those “other distracted” drivers, what proportion are 60-69?
The independence population model: A student's choice of an Apple vs. Android product is independent of the student's college. In other words, the two variables are independent of each other.
College | Apple | Android | Sum |
---|---|---|---|
Business | 557 | 51 | 608 |
CFAC | 119 | 19 | 138 |
CMS | 140 | 37 | 177 |
Education | 95 | 16 | 111 |
Engineering | 84 | 19 | 103 |
FHSS | 91 | 16 | 107 |
Humanities | 23 | 8 | 31 |
Life | 329 | 39 | 368 |
Nursing | 77 | 7 | 84 |
Sum | 1515 | 212 | 1727 |
Because of independence… \[ \begin{align*} \text{Pr}(\text{Apple and Business}) &= \text{Pr}(\text{Apple}) \times \text{Pr}(\text{Business}) \\ &= (1515/1727) \times (608/1727) \\ &\approx 0.309 \end{align*} \]
IF variables are independent, expected number of people in each cell: \[ \begin{align*} \text{Exp. No. of Apple/Business} &= n \times \text{Pr}(\text{Apple}) \times \text{Pr}(\text{Business}) \\ &= 1727 \times (1515/1727) \times (608/1727) \\ &= (1515 \times 608)/1727 \\ &\approx 533.364 \end{align*} \]
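The Apple/Business calculation generalizes to every cell: under independence the expected count is (row total × column total) / n. A minimal sketch in plain Python, with illustrative variable names:

```python
# Observed (Apple, Android) counts per college.
counts = {
    "Business": (557, 51),
    "CFAC": (119, 19),
    "CMS": (140, 37),
    "Education": (95, 16),
    "Engineering": (84, 19),
    "FHSS": (91, 16),
    "Humanities": (23, 8),
    "Life": (329, 39),
    "Nursing": (77, 7),
}

n = sum(sum(row) for row in counts.values())
apple_total = sum(row[0] for row in counts.values())
android_total = sum(row[1] for row in counts.values())

# Expected count = n * Pr(row) * Pr(column) = row_total * column_total / n.
expected = {
    college: (sum(row) * apple_total / n, sum(row) * android_total / n)
    for college, row in counts.items()
}

for college, (apple, android) in expected.items():
    print(f"{college:<12} {apple:9.3f} {android:9.3f}")
```

Working with the unrounded fractions, rather than the rounded 0.309, is what gives 533.364 for the Apple/Business cell.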
Distraction | 15-19 | 20-29 | 30-39 | 40-49 | 50-59 | 60-69 | 70+ | Sum |
---|---|---|---|---|---|---|---|---|
Cell Phone Distracted | 58 | 159 | 96 | 69 | 48 | 21 | 5 | 456 |
Not Distracted | 2962 | 11278 | 8382 | 7328 | 7482 | 5282 | 4341 | 47055 |
Other Distracted | 303 | 898 | 586 | 400 | 415 | 288 | 282 | 3172 |
Sum | 3323 | 12335 | 9064 | 7797 | 7945 | 5591 | 4628 | 50683 |
Good news! The tool will calculate the expected counts for you. You just need to know where to look…
Research Question: Is there a relationship between which college someone is in and whether they use an Apple or Android phone?
The hypotheses: \[ \begin{align} H_0&: \text{College and Phone are Independent} \\ H_A&: \text{College and Phone are NOT Independent} \end{align} \]
Step 2: See if the data matches the hypotheses.
How can we compare our observed data to hypotheses?
The \(\chi^2\)-statistic: (pronounced “kai-squared”) \[ \begin{align*} \chi^2 &= \sum_{r=1}^R \sum_{c=1}^C \chi^2_{rc} \\ &= \sum_{r=1}^R \sum_{c=1}^C \frac{(\text{Obs}_{rc} - \text{Exp}_{rc})^2}{\text{Exp}_{rc}} \end{align*} \]
Intuition: Each term compares an observed count with the count expected under independence, scaled by the expected count. If \(H_0\) is true, the observed counts stay close to the expected counts and \(\chi^2\) is small; a large \(\chi^2\) is evidence against independence.
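The double sum in the \(\chi^2\)-statistic can be written as two nested loops over rows and columns. The sketch below uses a hypothetical helper, `chi_square_statistic`, that works for any R×C table of counts given as a list of rows; it is one straightforward way to carry out the formula, not the course tool's implementation.

```python
def chi_square_statistic(table):
    """Sum of (Obs - Exp)^2 / Exp over all cells, with Exp = row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for row, r_tot in zip(table, row_totals):
        for obs, c_tot in zip(row, col_totals):
            exp = r_tot * c_tot / n
            stat += (obs - exp) ** 2 / exp
    return stat


# (Apple, Android) counts for Business, CFAC, CMS, Education, Engineering,
# FHSS, Humanities, Life, Nursing.
phone_table = [
    (557, 51), (119, 19), (140, 37), (95, 16), (84, 19),
    (91, 16), (23, 8), (329, 39), (77, 7),
]

chi2 = chi_square_statistic(phone_table)
print(f"chi-square statistic: {chi2:.2f}")
```

A table whose observed counts exactly match the expected counts would give a statistic of 0, which matches the intuition that small values are consistent with independence.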
If the independence model is appropriate AND all expected counts are \(>\) 5, then the \(\chi^2\) values obtained from repeated sampling follow a \(\chi^2\)-distribution with \((R-1)(C-1)\) degrees of freedom.
Step 2: See if the data matches the hypotheses (FIRST - check to make sure all expected counts \(> 5\))
College | Apple | Android |
---|---|---|
Business | 533.364 | 74.636 |
CFAC | 121.060 | 16.940 |
CMS | 155.272 | 21.728 |
Education | 97.374 | 13.626 |
Engineering | 90.356 | 12.644 |
FHSS | 93.865 | 13.135 |
Humanities | 27.195 | 3.805 |
Life | 322.826 | 45.174 |
Nursing | 73.688 | 10.312 |
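Scanning the expected counts by eye works for a small table, but the check is easy to automate. The sketch below recomputes the expected counts and lists any cell that does not exceed the threshold of 5; the names `THRESHOLD` and `small_cells` are illustrative.

```python
colleges = ["Business", "CFAC", "CMS", "Education", "Engineering",
            "FHSS", "Humanities", "Life", "Nursing"]
phones = ["Apple", "Android"]
observed = [
    (557, 51), (119, 19), (140, 37), (95, 16), (84, 19),
    (91, 16), (23, 8), (329, 39), (77, 7),
]

THRESHOLD = 5  # the condition used here: every expected count should exceed 5

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Collect every cell whose expected count fails the condition.
small_cells = []
for college, r_tot in zip(colleges, row_totals):
    for phone, c_tot in zip(phones, col_totals):
        exp = r_tot * c_tot / n
        if exp <= THRESHOLD:
            small_cells.append((college, phone, exp))

for college, phone, exp in small_cells:
    print(f"{college}/{phone}: expected count {exp:.3f}")
```

For this table the check flags the Humanities/Android cell, whose expected count is 3.805 in the table above, so that cell deserves a second look before relying on the \(\chi^2\) approximation.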
What is your conclusion at the \(\alpha = 0.05\) level?
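To reach a conclusion, the statistic is converted to a p-value using the \(\chi^2\)-distribution with \((R-1)(C-1)\) degrees of freedom and compared with \(\alpha\). The sketch below avoids third-party libraries by using a closed-form upper-tail probability that holds only when the degrees of freedom are even (true here, since \((9-1)(2-1) = 8\)). The function names are hypothetical, and a statistics package would normally be used instead.

```python
import math


def chi_square_statistic(table):
    """Sum of (Obs - Exp)^2 / Exp, with Exp = row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (obs - r_tot * c_tot / n) ** 2 / (r_tot * c_tot / n)
        for row, r_tot in zip(table, row_totals)
        for obs, c_tot in zip(row, col_totals)
    )


def chi_square_sf_even_df(x, df):
    """P(X > x) for a chi-square variable; this closed form needs EVEN df.

    For df = 2m: exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


observed = [
    (557, 51), (119, 19), (140, 37), (95, 16), (84, 19),
    (91, 16), (23, 8), (329, 39), (77, 7),
]

stat = chi_square_statistic(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = chi_square_sf_even_df(stat, df)

alpha = 0.05
reject_h0 = p_value < alpha
print(f"chi-square = {stat:.2f}, df = {df}, p-value = {p_value:.2g}, reject H0: {reject_h0}")
```

The decision rule is the usual one: reject \(H_0\) when the p-value is below \(\alpha\), otherwise fail to reject.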
The tool calculates the \(\chi^2_{rc}\) values for you:
Are the conditions met to perform a \(\chi^2\) test for the hypotheses: \[ \begin{align} H_0&: \text{There is NO association between distraction and age} \\ H_A&: \text{There is an association between distraction and age} \end{align} \]
What is the conclusion for the test: \[ \begin{align} H_0&: \text{There is NO association between distraction and age} \\ H_A&: \text{There is an association between distraction and age} \end{align} \]
IF you reject \(H_0\), what can we say about where the relationship is? In other words, where are observed counts most different from expected counts?
Contribution of each cell to \(\chi^2\) (the \(\chi^2_{rc}\) values):
College | Apple | Android |
---|---|---|
Business | 1.0 | 7.5 |
CFAC | 0.0 | 0.3 |
CMS | 1.5 | 10.7 |
Education | 0.1 | 0.4 |
Engineering | 0.4 | 3.2 |
FHSS | 0.1 | 0.6 |
Humanities | 0.6 | 4.6 |
Life | 0.1 | 0.8 |
Nursing | 0.1 | 1.1 |
Observed counts:
College | Apple | Android |
---|---|---|
Business | 557 | 51 |
CFAC | 119 | 19 |
CMS | 140 | 37 |
Education | 95 | 16 |
Engineering | 84 | 19 |
FHSS | 91 | 16 |
Humanities | 23 | 8 |
Life | 329 | 39 |
Nursing | 77 | 7 |
Expected counts:
College | Apple | Android |
---|---|---|
Business | 533.4 | 74.6 |
CFAC | 121.1 | 16.9 |
CMS | 155.3 | 21.7 |
Education | 97.4 | 13.6 |
Engineering | 90.4 | 12.6 |
FHSS | 93.9 | 13.1 |
Humanities | 27.2 | 3.8 |
Life | 322.8 | 45.2 |
Nursing | 73.7 | 10.3 |
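The three tables above fit together: each contribution is (observed − expected)² / expected for its cell, and the largest contributions show where the data depart most from independence. A short sketch, with illustrative names, that recomputes the contributions and picks out the largest:

```python
colleges = ["Business", "CFAC", "CMS", "Education", "Engineering",
            "FHSS", "Humanities", "Life", "Nursing"]
phones = ["Apple", "Android"]
observed = [
    (557, 51), (119, 19), (140, 37), (95, 16), (84, 19),
    (91, 16), (23, 8), (329, 39), (77, 7),
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Per-cell contribution to the chi-square statistic.
contributions = {}
for college, row, r_tot in zip(colleges, observed, row_totals):
    for phone, obs, c_tot in zip(phones, row, col_totals):
        exp = r_tot * c_tot / n
        contributions[(college, phone)] = (obs - exp) ** 2 / exp

# Rank cells from largest to smallest contribution.
ranked = sorted(contributions.items(), key=lambda item: item[1], reverse=True)
largest = ranked[0][0]

for (college, phone), value in ranked[:3]:
    print(f"{college}/{phone}: {value:.1f}")
```

The top cell is CMS/Android, where 37 students were observed against about 21.7 expected, followed by Business/Android, where 51 were observed against about 74.6 expected.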
What is the largest contributor to the relationship between distraction and age in the distracted driving analysis?