Analysis of Matched Pairs Experiments
An Experiment
Methylphenidate is FDA-approved for treating attention deficit hyperactivity disorder (ADHD) in children and adults and as a second-line treatment for narcolepsy in adults (more info here). In one experiment investigating the impact of methylphenidate on cognitive function, subjects were asked to complete a delay of gratification (DOG) task both on a dose of the drug and not on the drug (the order being randomized). Children were told that a star would appear on the computer screen if they waited “long enough” to press a response key. If a child responded in less than four seconds after their previous response, they did not earn a star, and the 4-second counter restarted. The DOG differentiates children with and without ADHD. On RShiny, the dataset is called “delay of gratification.”
What type of experiment is this?
- A matched pairs experiment
- If interested, you can read more about the study here.
- You can download the original data from the experiment here.
The Data from the Experiment
| Patient | D0 | D60 | Diff |
|---|---|---|---|
| 1 | 57 | 62 | 5 |
| 2 | 27 | 49 | 22 |
| 3 | 32 | 30 | -2 |
| 4 | 31 | 34 | 3 |
| 5 | 34 | 38 | 4 |
Variables:
- Patient = Patient Number
- D0 = Number correct on 0 Dose
- D60 = Number correct on 60 mg Dose
- Diff = Difference of D60-D0
- \(n=\) 24
Analyzing the Data
Steps of statistical analysis:
- Identify the population and parameter you are interested in.
- With matched pairs experiments, we analyze the difference column because we focus on the difference the treatment made in each patient.
Practice 2.5 Question 1
Steps of statistical analysis:
- Identify the population and parameter you are interested in.
- Population: Children with ADHD
- Parameter: The mean difference in number correct on the drug minus the number correct not on the drug for all children with ADHD. We will use \(\mu_d\) to denote this difference.
We want to know if the drug helped. What should the null and alternative hypotheses be? \[
\begin{align}
H_0: \\
H_a:
\end{align}
\]
Practice 2.5 Question 1 Answer
We want to know if the drug helped. What should the null and alternative hypotheses be? \[
\begin{align}
H_0: \mu_d = 0\\
H_a: \mu_d > 0
\end{align}
\]
Analyzing the Data
Steps of statistical analysis:
- Identify the population and parameter you are interested in.
- Population: Children with ADHD
- Parameter: The mean difference in number correct on the drug minus not on the drug for all children with ADHD. We will use \(\mu_d\) to denote this difference.
- Collect data: did the experiment use all 3 principles of good experimental design (randomization, replication and control)?
- Yes: replication (more than 1 subject), randomization (randomized the order) and control (zero dose).
- Posit a Statistical Model: how do we do this?
- Exploratory Data Analysis
EDA
You want to explore the differences:
Mean: 4.958 SD: 7.538 Skew: 0.226 Median: 5
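As a rough illustration of how these summaries are computed, the sketch below uses only the five patients displayed in the data table (not all 24), so its output will not match the values above. The variable names are illustrative assumptions, not names from the course dataset.

```python
from statistics import mean, median, stdev

# Only the five rows shown in the data table; the full dataset has n = 24.
d0 = [57, 27, 32, 31, 34]    # number correct on 0 dose
d60 = [62, 49, 30, 34, 38]   # number correct on 60 mg dose

# Matched pairs: analyze the within-patient differences D60 - D0.
diffs = [on - off for on, off in zip(d60, d0)]

print(diffs)
print(mean(diffs), median(diffs), round(stdev(diffs), 3))
```

Running the same three summaries on the full Diff column should reproduce the mean, SD, and median reported above.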
Analyzing the Data
Steps of statistical analysis:
- Posit a Statistical Model, where \(Y\) is the difference (on the drug minus not on the drug) for one child: \[ Y \sim N(\mu_d, \sigma_d) \]
How do we interpret \(\mu_d\) and \(\sigma_d\)?
- \(\mu_d\): the mean difference (number correct on the drug minus number correct not on the drug) for all children with ADHD
- \(\sigma_d\): the standard deviation of the differences
Analyzing the Data
- Draw inference about the population using your model.
What are our point estimates for \(\mu_d\) and \(\sigma_d\)?
- \(\bar{y} =\) 4.958
- \(s =\) 7.538
- Do \(\bar{y} = \mu_d\) and \(s = \sigma_d\)?
- No! \(\bar{y}\) and \(s\) describe only our sample of 24 children. Our analysis would be better if we had a conclusion about \(\mu_d\) rather than simply \(\bar{y}\).
- Let's carry out a hypothesis test!
\(t\)-test
- Step 1 - Write out the Hypothesis (we did this earlier)
\[
\begin{align}
H_0: \mu_d = 0\\
H_a: \mu_d > 0
\end{align}
\]
\(t\)-test
- Step 2 - See if our data matches (or doesn’t match) the null hypothesis
- First, is the t-distribution appropriate to use here?
Practice 2.5 Question 2
What is the \(t\) statistic for this particular problem?
Practice 2.5 Question 2 Answer
What is the \(t\) statistic for this particular problem?
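A worked calculation, assuming the usual one-sample \(t\) statistic applied to the differences and using the point estimates reported earlier (\(\bar{y} = 4.958\), \(s = 7.538\), \(n = 24\)), with the null value \(\mu_d = 0\):

\[
t = \frac{\bar{y} - 0}{s / \sqrt{n}} = \frac{4.958}{7.538 / \sqrt{24}} \approx \frac{4.958}{1.539} \approx 3.222
\]

Under \(H_0\), this statistic is compared to a \(t\) distribution with \(n - 1 = 23\) degrees of freedom.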
Practice 2.5 Question 3
What is our conclusion using \(\alpha = 0.01\)?
- Fail to reject \(H_0\) and conclude that the drug does NOT help.
- Fail to reject \(H_0\) and conclude that the drug helps.
- Reject \(H_0\) and conclude that the drug does NOT help.
- Reject \(H_0\) and conclude that the drug helps.
Practice 2.5 Question 3 Answer
What is our conclusion using \(\alpha = 0.01\)?
- Reject \(H_0\) and conclude that the drug helps.
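The decision at \(\alpha = 0.01\) can be checked numerically. The following is a minimal sketch using only the Python standard library and the summary statistics reported earlier; the helper names `t_pdf` and `t_upper_tail` are hypothetical, and the one-sided tail area is approximated with Simpson's rule rather than taken from the course analysis app.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) for t >= 0 with Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Summary statistics of the differences (from the EDA)
n, ybar, s = 24, 4.958, 7.538
alpha = 0.01

t_stat = ybar / (s / math.sqrt(n))       # test statistic under H0: mu_d = 0
p_value = t_upper_tail(t_stat, n - 1)    # one-sided, because Ha: mu_d > 0

print(round(t_stat, 3), p_value < alpha)  # prints: 3.222 True
```

Because the p-value falls below \(\alpha\), the sketch agrees with rejecting \(H_0\).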
Confidence Interval
The drug helps, but how much of a difference does the drug make (on average)? A 95% confidence interval for \(\mu_d\) is \[\bar{y} \pm t^\star \frac{s}{\sqrt{n}}\] which is (1.775, 8.141). How do we interpret this interval?
- We are 95% confident that, on average, children with ADHD get between 1.775 and 8.141 more correct on the DOG task on a 60 mg dose of methylphenidate than when not on the drug.
- On the basis of this interval alone, can we say that \(\mu_d = 0\)?
- No, because 0 is not in the interval.
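As a worked check of the interval, assume \(t^\star \approx 2.0687\), the 0.975 quantile of the \(t\) distribution with \(n - 1 = 23\) degrees of freedom. This value comes from a standard \(t\) table, not from the original slides.

\[
\bar{y} \pm t^\star \frac{s}{\sqrt{n}} = 4.958 \pm 2.0687 \times \frac{7.538}{\sqrt{24}} \approx 4.958 \pm 3.183 = (1.775, 8.141)
\]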
Further Considerations in the Analysis
Given a 95% confidence interval of (1.775, 8.141), are these results statistically significant?
- Yes because the 95% interval does NOT contain 0.
Given a 95% confidence interval of (1.775, 8.141), are these results practically significant?
What would be a Type 1 error for this analysis?
- Saying the drug has an effect when, in fact, it doesn’t do anything.
What would be a Type 2 error for this analysis?
- Saying the drug doesn’t do anything when, in fact, it does.
Further Considerations in the Analysis
Given the two error types, do you think a Type 1 or a Type 2 error is worse?
- Debatable but I’d go with a Type 1.
Assuming a Type 1 error is worse, what do you think we should set \(\alpha\) at (larger or smaller)?
- Smaller, because \(\alpha\) is the probability of making a Type 1 error when \(H_0\) is true.
Practice 2.5 Question 4
Admittedly, (1.775, 8.141) is a pretty wide interval. What could we do if we want a tighter interval? Check all that apply.
- Increase the sample size
- Decrease the sample size
- Increase the confidence level
- Decrease the confidence level
- Nothing can be done; it is what it is.
Practice 2.5 Question 4 Answer
Admittedly, (1.775, 8.141) is a pretty wide interval. What could we do if we want a tighter interval? Check all that apply.
- Increase the sample size
- Decrease the confidence level
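Increasing the sample size and decreasing the confidence level both shrink the margin of error \(t^\star \frac{s}{\sqrt{n}}\). A rough sketch of the second effect, assuming \(s\) stays at 7.538 and taking \(t^\star \approx 2.0687\) (95%) and \(t^\star \approx 1.714\) (90%) for 23 degrees of freedom from a standard \(t\) table (these values are not in the original slides):

\[
95\%: \; 2.0687 \times \frac{7.538}{\sqrt{24}} \approx 3.18 \qquad\qquad 90\%: \; 1.714 \times \frac{7.538}{\sqrt{24}} \approx 2.64
\]

Likewise, because \(n\) sits under a square root in the denominator, quadrupling the sample size would roughly halve the margin of error (and \(t^\star\) also shrinks slightly as \(n\) grows).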
Further Considerations in the Analysis
Using our point estimates as our best guess for the population parameters and assuming a normal population model, what is the probability that a child gets at least 10 more correct on the drug vs. not on the drug?
Using our point estimates as our best guess for the population parameters and assuming a normal population model, how many more correct do the top 1% of subjects get on the drug vs. not on the drug?
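Both questions can be approached with the posited model \(Y \sim N(\mu_d, \sigma_d)\). The sketch below plugs in the point estimates (\(\bar{y} = 4.958\), \(s = 7.538\)) and uses `NormalDist` from Python's standard library; the variable names are illustrative assumptions. It treats the difference as continuous, which is only an approximation for a count of correct responses.

```python
from statistics import NormalDist

# Plug-in normal model for the differences, using the point estimates
model = NormalDist(mu=4.958, sigma=7.538)

# P(Y >= 10): probability of at least 10 more correct on the drug
p_at_least_10 = 1 - model.cdf(10)

# 99th percentile: the difference exceeded by only the top 1% of subjects
top_1_percent_cutoff = model.inv_cdf(0.99)

print(round(p_at_least_10, 3), round(top_1_percent_cutoff, 2))
```

Under these assumptions, roughly a quarter of children would improve by 10 or more, and the top 1% would improve by about 22.5 or more.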
Practice 2.5 Question 5
A study was conducted to investigate the placebo effect on patients. Patients with arthritis were recruited and asked to rate their average daily arthritis pain on a scale of 0 to 10 (with 0 being no pain) both before and after being given a placebo, and the difference was recorded (calculated as “before” minus “after” so that large positive differences mean that the placebo “helped”). Use the Placebo dataset on the course analysis app.
Is there a statistically significant placebo effect? In other words, does thinking you have a drug improve health? Use \(\alpha = 0.05\).
- Yes because we reject the null hypothesis of zero placebo effect.
- Yes because we fail to reject the null hypothesis of zero placebo effect.
- No because we reject the null hypothesis of zero placebo effect.
- No because we fail to reject the null hypothesis of zero placebo effect.
- We don’t know because the sampling distribution of \(t\) is inappropriate to use for this example.
Practice 2.5 Question 5 Answer
Is there a statistically significant placebo effect? In other words, does thinking you have a drug improve health? Use \(\alpha = 0.05\).
- Yes because we reject the null hypothesis of zero placebo effect.
Practice
A study was conducted to investigate the placebo effect on patients. Patients with arthritis were recruited and asked to rate their average daily arthritis pain on a scale of 0 to 10 (with 0 being no pain) both before and after being given a placebo, and the difference was recorded (calculated as “before” minus “after” so that large positive differences mean that the placebo “helped”). Use the Placebo dataset on the course analysis app to answer the following questions.
What is a 95% confidence interval for the difference in pain before and after the placebo? In other words, how big is the placebo effect?
- We are 95% confident that, on average, patients' pain scores were between 2.98 and 4.32 points higher before the placebo than after it. Equivalently, we are 95% confident that the placebo reduces pain levels by between 2.98 and 4.32 points on average.
Key Terminology
- Matched pairs experiment
- Mean of the differences (\(\mu_d\))
- Standard deviation of the differences (\(\sigma_d\))