Data Collection via Experiments

2 Study Designs

Observational Studies
- Sample a group of people and watch/observe their behavior

Experiments
- Recruit a group of people and assign them a treatment

YPoll 1.2 Question 1

A group of researchers wanted to understand the effect of “screen time” before sleep. Classify the study design as either an observational study or experiment.

Study A took a group of adults and randomly divided them into two groups. One group was told not to use screens during the hour before going to bed, while the other group was allowed screen time up until going to bed. Researchers then compared how long it took each group to fall asleep.
Study B asked a group of adults whether they used screens in the hour before going to bed. Researchers then compared how long it took each group to fall asleep.

YPoll 1.2 Question 1 Answer

A group of researchers wanted to understand the effect of “screen time” before sleep. Classify the study design as either an observational study or experiment.

Experiment: Study A took a group of adults and randomly divided them into two groups. One group was told not to use screens during the hour before going to bed, while the other group was allowed screen time up until going to bed. Researchers then compared how long it took each group to fall asleep.
Obs. Study: Study B asked a group of adults whether they used screens in the hour before going to bed. Researchers then compared how long it took each group to fall asleep.

Why Experiment?

A well-designed experiment can give evidence that the treatment causes the response by controlling for lurking variables. Lurking variables are variables in your study that you have not controlled for but may affect the outcome.

Experiments try to control for as many lurking variables as possible.

Basic Structure of Experiments

Basic Terminology for Experiments

Subject: Individual on which we are going to measure a variable
Response variable (y): outcome of the experiment (e.g. damage)
Explanatory variable (x): variable used to explain the response (e.g. number of firefighters)
- Factor: An explanatory variable with a fixed number of values (e.g. 2, 5 and 10 firefighters)
Treatment: the condition or conditions applied to a subject or individual in an experiment (e.g. 5 firefighters)

Basic Terminology Cont’d

Control: a “treatment” with, supposedly, zero effect
- Placebo: a fake treatment level to account for psychological effects
Double blind study: An experiment where neither the subject nor the researcher knows which treatment is applied.
Confounding: A situation where a lurking variable, in addition to the explanatory variable, is affecting the response

An example: COVID Vaccine Trial

Fill in the following for the Pfizer COVID vaccine study:

Subject:
Response variable:
Explanatory variable:
Factor:
Treatment:
Control:
Placebo:
Double blind study:
Lurking variables:
Confounded:

An example: COVID Vaccine Trial

Fill in the following for the Pfizer COVID vaccine study:

Subject: A person (anyone who showed up as part of the trial)
Response variable: Whether or not a person got COVID
Explanatory variable: Whether or not a person got the vaccine
Factor: Yes (only two levels for explanatory variable: vaccine or placebo shot)
Treatment: Vaccine or placebo shot
Control: Placebo shot
Placebo: Yes (placebo shot)
Double blind study: Yes
Lurking variables: Contact with other people, living conditions, health condition, age, etc.
Confounded: Difficult to say, but there is no clear connection between the explanatory variable and any of the lurking variables

Principles of Valid Experiments

Control/Comparison: control lurking variables by including comparison treatments and using homogeneous subjects; a control group is also used to measure the placebo effect
Randomization: neutralize effects of lurking variables by randomly assigning subjects to treatments
Replication: assign more than one subject to each treatment group
Double blinding (if possible)

Returning to COVID Vaccine Trial

Did the vaccine study include each of the following? If so, how?

Control/Comparison
Randomization
Replication
Double blinding

How would you design a good experiment?

Does requiring someone to sign up for an account with your company increase or decrease purchases?

Design a valid experiment to answer this question.

Account Requirement Experiment

How did you include each of the following?

Control/Comparison
Randomization
Replication
Double blinding

Note: this is a common type of experiment in data science called “A/B” testing
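
A minimal sketch of how an A/B test for the account question might be set up, assuming Python. The arm names, visitor IDs, and helper functions are hypothetical and chosen only for illustration. Each visitor is randomly assigned to one of two versions of the checkout, and the purchase rate is then computed for each arm.

```python
import random

# Hypothetical arm names: one version of the site requires an account,
# the other allows checkout as a guest (the comparison group).
ARMS = ("account_required", "guest_checkout")

def assign_visitors(visitor_ids, seed=None):
    """Randomization: each visitor is assigned to an arm by chance alone."""
    rng = random.Random(seed)
    return {visitor: rng.choice(ARMS) for visitor in visitor_ids}

def purchase_rates(assignment, purchased):
    """Response: proportion of visitors in each arm who made a purchase."""
    totals = {arm: 0 for arm in ARMS}
    buyers = {arm: 0 for arm in ARMS}
    for visitor, arm in assignment.items():
        totals[arm] += 1
        if visitor in purchased:
            buyers[arm] += 1
    return {
        arm: (buyers[arm] / totals[arm] if totals[arm] else float("nan"))
        for arm in ARMS
    }

# Replication: many visitors per arm, not just one.
visitors = [f"visitor_{i}" for i in range(1000)]
assignment = assign_visitors(visitors, seed=42)
```

Once purchases have been recorded, `purchase_rates(assignment, purchased)` compares the two arms, where `purchased` is the set of visitor IDs who bought something. Whether any difference is larger than chance variation is a separate question that this sketch does not address.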

Main classes of good experiments

Randomized controlled experiment
Randomized block experiment
- Matched pairs as a special case

Randomized controlled experiments

Randomly split all subjects into treatment groups.
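
One way the random split could be carried out in software is sketched below, assuming Python. The subject labels and the `randomize` helper are hypothetical. The idea is to shuffle all subjects and then deal them out to the treatment groups in turn.

```python
import random

def randomize(subjects, treatments, seed=None):
    """Randomly split subjects into treatment groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)  # chance alone decides the order
    groups = {t: [] for t in treatments}
    for i, subject in enumerate(shuffled):
        # Deal subjects out to the treatments in turn, like dealing cards.
        groups[treatments[i % len(treatments)]].append(subject)
    return groups

# Hypothetical example: 20 subjects, 5 treatments, so 4 subjects per group.
subjects = [f"subject_{i}" for i in range(1, 21)]
groups = randomize(subjects, ["A", "B", "C", "D", "E"], seed=1)
for treatment, members in groups.items():
    print(treatment, members)
```

Fixing `seed` makes the assignment reproducible, which helps when the randomization needs to be documented.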

Randomized controlled experiments

Design Principles:

Comparison/control: multiple treatments
Randomization: used to split into groups
Replication: more than 1 rat per group
Double blind: maybe

What can go wrong with RCEs?

Hypothetical experiment with rats and a 5-level factor as the explanatory variable (treatments A, B, C, D and E)

What can go wrong with RCEs?

Randomization outcomes:

Randomized Block Experiment

Solution: separate rats by size first then randomize within each size (block = group).
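
The blocking idea can be sketched in code as follows, assuming Python. The rat labels, the two size categories, and the `randomize_within_blocks` helper are hypothetical. Subjects are first grouped by block, and the random assignment is then done separately inside each block.

```python
import random
from collections import defaultdict

def randomize_within_blocks(subjects, block_of, treatments, seed=None):
    """Group subjects by block, then randomize to treatments within each block."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for subject in subjects:
        blocks[block_of[subject]].append(subject)
    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)  # randomization happens inside the block
        for i, subject in enumerate(members):
            assignment[subject] = treatments[i % len(treatments)]
    return assignment

# Hypothetical rats: 10 small and 10 large, with size as the blocking variable.
rat_size = {f"rat_{i}": ("small" if i <= 10 else "large") for i in range(1, 21)}
assignment = randomize_within_blocks(
    list(rat_size), rat_size, ["A", "B", "C", "D", "E"], seed=1
)
```

With these numbers every treatment receives exactly two small and two large rats, so size is balanced across treatments by design rather than left to chance.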

Do I still have all the principles of good experimental design?

RBEs vs RCEs

Randomized Controlled Experiment (RCE):

Use when individuals are similar

Randomized Block Experiments (RBE):

Use when the individuals are similar within a block but very different from block to block
RBE removes confounding of the blocking (lurking) variable with the explanatory variable
RBE reduces chance variation by removing variation associated with the lurking (blocking) variable.
RBE yields more precise estimates of chance variation, which makes detection of statistical significance easier

Matched Pairs Studies

Explanatory variable: 2 level factor
Block: 2 subjects who are very similar (e.g. twins, same person)
Randomly assign 1 subject within each block to treatment
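
A minimal sketch of the within-pair randomization, assuming Python. The pair labels and the `assign_matched_pairs` helper are hypothetical. Each pair is its own block, and a virtual coin flip decides which member of the pair receives the treatment.

```python
import random

def assign_matched_pairs(pairs, seed=None):
    """For each pair (block), randomly pick one member to receive the treatment."""
    rng = random.Random(seed)
    assignment = {}
    for first, second in pairs:
        # Coin flip: decides which member of the pair is treated.
        if rng.random() < 0.5:
            treated, control = first, second
        else:
            treated, control = second, first
        assignment[treated] = "treatment"
        assignment[control] = "control"
    return assignment

# Hypothetical pairs of very similar subjects (e.g. twins).
pairs = [("twin_1a", "twin_1b"), ("twin_2a", "twin_2b"), ("twin_3a", "twin_3b")]
assignment = assign_matched_pairs(pairs, seed=7)
```

When the "pair" is the same person measured twice, the same coin flip can instead decide which condition that person experiences first.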

Matched Pairs Example

Example: A manufacturer of boots plans to conduct an experiment to compare a new method of waterproofing to the current method. The appearance of the boots is not changed by either waterproofing method. The company recruits 100 volunteers in Seattle (where it rains a lot) to wear the boots as they normally would for 6 months. At the end of the 6 months, the boots will be returned to the company to be evaluated for water damage.

Understanding Check

240 subjects are available for an experiment testing the effects of different diets. Software randomly assigns 60 subjects to Diet 1, 60 subjects to Diet 2, 60 subjects to Diet 3, and 60 subjects to Diet 4. What type of study is this?

a randomized controlled experiment
a randomized block design, with four blocks
a matched pairs design
an observational study
none of the above

Understanding Check

240 subjects are available for an experiment testing the effects of different diets. Software randomly assigns 60 subjects to Diet 1, 60 subjects to Diet 2, 60 subjects to Diet 3, and 60 subjects to Diet 4. What type of study is this?

a randomized controlled experiment (correct answer)
a randomized block design, with four blocks
a matched pairs design
an observational study
none of the above

YPoll 1.2 Question 2

Do cars get better gas mileage with clean air filters? Gas mileage for 10 cars with dirty air filters and clean air filters was studied. Each car was tested once with a clean air filter and once with a dirty air filter (with the order of the testing randomized). What type of study is this?

an observational study based on a simple random sample
an observational study based on a stratified random sample
an observational study based on a multistage random sample
a randomized controlled experiment
a matched pairs experiment

YPoll 1.2 Question 2 Answer

Do cars get better gas mileage with clean air filters? Gas mileage for 10 cars with dirty air filters and clean air filters was studied. Each car was tested once with a clean air filter and once with a dirty air filter (with the order of the testing randomized). What type of study is this?

an observational study based on a simple random sample
an observational study based on a stratified random sample
an observational study based on a multistage random sample
a randomized controlled experiment
a matched pairs experiment (correct answer)

Biases in Experiments

Constructing a valid experiment is only half the battle. We need to be careful about a few things…

Placebo Effect

Problem: The placebo effect is a response by human subjects due to the psychological effect of being treated.
Solution: Use a placebo

Diagnostic Effect

Problem: Diagnosis of subjects biased by preconceived notions about effectiveness of treatment
Solution: Blind the diagnoser

Lack of Realism

Problem: Sometimes the conditions of an experiment are too artificial for the results to apply to real life.
Solution: Keep it real! https://kids.frontiersin.org/articles/10.3389/frym.2017.00013

Hawthorne Effect

Problem: people in an experiment behave differently from how they would normally behave
Solution: Hidden observation (if ethical)

https://www.vitalacy.com/post/hawthorne-effect-hand-hygiene

Non-compliance

Problem: People don’t do what they are supposed to do
Solution: You can’t make them

YPoll 1.2 Question 3

A study was done to determine if various changes in work conditions would have an effect on how productive workers were during their shifts. In the study, a manager told the employees that he would be studying this and proceeded to introduce more frequent breaks, altered lighting conditions, or reorganized the workspace. To collect data on productivity, new cameras were installed to monitor the employees. The researchers found that almost any change to the workplace led to increases in productivity. This is an example of:

Placebo Effect
Interviewer Effect
Hawthorne Effect
Non-compliance
Diagnostic Effect

YPoll 1.2 Question 3 Answer

A study was done to determine if various changes in work conditions would have an effect on how productive workers were during their shifts. In the study, a manager told the employees that he would be studying this and proceeded to introduce more frequent breaks, altered lighting conditions, or reorganized the workspace. To collect data on productivity, new cameras were installed to monitor the employees. The researchers found that almost any change to the workplace led to increases in productivity. This is an example of:

Placebo Effect
Interviewer Effect
Hawthorne Effect (correct answer)
Non-compliance
Diagnostic Effect

Key Terminology

Causation
Lurking Variables
Subject
Response variable
Explanatory variable / factor
Treatment
Control
Placebo
Double blind

Placebo effect
Diagnostic Bias
Data Ethics
Randomized Controlled Experiment
Randomized Block Experiment
Matched Pairs Experiment