Data Collection via Experiments

2 Study Designs

  1. Observational Studies
    • Sample a group of people and watch/observe their behavior
  2. Experiments
    • Recruit a group of people and assign them a treatment

YPoll 1.2 Question 1

A group of researchers wanted to understand the effect of “screen time” before sleep. Classify the study design as either an observational study or experiment.

  1. Study A took a group of adults and randomly divided them into two groups. One group was told to not watch screens for the 1 hour preceding going to bed while the other group was were allowed screen time up until going to bed. Researchers then compared how long it took each group fell asleep.
  2. Study B asked a group of adults if they were using screens in the hour before going to bed. Researchers then compared how long it took each group fell asleep.

YPoll 1.2 Question 1 Answer

A group of researchers wanted to understand the effect of “screen time” before sleep. Classify the study design as either an observational study or experiment.

  1. Experiment: Study A took a group of adults and randomly divided them into two groups. One group was told to not watch screens for the 1 hour preceding going to bed while the other group was were allowed screen time up until going to bed. Researchers then compared how long it took each group fell asleep.
  2. Obs. Study: Study B asked a group of adults if they were using screens in the hour before going to bed. Researchers then compared how long it took each group fell asleep.

Why Experiment?

A well-designed experiment can give evidence that the treatment causes the response by controlling for lurking variables. Lurking variables are variables in your study that you have not controlled for but may affect the outcome.

Experiments try to control for as many lurking variables as possible.

Basic Structure of Experiments

Basic Terminology for Experiments

  • Subject: Individual on which we are going to measure a variable
  • Response variable (y): outcome of the experiment (e.g. damage)
  • Explanatory variable (x): variable used to explain the response (e.g. number of firefighters)
    • Factor: An explanatory variable with a fixed number of values (e.g. 2, 5 and 10 firefighters)
  • Treatment: the condition or conditions applied to a subject or individual in an experiment (e.g. 5 fire fighters)

Basic Terminology Cont’d

  • Control: a “treatment” with, supposedly, zero effect
    • Placebo: a fake treatment level to account for psychological effects
  • Double blind study: An experiment where the individual and researcher don’t know which treatment is applied.
  • Confounding: A situation where a lurking variable, in addition to the explanatory variable, is affecting the response

An example: COVID Vaccine Trial

An example: COVID Vaccine Trial

Fill in the following for the Pfizer COVID vaccine study:

  • Subject:
  • Response variable:
  • Explanatory variable:
  • Factor:
  • Treatment:
  • Control:
  • Placebo:
  • Double blind study:
  • Lurking variables:
  • Confounded:

An example: COVID Vaccine Trial

Fill in the following for the Pfizer COVID vaccine study:

  • Subject: A person (anyone who showed up as part of the trial)
  • Response variable: Whether or not a person got COVID
  • Explanatory variable: Whether or not a person got the vaccine
  • Factor: Yes (only two levels for explanatory variable: vaccine or placebo shot)
  • Treatment: Vaccine or placebo shot
  • Control: Placebo shot
  • Placebo: Yes (placebo shot)
  • Double blind study: Yes
  • Lurking variables: Contact with other people, living conditions, health condition, age, etc.
  • Confounded: Difficult to say, but there is no clear connection between the explanatory variable and any of the lurking variables

Principles of Valid Experiments

  1. Control/Comparison: control lurking variables by including comparison treatments, using homogeneous subjects; used to measure placebo effect
  2. Randomization: neutralize effects of lurking variables by randomly assigning subjects to treatments
  3. Replication: assign more than one subject to each treatment group
  4. Double blinding (if possible)

Returning to COVID Vaccine Trial

Did the vaccine study include each of the following? If so, how?

  1. Control/Comparison
  2. Randomization
  3. Replication
  4. Double blinding

How would you design a good experiment?

Does requiring someone to sign up for an account with your company increase or decrease purchases?

  • Design a valid experiment to answer this question.

Account Requirement Experiment

How did you include each of the following?

  1. Control/Comparison
  2. Randomization
  3. Replication
  4. Double blinding

Note: this is a common type of experiment in data science called “A/B” testing

Main classes of good experiments

  1. Randomized controlled experiment
  2. Randomized block experiment
    • Matched pairs as a special case

Randomized controlled experiments

Randomly split all subjects into treatment groups.

Randomized controlled experiments

Design Principles:

  1. Comparison/control: multiple treatments

  2. Randomization: used to split into groups

  3. Replication: more than 1 rat per group

  4. Double blind: maybe

What can go wrong with RCEs?

Hypothetical experiment with rats and a 5 factor explanatory variables (A, B, C, D and E)

What can go wrong with RCEs?

Randomization outcomes:

Randomized Block Experiment

Solution: separate rats by size first then randomize within each size (block = group).

Do I still have all the principles of good experimental design?

RBEs vs RCEs

Randomized Controlled Experiment (RCE):

  1. Use when individuals are similar

Randomized Block Experiments (RBE):

  1. Use when the individuals are similar within a block but very different from block to block

  2. RBE removes confounding of lurking variables with response variable

  3. RBE reduces chance variation by removing variation associated with the lurking (blocking) variable.

  4. RBE yields more precise estimates of chance variation which makes detection of statistical significance easier

Matched Pairs Studies

  • Explanatory variable: 2 level factor
  • Block: 2 subjects who are very similar (e.g. twins, same person)
  • Randomly assign 1 subject within each block to treatment

Matched Pairs Example

Example: A manufacturer of boots plans to conduct an experiment to compare a new method of waterproofing to the current method. The appearance of the boots is not changed by either waterproofing method. The company recruits 100 volunteers in Seattle (where it rains a lot) to wear the boots as they normally would for 6 months. At the end of the 6 months, the boots will be returned to the company to be evaluated for water damage.

Understanding Check

240 subjects are available for an experiment testing the effects of different diets. Software randomly assigns 60 subjects to Diet 1, 60 subjects to Diet 2, 60 subjects to Diet 3, and 60 subjects to Diet 4. What type of study is this?

  1. a randomized controlled experiment

  2. a randomized block design, with four blocks

  3. a matched pairs design

  4. an observational study

  5. none of the above

Understanding Check

240 subjects are available for an experiment testing the effects of different diets. Software randomly assigns 60 subjects to Diet 1, 60 subjects to Diet 2, 60 subjects to Diet 3, and 60 subjects to Diet 4. What type of study is this?

  1. a randomized controlled experiment

  2. a randomized block design, with four blocks

  3. a matched pairs design

  4. an observational study

  5. none of the above

YPoll 1.2 Question 2

Do cars get better gas mileage with clean air filters? Gas mileage for 10 cars with dirty air filters and clean air filters was studied. Each car was tested once with a clean air filter and once with a dirty air filter (with the order of the testing randomized). What type of study is this?

  1. an observational study based on a simple random sample

  2. an observational study based on a stratified random sample

  3. an observational study based on a multistage random sample

  4. a randomized controlled experiment

  5. a matched pairs experiment

YPoll 1.2 Question 2 Answer

Do cars get better gas mileage with clean air filters? Gas mileage for 10 cars with dirty air filters and clean air filters was studied. Each car was tested once with a clean air filter and once with a dirty air filter (with the order of the testing randomized). What type of study is this?

  1. an observational study based on a simple random sample

  2. an observational study based on a stratified random sample

  3. an observational study based on a multistage random sample

  4. a randomized controlled experiment

  5. a matched pairs experiment

Biases in Experiments

Constructing a valid experiment is only half the battle. We need to be careful about a few things…

Placebo Effect

  • Problem: The placebo effect is response by human subjects due to the psychological effect of being treated.

  • Solution: Use a placebo

Diagnostic Effect

  • Problem: Diagnosis of subjects biased by preconceived notions about effectiveness of treatment
  • Solution: Blind the diagnoser

Lack of Realism

  • Problem: Sometimes experiments can’t apply to real life.
  • Solution: Keep it real!
    https://kids.frontiersin.org/articles/10.3389/frym.2017.00013

Hawthorne Effect

  • Problem: people in an experiment behave differently from how they would normally behave

  • Solution: Hidden observation (if ethical)

    https://www.vitalacy.com/post/hawthorne-effect-hand-hygiene

Non-compliance

  • Problem: People don’t do what they are supposed to do
  • Solution: You can’t make them

YPoll 1.2 Question 3

A study was done to determine if various changes in work conditions would have an effect on how productive workers were during their shifts. In the study, a manager told the employees that he would be studying this and proceeded to introduce more frequent breaks, altered lighting conditions, or reorganized the workspace. To collect data on productivity, new cameras were installed to monitor the employees. The researchers found that almost any change to the workplace led to increases in productivity. This is an example of:

  • Placebo Effect
  • Interviewer Effect
  • Hawthorne Effect
  • Non-compliance
  • Diagnostic Effect

YPoll 1.2 Question 3 Answer

A study was done to determine if various changes in work conditions would have an effect on how productive workers were during their shifts. In the study, a manager told the employees that he would be studying this and proceeded to introduce more frequent breaks, altered lighting conditions, or reorganized the workspace. To collect data on productivity, new cameras were installed to monitor the employees. The researchers found that almost any change to the workplace led to increases in productivity. This is an example of:

  • Placebo Effect
  • Interviewer Effect
  • Hawthorne Effect
  • Non-compliance
  • Diagnostic Effect

Key Terminology

  • Causation
  • Lurking Variables
  • Subject
  • Response variable
  • Explanatory variable / factor
  • Treatment
  • Control
  • Placebo
  • Double blind
  • Placebo effect
  • Diagnostic Bias
  • Data Ethics
  • Randomized Controlled Experiment
  • Randomized Block Experiment
  • Matched Pairs Experiment