
Week 4 Reading Notes
Mertens, Chapter 4. Experimental and Quasi-Experimental Research
The importance of experimental design in the positivist paradigm
Independent and dependent variables, experimental and control groups, random assignment, and internal and external validity are defined and illustrated
Threats to internal and external validity
How to minimize threats to internal and external validity
Research designs for experimental
Research designs for quasi-experimental
Research design for single group studies
Other design issues
Type of treatment variable
Ordering effects
Matching subjects
Complex issues concerning experimental research in the transformative paradigm
Introduction
Quantitative research is rooted in the postpositivist paradigm
Postpositivists believe that the purpose of research is to develop confidence that a particular knowledge claim about an educational or psychological phenomenon is true or false by collecting evidence in the form of objective observations of relevant phenomena (Gall, Gall, & Borg, 2015)
Research design: a process of creating an empirical test to support or refute a knowledge claim
Knowledge claim tests in the postpositivist paradigm
Internal validity: Is the knowledge claim true in this situation? (does it have internal validity?)
External validity: Is the knowledge claim true in other situations? (does it have external validity or generalizability?)
Most quantitative research is two types
Studies aimed at discovering causal relationship or strength of relationships or differences between groups
Descriptive studies that use quantitative data to describe a phenomenon
Importance of Experimental Design section
Researchers disagree about whether experimental research is the only way to truly establish a causal relationship
Some research questions are not compatible with experimental research and would be very difficult to study experimentally
“Gates and Dyson (2017) explain several ways to establish causality, the use of randomized controlled trials being one of them.” (p.129)
You can also establish causality through the qualitative, narrative approach
Postpositivists establish causal claims on the basis of controlling for extraneous variables
Constructivists utilize ways of developing causal claims through systematic investigations of the actual causal processes
What Works Clearinghouse: reviews RCTs, quasi-experimental designs, regression discontinuity designs, and single-case designs
Four types of studies they believe provide scientific evidence for success in education
APA has also chosen designs they believe are successful
Researchers in the postpositivist paradigm recognized the complexity of establishing a definitive cause-and-effect relationship with social phenomena.
This means that as many extraneous variables as possible must be controlled while the independent variable is manipulated
Controlling so many variables, especially unique ones like background characteristics, can result in oversimplification
However, control is how researchers can make a claim about a single variable; therefore, postpositivists must balance control against oversimplification
Post-positivists understand the flaws and “…are careful to acknowledge that their results are “true” at a certain level of probability and only within the conditions that existed during the experiment.” (p.129)
This poses a threat to external validity
Research Designs and Threats to Validity
The independent variable (IV) is manipulated to determine its effect on the dependent variable (DV)
Group that gets the treatment (whatever thing is different/being tested) is the experimental group
Control group does not receive the treatment
Random assignment: every person has an equal chance of being in either the experimental or control group
Experimental research is fundamentally defined by the direct manipulation of an independent variable
APA characterizes studies in the following three ways
Those that manipulate an experimental intervention
Those that do not manipulate an experimental intervention
Those that use single-case designs
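As a concrete illustration of the random assignment defined above, here is a minimal Python sketch (my own, with hypothetical participant IDs, not from the chapter): each person has an equal chance of landing in the experimental or control group.

```python
import random

def randomly_assign(participants, seed=None):
    """Randomly split participants into experimental and control groups.

    Every participant has an equal chance of ending up in either group.
    """
    rng = random.Random(seed)
    shuffled = participants[:]          # copy so the original list is untouched
    rng.shuffle(shuffled)
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]

# Hypothetical example: six participant IDs
experimental, control = randomly_assign(["P1", "P2", "P3", "P4", "P5", "P6"], seed=42)
print("Experimental:", experimental)
print("Control:", control)
```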
Internal Validity
Internal validity: the changes observed in the dependent variable are due to the effect of the independent variable, not to some other unintended variable
Extraneous/lurking variables/alternative explanations/rival hypotheses: the changes observed in the dependent variable are due to the effect of another variable other than the independent variable
Eight extraneous variables that are a threat to internal validity
History: events that happen during the course of the study that can influence the results
How can history be controlled?
Having a control group that is exposed to the same events during the study as the experimental group, with the exception of the treatment
Maturation: refers to the biological or psychological changes in the participants during the course of the study
Changes: becoming stronger, more coordinated, or tired as the study progresses
How can maturation be controlled?
By having a control group that experiences the same kind of maturational changes- the only difference being that they do not receive the experimental treatment
Testing: a threat that arises in studies that use both pre- and posttests and refers to becoming “test-wise” by having taken a pretest that is similar to the posttest. The participants know what to expect, learn something from the pretest, or become sensitized to what kind of information to “tune into” during the study because of their experience with the pretest
*Testing is a potential threat to validity only in research studies that include both pre and posttests
How can testing be controlled?
This threat is alleviated if your pretest is different from your posttest and if all the participants take both tests
Instrumentation: a threat in studies that use both pre- and posttests; it arises when there is a change in the instrument between the pretest and the posttest
One test could be easier or more difficult than the other, so changes observed on the dependent variable may be due to the nature of the instrument, not the independent variable
Researchers used a pretest to document the similarities of the comparison groups prior to the intervention
How can instrumentation be controlled?
“Instrumentation is a potential threat to validity only when the difficulty of the instrument used for data collection changes from one observation time period to the next.”
Statistical Regression: occurs when the researcher uses extreme groups as the participants (i.e., students at the high or the low end of the normal curve); extreme scores on a noisy measure tend to move back toward the mean on retesting, regardless of treatment
How can statistical regression be controlled?
Only a problem in studies that use a pretest to select participants who are at the lowest or the highest ends of the normal curve and then test again using that instrument for the dependent measure
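A small simulation (my own illustration, not from the chapter) of regression toward the mean: if we select only the lowest pretest scorers and retest them with no treatment at all, their average still rises, because extreme scores on a noisy measure drift back toward the mean.

```python
import random

random.seed(0)

# Hypothetical true abilities and noisy test scores (true score + measurement error)
true_scores = [random.gauss(50, 10) for _ in range(1000)]
pretest  = [t + random.gauss(0, 5) for t in true_scores]
posttest = [t + random.gauss(0, 5) for t in true_scores]   # no treatment applied

# Select the "extreme" low group on the pretest (roughly the bottom 10%)
cutoff = sorted(pretest)[100]
low_group = [i for i, p in enumerate(pretest) if p <= cutoff]

pre_mean  = sum(pretest[i]  for i in low_group) / len(low_group)
post_mean = sum(posttest[i] for i in low_group) / len(low_group)

# The low group's mean rises on the posttest even though nothing was done to them
print(f"Low group pretest mean:  {pre_mean:.1f}")
print(f"Low group posttest mean: {post_mean:.1f}")
```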
Differential Selection: If participants with different characteristics are in the experimental and control groups, the results of the study may be due to group differences, not necessarily to the treatment or the independent variable
How can differential selection be controlled?
By doing extra work: comparing the two groups’ data and background characteristics to ensure participants are evenly distributed on any characteristic that could act as a lurking variable
Differential selection is theoretically controlled by the random assignment of participants to conditions because differences should balance out between and among groups
Experimental Mortality: refers to participants who drop out during the course of the study. It becomes a threat to validity if participants differentially drop out of the experimental and control groups
How can you control experimental mortality?
Theoretically controlled by the random assignment to conditions under the assumption that randomization will lead to the same level of dropouts in both experimental and control groups
Researchers can test this assumption by having pretest measures that allow them to determine if people who drop out of the study are systematically different from those who complete it
This limits the generalizability of the results
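A rough sketch (the record layout is assumed, not from the text) of the pretest check described above: compare the pretest means of completers and dropouts within each condition to see whether attrition was differential.

```python
# Hypothetical records: (group, pretest score, completed the study?)
records = [
    ("experimental", 52, True), ("experimental", 47, False),
    ("experimental", 61, True), ("control", 55, True),
    ("control", 43, False),     ("control", 58, True),
]

def pretest_means(records):
    """Mean pretest score for completers vs. dropouts within each group."""
    sums = {}
    for group, score, completed in records:
        key = (group, "completed" if completed else "dropped")
        total, n = sums.get(key, (0.0, 0))
        sums[key] = (total + score, n + 1)
    return {key: total / n for key, (total, n) in sums.items()}

# Large gaps between completers and dropouts suggest mortality threatens validity
for key, mean in sorted(pretest_means(records).items()):
    print(key, round(mean, 1))
```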
Selection-Maturation Interaction: combines the threats to validity described previously under differential selection and maturation; however, maturation is the differential characteristic that causes the groups to differ
For example, the treatment group is all younger than the control group
Other potential extraneous variables
Experimental Treatment Diffusion
Participants talk to each other, and the control group learns what the independent variable is and begins using some of the ideas themselves.
In this situation, the researchers should conduct observations of selected classes to determine if the control group has become contaminated and should also conduct interviews with the participants to determine their perception of what they are doing
Compensatory Rivalry by the Control Group
Known as the John Henry effect
Some individuals who think that their traditional way of doing things is being threatened by a new approach may try extra hard to prove that their way of doing things is best
Compensatory Equalization of Treatments
Members of a control group may become disgruntled if they think that the experimental group is receiving extra resources
Resentful Demoralization of the Control Group
The opposite of the John Henry effect
The control group may feel demoralized because they are not part of the “chosen” group and thus their performance might be lower than normal because of their psychological response to being in the control group
If this happens, the control group in the setting cannot be considered “unbiased.”
Social Validity: refers to how people who are participants in the intervention feel about it
Do they like it?
Do they see the value of this particular intervention for themselves and for broader society?
This reflects that the participants are active agents who can make choices about participating or not
External Validity or Generalizability
External validity: the extent to which findings in one study can be applied to another situation
“Bracht and Glass (1968) describe another type of external validity, termed ecological validity, which concerns the extent to which the environmental conditions created by the researcher generalize to other environmental conditions.”
10 factors that influence ecological validity
Explicit Description of the Experimental Treatment
The IV must be sufficiently described so that the reader could reproduce it
Should be as specific as possible to eliminate the ways something could be interpreted
Multiple-Treatment Interference
If participants receive more than one treatment, it is not possible to say which of the treatments, or which combination of them, is necessary to bring about the desired result
The Hawthorne Effect
Derives from a study at the Western Electric Company of the effects of changes in light intensity and other working conditions on workers’ productivity
The light intensity didn’t matter
The workers’ productivity increased under both conditions
“Seemingly, the idea of receiving special attention, of being singled out to participate in the study, was enough motivation to increase productivity.”
Novelty and Disruption Effects
A new treatment may produce positive results simply because it is novel or the opposite may be true: A new treatment may not be effective initially because it causes a disruption in normal activities, but once it is assimilated into the system, it could become quite effective
Experimenter Effect
The effectiveness of a treatment may depend on the specific individual who administers it (e.g., the researcher, psychologist, or teacher).
The effect would not generalize to other situations because that individual would not be there
Pretest Sensitization
Participants who take a pretest may be more sensitized to the treatment than individuals who experience the treatment without taking the pretest
This is especially true for pretests that ask the participants to reflect on and express their attitudes toward the phenomenon
Posttest Sensitization
Similar to pretest sensitization
Taking the posttest can influence a participant’s response to a treatment
Taking a test can help the participant bring the information into focus in a way that participants who do not take the test will not experience
Interaction of History and Treatment Effects
An experiment is conducted in a particular time replete with contextual factors that cannot be exactly duplicated in another setting
If specific historical influences are present in a situation (e.g., unusually low morale because of budget cuts), the treatment may not generalize to another situation
Measurement of the Dependent Variable
The effectiveness of the program may depend on the type of measurement used in the study
Interaction of Time of Measurement and Treatment Effects
The timing of the administration of the posttest may influence results
For example, different results may be obtained if the posttest is given immediately after the treatment as opposed to a week or a month afterward
How to achieve perfect internal validity?
The laboratory is the ideal setting: a clean, sterile environment in which no variables operate except those that you, as the researcher, introduce
How to achieve perfect external validity?
The research should be conducted in the “outside” world, in the clinic, classroom, or other messy, complex, often noisy environments in which the practitioner will attempt to apply the research results.
Of course, all the “noise” in the outside world plays havoc with the idea of testing the effects of a single variable while eliminating the influence of others
“As researchers move to increase the external validity, they sacrifice internal validity.” (p.136).
It is important to make known the context in which the research was conducted
Other Threats to Validity
Treatment fidelity: the degree to which the implementer of the IV follows the exact procedures specified by the investigator for administering the treatment; validity is threatened when fidelity breaks down
How to control fidelity?
Researchers should try to maximize treatment fidelity by providing proper training and supervision and developing strategies to determine the degree and accuracy of the treatment as implemented
Researchers can train their implementers very carefully
Researchers can collect data as to the integrity of the implementation in a number of ways
“Erickson and Gutierrez (2002) suggest that a considerable portion of the research budget should be devoted to documenting the treatment as delivered in order to provide answers to the question, ‘What was the treatment, specifically?’ Qualitative research methods will need to be employed to gather evidence on the integrity of the treatment.” (p.136)
Observation and teacher logs are common ways to gather evidence on the integrity of the treatment
Researchers can analyze the data within groups to determine if there are patterns of differential impact in the experimental group
They then have to determine what contextual factors other than the treatment might account for the difference in performance
Treatments are often implemented differently by different people, making it very difficult for the researcher to verify that the treatment was implemented as intended
The Experimental Treatment: An experiment to determine the effectiveness of an innovative teaching or counseling strategy can last for a few hours or for days, weeks, months, or years
This may not be reasonable
If the study results do not show evidence that the treatment was successful, this may not mean that the approach was ineffective but simply that it was not tried long enough
Interventions designed to change behaviors, attitudes, and knowledge often require more time than would be possible in a short experiment of one or two sessions
Experimental, Quasi-Experimental, and Single Group designs
Coding system
R= Random assignment of subjects to conditions
X= Experimental treatment
O= Observation of the dependent variable (e.g., pretest, posttest, or interim measures)
Pretest- Posttest Control Group Design
Participants are randomly assigned to either the experimental group or the control group. It is depicted as follows:
R O X O
R O   O
The blank space in the second line between the two Os is used to indicate that this group did not receive the experimental treatment
The experimental group receives the treatment, and the control receives either no treatment or an alternative treatment
How does this design control external and internal threats to validity?
This design controls for the effects of history, maturation, testing, instrumentation, and experimental mortality by the use of the control groups and for differential selection by use of random assignment to conditions
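To make the R O X O notation concrete, here is a toy simulation (entirely my own; the score model and treatment effect are assumptions) of a pretest-posttest control group design analyzed with simple gain scores.

```python
import random

random.seed(1)

def simulate_participant(treated):
    """Return (pretest, posttest) for one simulated participant."""
    ability = random.gauss(50, 10)
    pretest = ability + random.gauss(0, 5)                 # O before treatment
    boost = 8 if treated else 0                            # X: assumed treatment effect
    posttest = ability + boost + random.gauss(0, 5)        # O after treatment
    return pretest, posttest

# R: random assignment to conditions
people = list(range(100))
random.shuffle(people)
experimental, control = people[:50], people[50:]

gains = {"experimental": [], "control": []}
for pid in people:
    treated = pid in experimental
    pre, post = simulate_participant(treated)
    gains["experimental" if treated else "control"].append(post - pre)

for group, g in gains.items():
    print(group, "mean gain:", round(sum(g) / len(g), 2))
```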
Posttest-Only Control Group Design
Similar to the pretest-posttest control group design except that no pretest is given. It is depicted as follows:
R X O
R   O
How does this design control external and internal threats to validity?
This design controls for the effects of history, maturation, testing, and instrumentation by the use of the control groups and for differential selection by use of random assignment to conditions
Mortality can be a problem if people drop out of the study!
Single-Factor Multiple-Treatment Designs
An extension of the randomized control group designs presented previously, but here the sample is randomly assigned to one of several conditions (usually three or more groups are used).
The design is depicted as follows
R O X1 O
R O X2 O
R O    O
The Os represent the pre- and post-questionnaire on strategy use that was administered before training and after all other post-testing
Internal threats to validity are controlled by randomly assigning students to conditions and by having comparison groups (two experimental groups and one control group).
Solomon 4-Group Design
Developed for the researcher who is worried about the effect of pretesting on the validity of the results. The researcher combines the pretest-posttest control group design with the posttest-only control group design
The design looks like this
R O X O
R O   O
R   X O
R     O
The disadvantage to this design is that it necessitates having four groups and thus increases the number of participants that one would need to test
Factorial Design
When multiple independent variables are included
Each IV is called a factor
The researcher tests the main effects of each variable as well as their possible interactions:
A
B
A x B
A x B refers to the interaction between A and B
Commonly used because they allow researchers to test for effects of different kinds of variables that might be expected to influence outcomes, such as grade level, age, gender, ethnicity or race, or disability types
The main limitation in the number of variables arises from the number of participants needed for each condition and the resulting complexity in interpretation of the results
A
B
C
A x B
A x C
B x C
A x B x C
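To make the main-effect and interaction notation concrete, here is a small numeric sketch (the cell means are invented) for a 2 x 2 factorial design: each main effect is a difference of marginal means, and the A x B interaction is the degree to which the effect of B changes across levels of A.

```python
# Hypothetical cell means for a 2 x 2 design: factor A (rows) x factor B (columns)
cell_means = {
    ("A1", "B1"): 70, ("A1", "B2"): 75,
    ("A2", "B1"): 72, ("A2", "B2"): 90,
}

# Main effect of A: difference between the marginal means of A2 and A1
mean_A1 = (cell_means[("A1", "B1")] + cell_means[("A1", "B2")]) / 2
mean_A2 = (cell_means[("A2", "B1")] + cell_means[("A2", "B2")]) / 2
main_effect_A = mean_A2 - mean_A1

# Main effect of B, computed the same way across levels of A
mean_B1 = (cell_means[("A1", "B1")] + cell_means[("A2", "B1")]) / 2
mean_B2 = (cell_means[("A1", "B2")] + cell_means[("A2", "B2")]) / 2
main_effect_B = mean_B2 - mean_B1

# A x B interaction: does the effect of B differ depending on the level of A?
effect_of_B_at_A1 = cell_means[("A1", "B2")] - cell_means[("A1", "B1")]
effect_of_B_at_A2 = cell_means[("A2", "B2")] - cell_means[("A2", "B1")]
interaction_AB = effect_of_B_at_A2 - effect_of_B_at_A1

print("Main effect of A:", main_effect_A)     # 8.5
print("Main effect of B:", main_effect_B)     # 11.5
print("A x B interaction:", interaction_AB)   # 13.0
```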
Cluster Randomization Design
Analysis typically uses hierarchical linear (multilevel) regression because participants are nested within the randomized clusters
Examples: schools are randomly assigned to conditions, not individual students
A housing organization is assigned, not the units within the organization
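A minimal sketch (hypothetical school names) of cluster randomization: whole schools, not individual students, are the units that get randomly assigned.

```python
import random

rng = random.Random(7)

# Hypothetical clusters: each school brings all of its students with it
schools = ["Adams", "Baker", "Carver", "Dewey", "Euclid", "Franklin"]
rng.shuffle(schools)

treatment_schools = schools[:3]
control_schools = schools[3:]

print("Treatment schools:", treatment_schools)
print("Control schools:  ", control_schools)
# Every student in a treatment school receives the intervention;
# the analysis must then respect the clustering (e.g., multilevel models).
```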
Quasi-Experimental Designs
Those that are “almost” true experimental designs, except that the participants are not randomly assigned to groups
The researcher studies the effect of the treatment on intact groups rather than being able to randomly assign participants to the experimental or control groups
Static-Group Comparison Design
Involves administering the treatment to the experimental group and comparing its performance on a posttest with that of a control group
It is depicted as follows
X O
………………
  O
The dotted line is used to indicate that the participants were not randomly assigned to conditions
The two main threats to this design are:
Differential selection, because the groups might differ initially on an important characteristic
Experimental mortality if participants drop out of the study
Collect as much background information as possible about the two groups to determine how they differ!
Nonequivalent Control Group Design
Similar to static group comparison design except for the addition of a pretest
It is depicted as follows
O X O
……………..
O   O
This design controls for differential selection and mortality somewhat by the use of the pretest because the researcher would be able to determine if the two groups differed initially on the dependent variable
Regression-Discontinuity (R-D) Design
Designed for situations where an experimental design is not possible but where there is a demonstrated need for services
The treatment the participant receives depends on their scores on a prior measure called the quantitative assignment variable, or QAV.
People who score above a specific cutoff value on the QAV receive one treatment, and those below the cutoff receive the other treatment
Used when random assignment is not possible but when the researcher wants to approximate the rigor of experimental designs
Generally tends to rule out more validity threats than do “weaker” quasi-experiments, such as the one-group pretest-posttest design
Offers promise of avoiding ethical criticism of the randomized experiment, with little, if any, loss of internal validity
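A rough simulation sketch (the data, cutoff, and effect size are all invented) of the R-D logic: assignment depends only on the QAV cutoff, and the treatment effect is estimated as the gap between the two fitted regression lines at that cutoff.

```python
import random

random.seed(3)

cutoff = 50          # hypothetical QAV cutoff: scores below it receive the service
effect = 10          # assumed true treatment effect for the simulation

# Simulate (QAV, treated, outcome) triples where only those below the cutoff are treated
data = []
for _ in range(500):
    qav = random.uniform(20, 80)
    treated = qav < cutoff
    outcome = 0.5 * qav + (effect if treated else 0) + random.gauss(0, 3)
    data.append((qav, treated, outcome))

def fit_line(points):
    """Ordinary least-squares slope and intercept for (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    slope = sum((x - mx) * (y - my) for x, y in points) / sum((x - mx) ** 2 for x, _ in points)
    return slope, my - slope * mx

treated_pts = [(q, y) for q, t, y in data if t]
control_pts = [(q, y) for q, t, y in data if not t]
slope_t, int_t = fit_line(treated_pts)
slope_c, int_c = fit_line(control_pts)

# Estimated effect = discontinuity (gap between the two fitted lines) at the cutoff
estimate = (slope_t * cutoff + int_t) - (slope_c * cutoff + int_c)
print("Estimated treatment effect at the cutoff:", round(estimate, 2))
```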
Single-Group Designs
Sometimes researchers are limited to collecting data from a single group; therefore, they do not have a comparison group
X O
Maturation is an uncontrolled threat to validity in this design because the students could have matured in their ability to understand numerical concepts
This design is very weak and does not allow for a reasonable inference to the effect of the experimental treatment
Tutor and student example: can’t rule out that it was specifically the tutor who helped the student do better
One-Group Pretest-Posttest Design
Participants are measured on an outcome variable both before and after treatment of interest
The researcher’s hope is that if the treatment is effective, outcome scores should improve, while scores will hold steady if the treatment has no effect
A variety of validity threats exist, including maturation and statistical regression
This design is represented as follows:
O X O
This design is stronger than the one-shot case study because you can document a change in math scores from before the treatment to after its application
However, this design is open to threats of history
Without a control group who might have had the same experiences except for exposure to the experimental treatment, you are limited in your ability to claim the effectiveness of your treatment
This design is also open to threats of testing or instrumentation (if the pre- and posttests were different)
Mortality is not a threat because you have pretest data on the students at the beginning of the experiment, and thus you could determine if those who dropped out were different from those who completed the study
Necessary to use in a situation where there is no control group because the school would not allow differential provision of services
Justified under circumstances in which you are attempting to change attitudes, behavior, or knowledge that are unlikely to change without the introduction of an experimental treatment
Time Series Design
Involves measures of the DV at periodic intervals
The treatment is administered between two of the time intervals
Depicted as follows:
O O O O X O O O O
This design is based on the logic that if the behavior is stable before the introduction of the experimental treatment and it changes after the treatment is introduced, then the change can be attributed to the treatment
Biggest threat is history because the experiment continues over a period of time and there is no control group who might experience the “historical event” but not the treatment
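A tiny sketch (the observation values are invented) of the interrupted time series logic: a stable baseline before X followed by a sustained shift afterward is the pattern that supports attributing the change to the treatment.

```python
# Hypothetical weekly observations of the dependent variable (O O O O X O O O O)
before = [42, 41, 43, 42]        # four observations before the treatment
after  = [49, 50, 48, 51]        # four observations after the treatment

mean_before = sum(before) / len(before)
mean_after = sum(after) / len(after)

# A stable baseline followed by a sustained shift is the pattern that
# supports attributing the change to the treatment (history threats aside)
print("Mean before treatment:", mean_before)
print("Mean after treatment: ", mean_after)
print("Shift:", mean_after - mean_before)
```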
Other Design issues
Types of Treatment Variables
Ordering Effects
The researcher is concerned that exposure to one treatment before another would have different effects than if the treatments had been administered in the reverse order
Can be counterbalanced by having some participants receive one treatment first and others receive the other treatment first
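A minimal sketch (hypothetical participants and treatment labels) of counterbalancing: half the participants are randomly assigned to receive treatment A first, the other half treatment B first.

```python
import random

rng = random.Random(11)

participants = [f"P{i}" for i in range(1, 9)]
rng.shuffle(participants)

half = len(participants) // 2
orders = {}
for pid in participants[:half]:
    orders[pid] = ["Treatment A", "Treatment B"]   # A first, then B
for pid in participants[half:]:
    orders[pid] = ["Treatment B", "Treatment A"]   # B first, then A

for pid, order in sorted(orders.items()):
    print(pid, "->", " then ".join(order))
```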
Matching
A researcher might choose to try to match participants on variables of importance (gender, age, type of disability, etc.)
By matching pairs between the treatment and control groups, the researcher can control for some extraneous variables
Problems in finding the “perfect” match
Participants for whom no match can be found must be eliminated
Can be problematic
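A sketch (invented participant records) of pair matching on background variables; the unmatched participant is dropped, which is exactly the problem flagged above.

```python
# Hypothetical participants: (id, gender, age band)
treatment_pool = [("T1", "F", "8-10"), ("T2", "M", "8-10"), ("T3", "F", "11-13")]
control_pool   = [("C1", "F", "8-10"), ("C2", "M", "11-13"), ("C3", "F", "11-13")]

pairs, unmatched = [], []
available = list(control_pool)
for person in treatment_pool:
    # Find a control participant with the same gender and age band
    match = next((c for c in available if c[1:] == person[1:]), None)
    if match:
        pairs.append((person[0], match[0]))
        available.remove(match)
    else:
        unmatched.append(person[0])

print("Matched pairs:", pairs)                   # [('T1', 'C1'), ('T3', 'C3')]
print("Dropped (no match found):", unmatched)    # ['T2']
```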
Challenges to Using Experimental Designs in Education and Psychological Research
Ethical problems in using human participants
Transformative Perspectives Regarding Experimental Research
Transformative researchers are divided as to the appropriateness of using single-group, experimental, and quasi-experimental designs for education and psychological research
Feminists express concerns over the rigidity needed to control extraneous variables and about sex bias in research
Common concern: research is done with all-White participants and then generalized to minority populations
Ethical arguments against randomized assignment:
A service should be provided to someone who truly needs it because they need it, not because of random assignment
Difficult logistically and ethically to randomly assign people to treatment or control groups in the real world
Is it ethical to deny one group treatment on a random basis?
Postpositivist Rejoinder
Concerns about denial of treatment could be addressed by offering the more effective treatment to those in the other group after the study is over
Assessing the benefit-risk ratio: postpositivists argue that the opportunities would not have been present if the study had not been conducted
Stop rules are another technique that can help attenuate risk
Stop rule: an explicit protocol to cease an experiment after a significant treatment effect of a specific magnitude is observed in preliminary analyses
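A schematic sketch (the thresholds and interim results are invented, and a real protocol would also correct for repeated looks at the data) of what a stop rule might encode.

```python
def should_stop(effect_size, p_value, min_effect=0.5, alpha=0.01):
    """Return True if an interim analysis meets the pre-specified stop rule.

    The thresholds here are hypothetical; an actual protocol would fix them
    in advance and adjust for the multiple interim looks at the data.
    """
    return effect_size >= min_effect and p_value < alpha

# Interim looks at the (hypothetical) accumulating results: (effect size, p-value)
interim_results = [(0.2, 0.30), (0.4, 0.04), (0.6, 0.004)]
for look, (effect, p) in enumerate(interim_results, start=1):
    if should_stop(effect, p):
        print(f"Stop at interim look {look}: effect={effect}, p={p}")
        break
    print(f"Continue after look {look}")
```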
Final Thoughts
Feminists who don’t oppose experimental methods have used them to expose myths about the inferiority of women as well as to raise consciousness on a number of other issues
Experimental research can be used to reveal persistent social and education disparities
However, such research might not be necessarily used as a basis for change because that is not an inherent ethical principle in the postpositivist paradigm
Scott-Jones discusses possible resolutions for the ethical dilemmas associated with experimental designs
Give the treatment to the control group after the experimental group receives it
Have two (or more) potentially beneficial treatments so all participants receive some intervention
Compare the treatment group outcome to some carefully chosen standard
Conduct intra-individual comparisons rather than cross-group comparison
Use individual baselines as comparison standards
Include minority researchers as part of the research team