THE DESIGN used in this study,1 quasi-experimental, is
unfamiliar to clinical readers. There are few easily available worksheets
or prescriptive methods for reviewing such an article. We used 2 excellent
resources to generate questions used in the evaluation of this article,2, 3 modified by 2 of us (M.H. and W.B.),
and reordered here to match the sequence of the study. The graphic (Figure 1) helps to arrange the methodological
issues we raise throughout the article. In the Web-based version we provide
links, depicted in the graphic, to provide further detail about the various
methodological considerations and threats to validity.
Figure 1. Study flow diagram. Each study phase engenders specific biases
and provides the basis for the discussion.
PRIOR TO THE STUDY
Was the Problem Clearly Stated?
Yes. There have been few long-term evaluation studies of educational
interventions to reduce the high rates of pregnancy and sexually transmitted
diseases (STDs) (including human immunodeficiency virus) in adolescents in
the United States. These public health problems are clearly linked to high-risk
sexual behaviors in adolescents, specifically in those adolescents living
in urban environments. In this study Siegel et al1
address this problem by evaluating a sexual risk prevention intervention for
middle school (MS) and high school (HS) students in an urban population.
What Design Was Used in This Study? Is This Design Experimental
or Quasi-Experimental?
The investigators used a quasi-experimental design with 3 school-based
sexual risk prevention intervention groups and 1 school-based health education
control group.
Are the Hypotheses Chosen by the Authors Intuitive and Reasonable? (Minimize
Threats due to Inadequate Conceptual Model)
Yes. The study is based on the hypothesis that the rates of teen pregnancy
and STDs are determined by the high-risk sexual behaviors of adolescents.
Using findings from prior work in this area, the authors make a reasonable
assumption that by providing sexual risk prevention education to youth, we
can reduce the prevalence of high-risk behaviors and, thus, reduce the incidence
of pregnancy and STDs.
POPULATION AND GENERALIZABILITY
Did the Authors Accurately Define the Target Population? Were
Attempts Made to Select a Representative Sample?
Yes. Teen pregnancy and STDs are more prevalent in adolescents living
in urban environments. To assess the effect of a sexual risk prevention education
program for this target population,
the investigators sampled a diverse group
of MS and HS students living in a large urban northeastern US city. The authors
clearly describe the distribution of students for race, ethnicity, and socioeconomic
status (based on available census tract data). This large study sample seems
to be representative of a US urban school-based population.
Was the Intervention Replicated in a Similar Setting? (Minimize
Threats due to Generalizability)
No. To our knowledge, this is the first study to assess the effect of
this specific sexual risk prevention education curriculum on long-term knowledge,
attitudes, and practices. However, the intervention in the study was implemented
in 10 schools within geographic confines, with students of similar ages and
backgrounds.
RECRUITMENT AND BASELINE ASSESSMENT
Were Detailed Demographic Characteristics of the Participants Obtained?
Yes. Demographic characteristics were obtained from the adolescents
participating in the study. These characteristics included age, sex, grade,
ethnicity, history of sexual intercourse experience, and proxies for socioeconomic
status.
Was There an Adequate Determination of Base Rates Used in Any
Independent or Dependent Variable?
Yes. All independent and dependent variables were assessed on the participants
prior to receiving the intervention or the control group's educational curriculum.
These variables included baseline scores for knowledge, sex self-efficacy,
safe behavior intention, sexual intercourse history, and sexual risk behaviors.
Did the Authors Delineate All Factors That Might Be Related to Outcome? Did
They Choose Reasonable, Appropriate, and Independent Variables?
The best statistical measure available to answer this question is R², the percentage of variability explained
by the independent (purportedly "explanatory") variables. The models in this
study had a wide range of values for R² (0.12-0.54). This variability indicates that there are other factors related
to the outcome that were not included in the analysis.
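As a rough illustration of what an R² in this range implies, the sketch below computes R² from observed and model-predicted scores and reports the share of variability left unexplained. The function names are hypothetical and nothing here reproduces the authors' models; only the two endpoint values (0.12 and 0.54) come from the text.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    ss_resid = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_resid / ss_total


def unexplained_share(r2):
    """Fraction of outcome variability not accounted for by the model."""
    return 1.0 - r2


# Endpoints of the R^2 range quoted in the appraisal (0.12-0.54).
for r2 in (0.12, 0.54):
    share = unexplained_share(r2)
    print(f"R^2 = {r2:.2f}: {share:.0%} of variability unexplained")
```

Even at the upper end of the range, nearly half of the variability in the outcome is not accounted for by the variables in the model, which is why the unmeasured factors discussed next matter.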
Some factors at the individual respondent level that might contribute
to the outcome include the adolescent's religion,4
his or her strength of support groups (ie, number and quality of friendships,
as well as the quality of the relationships with parents and other family
members),5, 6 the adolescent's
perception of risky behaviors of his or her friends,7, 8
and the influence of the media's portrayal of sexuality.9
Another factor for adolescent girls could include a participant's intention
to become pregnant. Although the investigators included this factor in the
sex risk behaviors score, it may be a stronger factor than the others assessed
in the construct.
Factors related to the outcome at the school level include the rate
of transience in the school, the school's atmosphere (this was controlled
for in the study by including a class climate variable), and the rates of
pregnancy and STDs in the schools. Understandably, many of these factors are
difficult to assess in a self-administered survey, and the authors did an
excellent job at attempting to cover the most appropriate variables.
Were Pretest Measures Used to Detect a Pattern or Trend Prior to the
Intervention? (Minimize the Effect of Maturation Bias)
Yes. The investigators measured the subjects' scores prior to the intervention,
and each individual's pretest score was included as a covariate in the analysis.
It is understandable that there were probably no data for trends in high-risk
sexual behaviors in the study sample prior to the intervention. However, it
would have been helpful to see data on any trends in the rates of pregnancy
or STDs for students in the schools included in the study. We are, therefore,
unable to assess the effect of any maturational changes that may influence
the outcome of interest. For example, if during the study period high-risk
sexual behaviors were trending upward and the authors find a decrease in self-reported
behaviors in the intervention group (compared with control subjects), the
improvement would have even greater significance.
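The reasoning about background trends can be made concrete with a simple difference-in-differences calculation. This is only a sketch with hypothetical percentages; it is not the analysis used by Siegel et al, who relied on repeated-measures analyses of variance with covariates.

```python
def difference_in_differences(pre_int, post_int, pre_ctl, post_ctl):
    """Change in the intervention group minus change in the control group."""
    return (post_int - pre_int) - (post_ctl - pre_ctl)


# Hypothetical percentages of students reporting a high-risk behavior.
pre_int, post_int = 30.0, 27.0  # intervention group falls by 3 points
pre_ctl, post_ctl = 30.0, 34.0  # control group drifts upward by 4 points

naive_change = post_int - pre_int
adjusted_effect = difference_in_differences(pre_int, post_int, pre_ctl, post_ctl)

print(f"Change in intervention group alone: {naive_change:+.1f} points")
print(f"Effect relative to the control trend: {adjusted_effect:+.1f} points")
```

With these made-up numbers, the 3-point decrease in the intervention group corresponds to a 7-point improvement once the upward trend in the control group is taken into account.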
ASSIGNMENT
Was Random Assignment to Intervention Groups Used?
No. The study was quasi-experimental. As noted by the authors, randomization
would not have been possible for conducting a large school-based study without
interfering with routine school practice. The assignment of classes to intervention
or control group was based on feasibility and availability of peer educators.
To strengthen the external validity of the findings, however, all health classes
in each of the MSs and HSs were assigned to both the experimental and control
groups. In addition, although the authors were unable to randomize students,
they appropriately assessed baseline demographics and pretest scores and controlled
for differences in their analysis.
Did the Sample Size Used Have Adequate Power to Detect a Difference
Between Intervention and Comparison Groups?
Yes.
Were Any Differences in Group Characteristics Found on Pretesting Adjusted
for in Statistical Analysis? (Minimize Threats due to Selection Bias)
Yes. Differences in group characteristics were adjusted for in the analysis.
Demographics (age, sex, socioeconomic status, and ethnicity), the proportion
of male and female students who reported sexual experience, and all other
study variables (knowledge, self-efficacy scores, and others) were analyzed
after stratifying by MS and HS groups, as well as by sex.
INTERVENTION
Were Multiple Pilot Tests or Pretest Observations Obtained? (Minimize
Threats due to Instrumentation or Interaction With Selection That Differentiate the Groups)
Yes. The survey was pilot tested on 450 students prior to the study.
Was the Intervention Implemented Consistently and Reliably?
The intervention needed to be implemented without too much disruption
of the routine MS and HS curricula. The investigators went to great efforts
to ensure consistent and reliable implementation at all sites participating
in the study. There could have been variability in the teaching styles and
effectiveness of the educators in the participating schools. However, the
Rochester AIDS Prevention Project for Youth (RAPP) adult health educators
were described as highly trained and the RAPP peer educators received 50 hours
of intensive training. Both RAPP adult and peer educators taught classes as
pairs. While attempts were made to maintain consistency and reliability, it
is not clear if the educators' teaching ability was assessed prior to the
study implementation. In addition, an observer could have ensured consistency
in the content received in the intervention and control groups across participating
sites.
Was Contact Between Intervention and Control Groups Minimized? (Minimize
Threats due to Treatment Contamination, Atypical Responses, and Reporting Bias)
Because the study occurred at multiple sites, and the long-term evaluation
lasted from 14 to 80 weeks after the intervention, students may have moved
to different schools. As such, it could have been possible for students to
receive education from both the intervention and control groups. The investigators
minimized this problem by assigning unique identifiers for both the students
and the schools. Therefore, if a student did change schools, the investigators
would be aware of it. In addition, at no time did intervention and control
classes occur in the same school during a given semester.
Was a Process Evaluation Used to Monitor the Administration of the
Intervention? Was the Intervention Process Evaluated Systematically?
No. It is not clear in the article if a process evaluation was performed.
We are given information about attrition rates in the MSs and HSs, but we
do not know much about the day-to-day activities that occurred during the
study. It would have been helpful to know the average number of sessions each
student attended. This factor could be used not only to monitor the
study; it might also affect the outcome evaluation. We are
also not told about the average number of students in each class. This factor
may be included in the climate score variable, but it would seem important
to determine class size as a process variable.
Were Attempts Made to Minimize Data Recording Errors (Such as
Double Data Entry)?
We were unable to determine this from the information provided in the
article.
OUTCOMES
Did the Authors Specify the Expected Outcomes?
Yes. The authors clearly specified the expected outcomes for the study.
The outcomes included knowledge, sex self-efficacy, behavior intention, and
behaviors including the onset of sexual intercourse experience and engagement
in risky sexual behaviors.
Were the Dependent Measures Used Valid and Reliable?
Yes. Due to a scarcity of reliable and valid measures in the existing
literature, the authors designed scales that were pretested in a pilot study
of 450 students. The scales were based on prior research, previously validated
instruments (Youth Risk Behavior Survey), consultation with experts in adolescent
health, and the Theory of Reasoned Action.10
Reliabilities were calculated for the MS and HS students separately using
the Cronbach α to assess internal consistency and test-retest reliability.
Both statistical measures revealed high reliability.
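For readers unfamiliar with the Cronbach α, the following minimal sketch shows how it is computed from item-level scores. The item scores are invented for illustration and are not data from the study. The α reflects internal consistency only; test-retest reliability is a separate measure (typically a correlation between two administrations of the same scale) and is not shown here.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach alpha for a scale.

    items: list of k lists, one per scale item, each holding the scores
    of the same respondents in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach alpha needs at least two items")
    # Total scale score for each respondent (sum across items).
    totals = [sum(scores) for scores in zip(*items)]
    sum_item_variances = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))


# Hypothetical scores: 3 items answered by 5 respondents.
toy_items = [
    [2, 3, 4, 4, 5],
    [1, 3, 3, 4, 5],
    [2, 2, 4, 5, 5],
]
alpha = cronbach_alpha(toy_items)
print(f"Cronbach alpha = {alpha:.2f}")
```

Values close to 1 indicate that the items move together across respondents, which is the sense in which a scale is described as internally consistent.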
Were Multiple Indicators Used to Assess Each Dependent and Independent
Variable? (Minimize Threats due to Unreliable Operational Definitions,
Poor Generalization of Construct to Other Constructs)
Yes.
ANALYSIS
Were Appropriate Statistical Tests Used?
Yes. The authors used repeated-measures analyses of variance to assess
differences between intervention and control groups stratified by sex and
type of school (MS or HS). In addition to being in the intervention or control
group, the investigators wished to investigate ethnicity and history of sexual
intercourse experience as major factors associated with the outcomes. As such,
each of these factors was entered one at a time controlling for the following
covariates: demographics, general life risk score, mean score for length of
time since the intervention, the class climate score, and the pretest score
for each individual. Considering that the quasi-experimental design is not
frequently described in the pediatric literature, it would have been helpful
if the authors included issues related to interpreting the particular statistical
tests used in their analysis.
Did the Authors Use Baseline Assessment to Effectively Evaluate If
Their Intervention Improves Behavior and/or Knowledge?
Yes. As mentioned previously, the authors conducted a pretest with all
participants to assess baseline knowledge and behavior. Since the same scales
were used in the long-term follow-up survey, the measures can appropriately
be compared to assess differences from the intervention.
Do the Authors Satisfactorily Address the Alternative Hypotheses That
Might Explain the Observed Effect?
The authors describe a positive long-term effect of the RAPP intervention
(compared with the control group) in knowledge, self-efficacy, behavior intention,
and self-reported behaviors. Although statistical significance was not achieved
in all of these areas, scores in the intervention groups were in the "safer"
direction. They found the participant's pretest score and report of sexual
history to be major predictors of posttest scores. The authors do suggest
that the positive effects from the intervention could have been due to the
high attrition rates (most notable in the 12th graders). Those participants
who were lost to follow-up may have had no change, or possibly a worsening
in their outcome scores at long-term follow-up. Appropriately, the authors
examined the pretest data from these students and found them to be higher
sexual risk students than students who remained in the study.
OVERALL
Would You Recommend That the Editor Accept This Article for Publication?
Yes.
What Additional Information Would You Request From the Authors for
Revisions?
As mentioned previously, a major concern we have relates to the large
number of dropouts in this study. Attrition rates were 45% for HS students
and 28% for MS students. As noted, many of these students at pretest were
in the higher-risk sexual behavior group, and losing them may have made the
positive results of the study look more impressive. Clearly, those students
who were lost to follow-up represent a truly different group than those who
were retained in the study. In fact these dropouts may contribute disproportionately
more to the problems of teen pregnancy and STDs than those students who remain
in school. A future challenge, therefore, would be the development of sexual
risk prevention interventions to specifically target these most at-risk youth.
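The concern about attrition can be illustrated with simple arithmetic. In the sketch below, only the attrition rates (45% for HS and 28% for MS) come from the article; the cohort size and the risk scores are hypothetical. It shows how losing higher-risk students lowers the average risk score of those who remain, even if the intervention had no effect at all.

```python
def completer_mean(n_total, attrition_rate, mean_all, mean_dropouts):
    """Mean score among students who remain in the study.

    Derived from: n_total * mean_all = n_drop * mean_dropouts + n_stay * mean_stay.
    """
    n_drop = n_total * attrition_rate
    n_stay = n_total - n_drop
    return (n_total * mean_all - n_drop * mean_dropouts) / n_stay


# Hypothetical cohort: 1000 students, mean pretest risk score 5.0,
# with dropouts averaging a higher pretest risk score of 6.0.
for label, rate in (("HS", 0.45), ("MS", 0.28)):
    remaining = completer_mean(1000, rate, 5.0, 6.0)
    print(f"{label} ({rate:.0%} attrition): mean risk score of completers = {remaining:.2f}")
```

Under these assumptions the completers look lower risk than the full cohort before any intervention effect is considered, and the distortion is larger where attrition is higher.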
CONCLUSION
The major conclusion of the study is that when implemented and evaluated
in a large MS and HS sample of urban youth, the RAPP intervention was most
helpful in improving safer sex knowledge, self-efficacy, behavior intention,
and self-reported behaviors in students who were not engaging in high-risk
behaviors prior to the intervention. The authors illustrate this point in
their final statement of the article, and we look forward to seeing the RAPP
intervention implemented and evaluated on a younger, sexually inexperienced
sample of students.
Our own conclusion from this article is that the investigators used
a thoroughly planned design, a difficult but successful implementation, and
a comprehensive and intensive evaluation. Although the generalizability of
quasi-experimental designs is always in question, in this case the statistical
analysis used is appropriate, the interpretation of results is clear and understandable,
and the overall findings are compelling and believable.
FURTHER READING
Behi R, Nolan M. Quasi-experimental research designs. Br J Nurs. 1996;5:1079-1081.
Cook TD, Campbell DT. Quasi-Experimentation: Design
and Analysis Issues for Field Settings. Boston, Mass: Houghton Mifflin;
1979.
Freemantle N, Wood J, Crawford F. Evidence into practice, experimentation
and quasi-experimentation: are the methods up to the task? J Epidemiol Community Health. 1998;52:75-81.
Morgan GA, Gliner JA, Harmon RJ. Quasi-experimental designs. J Am Acad Child Adolesc Psychiatry. 2000;39:794-796.