The program includes a plan for the evaluation of school reform implementation and student achievement results.
Introduction
Program evaluation is often overlooked or data are collected in a hurried fashion by staff with little time or expertise. On the other hand, it is not unusual for staff to spend hours developing questionnaires and surveys and gathering data only to find the analysis leaves them with unanswered questions. This section will provide a quick and concise overview of program evaluation and will include some tools for developing evaluation plans. We begin with some definitions.
From Definitions to the Big Picture
Evaluation: Delineating, collecting, and analyzing data to provide information for making decisions. Program evaluation is a systematic process designed to determine the effectiveness of a particular program (whole program focus).
Formative Evaluation: Evaluation designed to gather data that will help improve a program during its operation (during implementation). Formative implementation evaluation generates information used to guide decision making about the program's desirability, feasibility, fidelity, and soundness.
Summative Evaluation: Summative evaluation involves the collection of data necessary for judging the ultimate success of the entire program.
Triangulation: The idea of using more than one data source to confirm findings, that is, to compare sets of data to see if the findings all are in agreement. For example, all teachers might have attended model-developer training (1), and the majority may "self-report" that they are, in fact, implementing the new strategies proposed by the model (2). Independent observers conducting classroom observations (3) may not, however, see evidence of meaningful change in classroom practice. Without this third data set, one might conclude that the strategies are, in fact, being faithfully implemented. If summative evaluation later showed no progress in student achievement, one might conclude that the strategies were unsuccessful, when in fact, they were never implemented to the level that research shows can have an impact on student achievement.
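The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields, the teacher labels, and the sample values are hypothetical and are not drawn from any actual CSR data set. The sketch computes the share of teachers that each of the three sources counts as implementing the strategies and lists the teachers for whom the sources disagree.

```python
from dataclasses import dataclass


@dataclass
class TeacherRecord:
    # All field names and sample values are hypothetical, for illustration only.
    name: str
    attended_training: bool   # source 1: training attendance records
    self_reports_use: bool    # source 2: teacher self-report survey
    observed_use: bool        # source 3: independent classroom observation


def triangulate(records):
    """Return the share of teachers each source counts as implementing,
    plus the names of teachers for whom the three sources disagree."""
    n = len(records)
    shares = {
        "attended_training": sum(r.attended_training for r in records) / n,
        "self_reports_use": sum(r.self_reports_use for r in records) / n,
        "observed_use": sum(r.observed_use for r in records) / n,
    }
    disagreements = [
        r.name
        for r in records
        if len({r.attended_training, r.self_reports_use, r.observed_use}) > 1
    ]
    return shares, disagreements


records = [
    TeacherRecord("Teacher A", True, True, True),
    TeacherRecord("Teacher B", True, True, False),
    TeacherRecord("Teacher C", True, True, False),
    TeacherRecord("Teacher D", True, False, False),
]

shares, disagreements = triangulate(records)
print(shares)
print(disagreements)
```

With these made-up records, attendance and self-report suggest broad implementation while observation does not, which is the pattern the example in the definition warns about.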
Benchmarks: A set of reference points between existing levels of conditions and expected levels or goals that serve as measures of progress toward the desired conditions or goals. For example, student performance benchmarks are specific achievement levels expected for a given group of students at given points in time. Teacher implementation benchmarks reference changes in classroom practice across the staff, over a period of time. If done well, the evaluation could not only show where teachers are with implementation but also provide data to understand why some are having difficulty, thus enabling individualized support to help them reach benchmarks.
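As a rough illustration of benchmarks as reference points between a baseline and a goal, the following Python sketch spaces benchmarks evenly across a number of checkpoints and flags the checkpoints where observed results fall short. The even spacing is an assumption made for simplicity, and the function names and numbers are hypothetical; a school might reasonably set uneven benchmarks instead.

```python
def linear_benchmarks(baseline, goal, checkpoints):
    """Evenly spaced benchmark levels from the baseline up to the goal.

    Assumption: progress is expected to be linear across checkpoints.
    """
    step = (goal - baseline) / checkpoints
    return [baseline + step * i for i in range(1, checkpoints + 1)]


def checkpoints_below_benchmark(observed, benchmarks):
    """Return the 1-based checkpoints where the observed level is below the benchmark.

    Only checkpoints that already have observed data are compared.
    """
    return [
        i
        for i, (obs, bench) in enumerate(zip(observed, benchmarks), start=1)
        if obs < bench
    ]


# Hypothetical example: percent of students meeting a standard,
# starting at 40 with a goal of 70 after three annual checkpoints.
benchmarks = linear_benchmarks(40.0, 70.0, 3)
observed = [52.0, 57.0]  # results available for the first two checkpoints
behind = checkpoints_below_benchmark(observed, benchmarks)
print(benchmarks)
print(behind)
```

A checkpoint flagged this way is a prompt to look at the formative data for reasons, not a judgment in itself.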
Research: Obtaining generalized knowledge by contriving and testing claims about relationships among variables (narrowly defined focus). For example, a research study might focus on the implementation of a particular set of strategies and the obstacles faced by those trying to implement them. The findings would be of interest to anyone trying to implement the same strategies under similar circumstances. The research findings would certainly be of great interest to those who were studied, but would not provide all the data needed to track progress across the entire program. That is more within the scope of the Program Evaluation plan with both process (formative) and outcome (summative) measures.
Ongoing Research
National evaluation of CSR efforts includes the following:
- Examining baseline information
- Conducting large-scale longitudinal data collection
- Conducting focused studies of implementation and impact
- Looking at CSR in the field through selected site visits and implementation reports
- Examining state and district data on local implementation
The federal CSR legislation also mandates that state and local education agencies (SEAs and LEAs) evaluate implementation and measure results achieved in improving student academic achievement. The state level evaluation of CSR implementation and outcomes varies from state to state. Additional information on a particular state's evaluations of CSR is available through the CSR Coordinator at that SEA.
Schools and districts have at least two categories of evaluations:
- Program implementation or formative evaluation
- Student performance data or summative evaluation
Locally collected data should have a direct impact on the decisions made at the local level to revise and improve the comprehensive plan each year.
Reviewing Benchmarks: What, When, How, Who, and Why?
Evaluation consists of the following seven steps:
- Focusing
- Planning
- Collecting
- Analyzing
- Reporting
- Action planning
- Finding and Using Resources
The steps constitute a feedback loop. Data drive actions; these actions are evaluated; data help refine the school's next actions.
Table 1 provides a set of first questions to ask in developing a useful evaluation plan, while Table 2 provides follow-up questions. A small group of school personnel and other stakeholders who are willing to look critically at the current evaluation plan may use these questions to begin the process. Ideally, the group includes some people who helped develop the current plan, some who have already been collecting and analyzing data, and others who are new to the process and have less ownership of the previous or current plan. All should be committed to learning new things about evaluation and using data to drive reform.
Staff can use Table 3, Benchmarking Comprehensive Reform Initiatives, to benchmark initiatives and provide a timeline of key events in the implementation of the comprehensive plan. Table 4 will help a group set up measurable goals and objectives.
The following tools are designed to help staff members begin to think carefully about the purpose of evaluation and how they might develop a useful evaluation plan.
Because schools are not static, all planning and evaluation takes place in the midst of, even layered on top of, previous planning and evaluation efforts. If an evaluation plan has been developed and needs to be refined, staff can start by carefully reviewing the previously gathered data and the information they have provided. This will help determine whether some evaluation tasks should be dropped (they are not providing useful information) and whether some important aspects of program implementation are not being monitored.
Digging Deeper into Benchmarks
Table 5 focuses on benchmarks for changes in instruction and facilitates the examination of the data intended for collection to determine whether these data will actually aid in making decisions for improvement.
Developing Instruments, Analyzing, and Reporting Data
Issues in data collection include the following:
- Selection or development of instruments
- Use of qualitative versus quantitative data
Data analysis issues include
Other Resources
Bernhardt, V. L. (1998). Data analysis for comprehensive schoolwide improvement. Larchmont, NY: Eye on Education.
Bernhardt, V. L. (1999). The school portfolio: A comprehensive framework for school improvement. Larchmont, NY: Eye on Education.
Hassell, B. (1998). Comprehensive School Reform-Making Good Choices: A Guide for Schools and Districts. Oak Brook, IL: NCREL.
Herman, J. L., & Winters, L. (1992). Tracking your school's success: A guide to sensible evaluation. Newbury Park, CA: Corwin Press, Inc.
Isaac, S., & Michael, W. B. (1997). Handbook in research and evaluation for education and the behavioral sciences (3rd ed.). San Diego, CA: EdITS/Educational and Industrial Testing Service.
Joint Committee on Standards for Educational Evaluation. (1994). The program evaluation standards: How to assess evaluations of educational programs (2nd ed.). Thousand Oaks, CA: Sage Publications.
Lezotte, L. W., & Jacoby, B. C. (1992). Sustainable school reform: The district context for school improvement. Okemos, MI: Effective School Products.
Worthen, B. R., Sanders, J. R., & Fitzpatrick, J. L. (1997). Program evaluation: Alternative approaches and practical guidelines (2nd ed.). New York, NY: Longman Inc.