Improving Student Achievement

This article presents information about the recent increased emphasis on using scientifically based research to improve student achievement and teacher effectiveness. It explains new vocabulary, discusses research challenges, and describes how scientifically based research can be turned into effective practice. It also provides an overview of the What Works Clearinghouse and offers resources and tools for using data to make decisions in schools.

Teachers and administrators alike are challenged by the No Child Left Behind Act of 2001 to incorporate scientifically based research (SBR) into their decision making for programs and practices that will improve student achievement. Educational researchers also are challenged to produce studies that are faithful to scientifically based principles. As a result of the No Child Left Behind legislation, practitioners and researchers need, now more than ever, to link their efforts to address student learning. The challenge is to base practice on rigorous evidence that specific programs will work to guide teaching and learning and, at the same time, to recognize that this type of research is not readily available to or understood by most administrators and classroom teachers. Finding new opportunities for educators, policymakers, and researchers to work together on behalf of schools affected by the new legislation is a challenge, and it instills hope that SBR will provide better direction and evidence that student achievement can improve.


This essay outlines the place of SBR in the No Child Left Behind Act of 2001, provides details about the mandate for SBR in the Comprehensive School Reform legislation, and explains the rationale and challenges for using SBR in making education decisions. It also provides important definitions and outlines tools for translating research into practice.

Federal- and State-Level Policy Options


  • Raise standards and expectations for all students. Judy Jeffrey, the Chief State School Officer in Iowa, argues that state leaders need to engage the community to raise awareness and create a sense of urgency about the need for high school reform. Such dialogue with key stakeholders would build the political will necessary to raise standards at the state level to ensure that U.S. high school students become competitive with their international peers.
  • Decrease dropout rates. Several experts highlighted the dropout crisis that affects many U.S. high schools, particularly those in urban areas. States and districts need to develop programs that will prevent dropouts, increase student engagement, and provide a variety of options for out-of-school youth to successfully complete a high school diploma and continue on to postsecondary education.
  • Develop long-term goals. Hilary Pennington, of Jobs for the Future, contends that federal and state policymakers need to focus less on the results of annual tests and more on the long-term goal of increasing the number of students who complete college. This long-term goal should drive education policy at the state level.


  • Federal and state policies to support innovation. Pennington argues that students should have access to a variety of different pathways to complete a postsecondary degree or credential. All of these pathways should hold students to high academic standards but also should provide them with choices that meet a variety of student interests and needs. In order to encourage educators to think outside the box of the traditional high school, state and federal policymakers should develop grant programs that fund educators who would like to experiment with a variety of high school models. The funding agency would then evaluate the impact of these innovative models and disseminate the results to allow other educators to build on the successes of the models.


  • District support for innovation. Pennington contends that districts can provide schools with more autonomy over budgets and staffing to promote innovation at the school level. Chris Steinhauser, superintendent of Long Beach (California) Unified School District, highlights the positive outcomes that resulted from his district’s success with allowing principals considerable autonomy in return for meeting the district goals of raising expectations for all students, providing support for students in mathematics and reading, and expanding access to Advanced Placement courses.


  • Develop innovative teacher preparation programs. Ellen Guiney, of the Boston Plan for Excellence, has developed a teacher preparation model at the district level to prepare teachers in Boston Public Schools for the challenges of working with low-performing students in an urban school system. This model offers in-service training, mentoring, and support with curriculum materials that are tailored to the school system’s goals. States and districts should engage institutions of higher education in a dialogue about teacher preparation programs to ensure that teachers have command of their content and are highly skilled with a wide array of pedagogical methods that will enable all students in their classrooms to be successful with a challenging curriculum.



  • Target additional resources to those schools with low-performing students. Robert Balfanz, Ph.D., of the Center for the Social Organization of Schools at Johns Hopkins University, argues that state and federal policymakers need to increase funding and revise policies to ensure that additional resources are targeted to both elementary and secondary schools that serve large numbers of low-performing students who need extra support. In return, schools should be held accountable for using resources wisely.


  • Identify opportunities to coordinate K-12 and postsecondary funding streams.

In most states, average daily attendance (ADA) funds finance high schools. If high school students are participating in a program at a local community college, the high school may lose ADA funding because students are not present at the high school for the entire day. In contrast, colleges and universities typically receive state funds based on full-time equivalent (FTE) student enrollment. However, four-year colleges may not be able to claim FTE student enrollment reimbursement for students who are still in high school. Pennington contends that policymakers should explore opportunities to fund a variety of programs that will allow high school students to earn college credit while in high school.

Exploring New Approaches to Policy

Although it will take some time to improve the quality of the research base related to high school improvement strategies, policymakers and practitioners continue to forge ahead with the development of innovative efforts to improve the quality of high school education in their states, districts, and schools. High school reform efforts are receiving increased national attention.


At the federal level, the U.S. Department of Education launched “Preparing America’s Future,” the Secretary’s High School Initiative, in October 2003. In conjunction with this initiative, the U.S. Department of Education hosted regional summits in the spring of 2004 and national summits in October 2003 and December 2004. The themes of this initiative are high expectations, student engagement and options, teaching and leadership, and accelerated transitions. The Education Department’s Web page includes links to research and examples of programs, as well as a number of papers and materials developed specifically for the initiative.


A second significant advocate for national high school reform efforts has been the Bill and Melinda Gates Foundation. The foundation is committed to improving the quality of high school education and expanding the development of small schools through a grant program that is being implemented across the country. Nearly $1.2 billion has been invested in high school improvement efforts. The goal of the program is to improve high school graduation and college preparedness rates by fostering dynamic high schools that help all students prepare for college and work through a rigorous and challenging curriculum, stronger relationships between students and teachers, and more relevant coursework.


Spurred by the leadership of the National Governors Association (NGA), state leaders also have been actively involved in high school reform. NGA recently launched a yearlong initiative to improve U.S. high schools called “Redesigning the American High School,” emphasizing many of the same themes as the U.S. Department of Education. Further information about this initiative is available online. In partnership with the Gates Foundation, NGA also recently announced high school redesign grants that will provide a significant amount of funding for 10 statewide high school improvement plans in Arkansas, Delaware, Indiana, Louisiana, Maine, Massachusetts, Michigan, Minnesota, Rhode Island, and Virginia.

A Comprehensive High School Improvement Plan for States

To prepare high school graduates for postsecondary education and for work, the National Governors Association recommends that state leaders implement the following improvement policies:

Restore value to the high school diploma.

  • Anchor high school academic standards in the real world.
  • Upgrade high school coursework.
  • Create college- and work-ready tests.

Redesign high schools.

  • Reorganize low-performing high schools first.
  • Expand high school options in all communities.
  • Provide support to low-performing students.

Give high school students the excellent teachers and principals they need.

  • Improve teacher knowledge and skills.
  • Provide incentives to recruit and keep teachers where they are needed most.
  • Develop and support strong principal leadership.

Set goals, measure progress, hold high schools and colleges accountable.

  • Set goals and measure progress.
  • Strengthen high school accountability.
  • Intervene in low-performing high schools.
  • Strengthen postsecondary accountability.

Streamline and improve education governance.

  • Enable K-12 and postsecondary systems to work more closely together.

The policy recommendations of the experts featured on the accompanying CD overlap with several of NGA’s action items. NGA and the Viewpoints experts agree that policymakers should increase the rigor of the high school curriculum, provide additional services for low-performing students, and support educators who are experimenting with innovative high school programs. However, the experts on our CD focused on the classroom as the critical arena for change, while NGA has targeted its recommendations to state-level policymakers. The recommendations from the experts we interviewed fall into three general categories: accountability, innovation, and funding. Details about these recommendations are discussed in the section that follows.

Essay Length and Holistic Score

This text examines the relationship between essay length and holistic scores assigned to Test of English as a Foreign Language (TOEFL) essays by e-rater, the automated essay scoring system developed by ETS. Results show that an early version of the system, e-rater99, accounted for little variance in human reader scores beyond that which could be predicted by essay length. A later version of the system, e-rater, performs significantly better than its predecessor and is less dependent on length due to its greater reliance on measures of topical content and of complexity and diversity of vocabulary. Essay length was also examined as a possible explanation for differences in scores among examinees with native languages of Spanish, Arabic, and Japanese. Human readers and e-rater show the same pattern of differences for these groups, even when effects of length are controlled.
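The question of how much variance in human reader scores a machine score explains beyond essay length can be sketched as an incremental R² calculation. What follows is a minimal illustration only, not the analysis used in the study: the essay data are invented, and the function names (`pearson`, `residualize`, `incremental_r2`) are hypothetical. It relies on the standard result that the gain in R² from adding a second predictor equals the squared correlation between the outcome and that predictor after length has been regressed out of it.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either variable has no variance."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def residualize(ys, xs):
    """Residuals of ys after a simple linear regression of ys on xs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


def incremental_r2(human, length, machine):
    """Return (R² from length alone, extra R² from adding the machine score)."""
    r2_length = pearson(human, length) ** 2
    machine_beyond_length = residualize(machine, length)
    extra = pearson(human, machine_beyond_length) ** 2
    return r2_length, extra


# Invented data for six essays (illustration only, not TOEFL data).
word_counts = [120, 180, 210, 260, 310, 400]
human_scores = [2, 3, 3, 4, 5, 5]
machine_scores = [2, 2, 4, 4, 5, 6]

r2_length, extra = incremental_r2(human_scores, word_counts, machine_scores)
print(f"R² from length alone: {r2_length:.3f}")
print(f"Additional R² from machine score: {extra:.3f}")
```

A small second value would correspond to the situation described for the early system, in which the machine score adds little to what length already predicts.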

The National Council on the Testing of English as a Foreign Language developed the Test of English as a Foreign Language (TOEFL) in 1963. The Council was formed through the cooperative effort of more than 30 public and private organizations concerned with testing the English proficiency of non-native speakers of the language applying for admission to institutions in the United States. In 1965, Educational Testing Service (ETS) and the College Board assumed joint responsibility for the program. In 1973, ETS, the College Board, and the Graduate Record Examinations (GRE) Board entered into a cooperative arrangement for the operation of the program. The membership of the College Board is composed of schools, colleges, school systems, and educational associations; GRE Board members are associated with graduate education.


ETS administers the TOEFL program under the general direction of a policy board that was established by, and is affiliated with, the sponsoring organizations. Members of the TOEFL Board (previously the Policy Council) represent the College Board, the GRE Board, and such institutions and agencies as graduate schools of business, junior and community colleges, nonprofit educational exchange agencies, and agencies of the United States government.


A continuing program of research related to the TOEFL test is carried out under the direction of the TOEFL Committee of Examiners. Its 12 members include representatives of the TOEFL Board and distinguished English as a second language specialists from the academic community. The Committee meets twice yearly to review and approve proposals for test-related research and to set guidelines for the entire scope of the TOEFL research program. Members of the Committee of Examiners serve four-year terms at the invitation of the Board; the chair of the committee serves on the Board.


Because the studies are specific to the TOEFL test and the testing program, most of the actual research is conducted by ETS staff rather than by outside researchers. Many projects require the cooperation of other institutions, however, particularly those with programs in the teaching of English as a foreign or second language and applied linguistics. Representatives of such programs who are interested in participating in or conducting TOEFL-related research are invited to contact the TOEFL program office. All TOEFL research projects must undergo appropriate ETS review to ascertain that data confidentiality will be protected.

E-rater Scores

As noted earlier, automatic scoring systems, such as e-rater, require training data. E-rater measures numerous features of writing in its training essays and then uses a stepwise linear regression procedure to select the features (usually a small set of 8 to 10) that are most predictive of essay score for each prompt. While the overall feature set is highly varied, it contains no direct measure of essay length. As we described it in 1999,


The driving concept that underlies e-rater is that it needs to evaluate the same kinds of features that human readers do. This is why from the beginning of its development, we made it a priority to use features from the scoring guide and to eliminate any direct measures of essay length. Even though length measures can be shown to be highly correlated with human reader essay scores, length variables are not scoring guide criteria. Although some researchers have suggested adding explicit measures of length to improve correlations with human readers, the assessment community is generally concerned about the effect this would have on coachability. E-rater features are based on four general types of analysis: syntactic, discourse, topical, and lexical.
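The feature-selection step mentioned above (measuring many features and keeping the small set most predictive of essay score) can be illustrated with a simplified sketch. This is not e-rater's procedure: it is a forward-only greedy variant of stepwise regression, the stopping rule (`min_gain`, `max_features`) is an assumption, and the feature names and values are invented. At each step it adds the feature that most increases R², then removes that feature's contribution from the scores and from the remaining features.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def center(values):
    m = sum(values) / len(values)
    return [v - m for v in values]


def remove_direction(vec, direction):
    """Subtract from vec its least-squares projection onto direction."""
    coef = dot(vec, direction) / dot(direction, direction)
    return [v - coef * d for v, d in zip(vec, direction)]


def forward_select(features, scores, max_features=10, min_gain=0.01):
    """Greedy forward selection; returns [(feature name, gain in R²), ...]."""
    y = center(scores)
    total_ss = dot(y, y)
    if total_ss == 0:
        return []
    remaining = {name: center(col) for name, col in features.items()}
    original_ss = {name: dot(col, col) for name, col in remaining.items()}
    selected = []
    while remaining and len(selected) < max_features:
        best_name, best_gain = None, 0.0
        for name, col in remaining.items():
            ss = dot(col, col)
            # Skip features that are constant or already explained by
            # previously selected features.
            if ss <= 1e-9 * original_ss[name]:
                continue
            gain = dot(y, col) ** 2 / ss / total_ss
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None or best_gain < min_gain:
            break
        chosen = remaining.pop(best_name)
        selected.append((best_name, best_gain))
        # Remove the chosen feature's contribution from everything else.
        y = remove_direction(y, chosen)
        remaining = {n: remove_direction(c, chosen) for n, c in remaining.items()}
    return selected


# Invented feature values for six training essays (illustration only).
training_features = {
    "subordinate_clauses": [1, 2, 2, 4, 5, 6],
    "discourse_cue_terms": [0, 1, 3, 2, 4, 4],
    "vocabulary_diversity": [3, 1, 4, 1, 5, 2],
}
training_scores = [1, 2, 3, 4, 5, 6]

for name, gain in forward_select(training_features, training_scores, max_features=2):
    print(f"{name}: +{gain:.3f} R²")
```

A full stepwise procedure would also test whether previously entered features should be dropped, and would typically use a significance test rather than a fixed gain threshold.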


Syntactic analysis. The basis for syntactic analysis is parsing — the process of making explicit the syntactic structure of sentences. This requires tagging each word in the essay with its appropriate part of speech and then assembling the words into phrases and clauses. (One important improvement in e-rater has been in the quality of its syntactic analysis, due primarily to improved part-of-speech tagging.) The parser identifies several syntactic structures, such as subjunctive auxiliary verbs (e.g., would, should, might), and complex clausal structures, such as complement, infinitive, and subordinate clauses. Recognition of these features yields information about the essay’s syntactic variety. The parsed sentences also provide the input for discourse analysis.


Discourse analysis. Organization of ideas is another criterion that the scoring guide asks human readers to consider in assigning an essay score. E-rater contains a lexicon based on the conceptual framework of conjunctive relations from Quirk, Greenbaum, Leech, and Svartvik in which cue terms, such as In summary and In conclusion, are classified as conjuncts used for summarizing. The conjunct classifiers contain information about whether or not the item is a kind of discourse development term, or whether it is more likely to be used to begin a discourse statement (e.g., First, Second, or Third).


E-rater also contains heuristics that define the syntactic or essay-based structures in which these terms must appear to be considered as discourse markers. For example, for the word first to be considered a discourse marker, it must not be a nominal modifier, as in the sentence, “The first time I went to Europe was in 1982,” in which first modifies the noun time. Instead, first must occur as an adverbial conjunct to be considered a discourse marker, as in the sentence, “First, it has often been noted that length is highly correlated with essay score.” The lexicon of cue terms and the associated heuristics are used by e-rater to automatically annotate a high-level discourse structure of each essay. These annotations are also used by the system to partition each essay into separate arguments, which are input to the topical analysis component.
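The distinction drawn above between first as a nominal modifier and first as an adverbial conjunct can be approximated with a very rough sketch. This is not e-rater's heuristic, which works on parsed sentences; as a stand-in, the code assumes that a cue term counts as a discourse marker only when it opens the sentence and is set off by a comma. The function names and the short cue list are hypothetical, with the cue terms taken from the examples in the text.

```python
import re

# Cue terms mentioned in the text; a real lexicon would be far larger.
CUE_TERMS = ("first", "second", "third", "in summary", "in conclusion")


def is_discourse_marker(sentence, cues=CUE_TERMS):
    """True if the sentence opens with a cue term followed by a comma.

    This surface pattern is an assumed stand-in for the adverbial-conjunct
    test; it does not parse the sentence.
    """
    alternatives = "|".join(re.escape(cue) for cue in cues)
    pattern = rf"\s*(?:{alternatives})\s*,"
    return re.match(pattern, sentence, flags=re.IGNORECASE) is not None


def partition_arguments(sentences, cues=CUE_TERMS):
    """Start a new segment at each sentence that opens with a discourse marker."""
    segments = []
    for sentence in sentences:
        if not segments or is_discourse_marker(sentence, cues):
            segments.append([sentence])
        else:
            segments[-1].append(sentence)
    return segments


examples = [
    "The first time I went to Europe was in 1982.",
    "First, it has often been noted that length is highly correlated with essay score.",
]
for sentence in examples:
    print(is_discourse_marker(sentence), "-", sentence)
```

The first example sentence is rejected because first follows a determiner rather than opening the sentence; the second is accepted. A parser-based approach would also handle cue terms that appear mid-sentence, which this pattern ignores.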

District-, School- & Classroom-Level Policy Options

  • Provide a rigorous curriculum as the default curriculum for all students.

Tracking should be eliminated, and all students should be placed in a college-preparatory curriculum. Students should be provided with supports to succeed, but all students should be enrolled in a more challenging curriculum. Researchers such as Adelman have demonstrated that the best predictor for completion of a postsecondary degree is the completion of a challenging high school curriculum.

  • Raise the level of expectations for all students. Although not all of our experts agreed that a narrowly defined rigorous curriculum was the best choice for all students, they did agree that all students should be expected to achieve at high standards and that current expectation levels in high school need to be raised dramatically.



  • Encourage teachers to focus not only on their content area but also on the needs and interests of their students. Dr. McPartland and Hugh Burkett, Ph.D., director of the Center for Comprehensive School Reform and Improvement, stressed the need for teachers to change their instruction at the classroom level to enable all students to be successful with a challenging curriculum. Because many teachers rely on lectures as their main instructional method, they need research-based tools and professional development to become highly skilled with a wide range of instructional strategies that will enable all of their students to meet content standards, develop critical thinking skills, and engage in the intellectual dialogue of the discipline.
  • Create teams of teachers and students to increase the level of personalization for students and begin to change the culture of isolation for teachers. Although the Gates Foundation has strongly supported small schools as a way to increase personalization, many of the experts we interviewed noted that small schools are not a magic bullet. In fact, Dr. Stern and Valerie Lee, Ph.D., of the School of Education at the University of Michigan, stressed that the implementation of small schools can actually increase stratification and inequality and might not provide students with access to all of the courses they will need for college admission. These challenges can be met, but they must be anticipated and confronted with programmatic responses. More importantly, within the structure of either small schools or large comprehensive high schools, educators need to develop a range of support strategies for students, provide high-quality professional development for teachers, and create teams of teachers and students who work together to ensure that all students are successful.



  • Explicitly instruct students in reading. Michael Hock, of the Center for Research on Learning at the University of Kansas, stressed that improving adolescent literacy is key to high school reform because students will not be successful with the content if they cannot comprehend it. Hock has identified specific methods to teach reading at the high school level to make up certain kinds of deficiencies within short periods of time. Explicit instruction is necessary to enable students to work effectively with a variety of different types of text. Because many high school teachers have not been trained to teach reading, they need high-quality professional development about effective instructional strategies that will enable their students to improve their reading skills in the content areas.
  • Develop a range of supports for students. Dr. Balfanz painted a stark picture of schools with high dropout rates that serve disproportionate numbers of low-performing students. He argues that districts and schools need to provide a range of appropriate supports to engage these students in school and increase their proficiency levels. Schools need to develop comprehensive programs that address attendance, course failure, and student motivation and effort.

Defining SBR

According to the NCLB Act (2002), the term scientifically based research:

(A) means research that involves the application of rigorous, systematic, and objective procedures to obtain reliable and valid knowledge relevant to education activities and programs and

(B) includes research that –

(i) employs systematic, empirical methods that draw on observation or experiment;

(ii) involves rigorous data analyses that are adequate to test the stated hypotheses and justify the general conclusions drawn;

(iii) relies on measurements or observational methods that provide reliable and valid data across evaluators and observers, across multiple measurements and observations, and across studies by the same or different investigators;

(iv) is evaluated using experimental or quasi-experimental designs in which individuals, entities, programs, or activities are assigned to different conditions and with appropriate controls to evaluate the effects of the condition of interest, with a preference for random-assignment experiments, or other designs to the extent that those designs contain within-condition or across-condition controls;

(v) ensures that experimental studies are presented in sufficient detail and clarity to allow for replication or, at a minimum, offer the opportunity to build systematically on their findings; and

(vi) has been accepted by a peer-reviewed journal or approved by a panel of independent experts through a comparably rigorous, objective, and scientific review.

This definition is intended to encourage researchers to provide better and more useful evidence of what works and to challenge practitioners to make good decisions based on evidence. The difficulty is that few studies of education programs meet this definition in its entirety.

In 2002, AIR introduced two standards against which education research can be judged. The gold standard is research that meets all the requirements of SBR; the silver standard is research that meets the requirements but does not employ random sampling (AIR, 2002). The institute’s work, which was prepared for the U.S. Department of Education, also includes some guidelines that can be used by school staff and others to review education research. These guidelines include:

  • The theoretical base of the program or practice, explaining specific goals followed by implementation activities.
  • The evidence of effects, stating how the practice has demonstrated improved student learning.
  • Implementation and replicability, explaining the degree to which the program has been successfully implemented in diverse settings.

CSR Model

CSR is usually initiated when individual school improvement efforts are not successful and assessment of data indicates that students are not meeting standards. Making Good Choices: A Guide for Schools and Districts offers a process for selecting a CSR model that requires identifying two or three models to find the best match between the model provider and the local school needs. Since 2001 and the reauthorization of the Elementary and Secondary Education Act, however, the standard for being an approved CSR model has been changed from using “innovative strategies and proven methods for student learning, teaching, and school management based on reliable research and effective practices” to a call for comprehensive reform programs that “employ proven strategies and proven methods for student learning, teaching, and school management that are based on scientifically based research and effective practices and have been replicated successfully in schools.”

According to the NCLB Act, a Comprehensive School Reform school must implement a program that:

1) employs proven strategies and proven methods for student learning, teaching, and school management that are based on scientifically based research and effective practices and have been replicated successfully in schools;

2) integrates a comprehensive design for effective school functioning, including instruction, assessment, classroom management, professional development, parental involvement, and school management, that aligns the school’s curriculum, technology, and professional development into a comprehensive school reform plan for school-wide change designed to enable all students to meet challenging State content and student academic achievement standards and addresses needs identified through a school needs assessment;

3) provides high quality and continuous teacher and staff professional development;

4) includes measurable goals for student academic achievement and benchmarks for meeting such goals;

5) is supported by teachers, principals, administrators, school personnel staff, and other professional staff;

6) provides support for teachers, principals, administrators, and other school staff;

7) provides for the meaningful involvement of parents and the local community in planning, implementing, and evaluating school improvement activities consistent with section 1118;

8) uses high quality external technical support and assistance from an entity that has experience and expertise in school wide reform and improvement, which may include an institution of higher education;

9) includes a plan for the annual evaluation of the implementation of school reforms and the student results achieved;

10) identifies other resources, including Federal, State, local, and private resources, that shall be used to coordinate services that will support and sustain the comprehensive school reform effort; and

11) (A) has been found, through scientifically based research, to significantly improve the academic achievement of students participating in such program as compared to students in schools who have not participated in such program; or

(B) has been found to have strong evidence that such program will significantly improve the academic achievement of participating children.

Broadening SBR Within NCLB

According to Slavin (2003), scientific research traditionally has played a relatively minor role in education reform since many innovative practices and programs are untested. When reform efforts fail, educators and policymakers move to implement a different set of innovations that also have untested claims, instead of adopting well-researched programs and practices that have been proven to work. Shifting to a new paradigm will mean changing practice to look more deeply to research-based programs rather than following a new trend.


Unlike most other fields of scientific inquiry, education places extraordinary emphasis on the new and the novel. Believing that the most recent theory — at whatever level of research — is also the most important, education leaders may lose sight of the value of seminal research and proven practices. Both Congress and the U.S. Department of Education are hopeful that with the introduction of new research standards and federal mandates, “evidence-based reform” will become the norm and set an expectation for using rigorous, experimental research to justify programs and practices.


In addition to Comprehensive School Reform and Title I legislation, SBR also is cited in Title II, Preparing, Training and Recruiting High Quality Teachers and Principals; Title III, Language Instruction for Limited-English-Proficient and Immigrant Students; Title IV, 21st Century Schools; Title V, Promoting Informed Parental Choice and Innovative Programs; Title VI, Flexibility and Accountability; Title VII, Indian, Native Hawaiian, and Alaska Native Education; and Title IX, General Provisions.


Each applicant for funding must demonstrate efforts to address SBR by describing how each activity will be based on a review of SBR and will be tied to evidence-based results. This has far-reaching implications for professional development in all content areas, teacher preparation programs in higher education, English language acquisition programs, safe and drug-free programs, parent involvement, mentoring programs, and all programs designed to address state and local student academic achievement standards.


Because of the extent of this legislation, all schools will be affected — not just those that have been in CSR programs or identified as in need of improvement. Using SBR will require establishing a culture of inquiry regarding how decisions are made to improve student learning. Leadership will matter at both the system and school level and must include teachers with high-quality professional learning to improve practice. Infusing SBR into school culture will require enhanced professional learning to increase understanding of the meaning and usefulness inherent in compelling research to drive practice.

Access to Evidence-Based Research

The hope that sound, rigorous educational research will reach practitioners raises the issue of “usability.” Researchers and practitioners alike agree that, “to be effective, education research needs to be both credible and usable”. With the focus on scientific experimentation, there is a concern that federal funding will not be available to support other kinds of research that practitioners also may find useful. Research studies may become “too academic,” not practical enough for replication and not in a language accessible to practitioners. Educators may “view those writings as inaccessible, arcane, and irrelevant to their everyday jobs” (Viadero, 2003).

On the other hand, education research has been criticized for securing scant evidence of affecting practice and improving student achievement, and skepticism regarding the use of federal funds to support research and development efforts has existed for over a decade (Kaestle, 1993).

Realizing both researchers’ challenge to produce SBR and practitioners’ challenge to find and utilize SBR, the U.S. Department of Education, through the Institute of Education Sciences, has established the What Works Clearinghouse (WWC). It will become a resource to assist in educational decision making regarding scientifically based programs and practices. The WWC was created in August 2002 as a national project to provide the information decision makers need to make choices based on high-quality scientific research. This will be done through Web-based databases.

Practically Speaking: How Might Practitioners Put Scientifically Based Research to Work?


After a school learns that it is on the academic early warning list because some subgroups have not met standards on the state assessment for mathematics, a school improvement committee is formed to write a school improvement plan (SIP) that meets state criteria. The committee is especially concerned with creating an “action plan” that designates activities “supported by scientifically based research with a theoretical base,” as mandated in the state criteria. The state rubric specifically asks how the activities cited in the plan are supported by SBR, and what types of measures will be used to determine if the activities meet the needs of the low-achieving students.