Chapter One, Data Be Damned

by Siegfried Engelmann
December, 2004

Beginning in the 1960s, policy makers identified the failure of schools to teach at-risk students and introduced various reforms, each designed to solve the problem, but each a fairly thorough failure. The failure of the educational system was not that research did not identify what works in teaching at-risk students, but that the facts of what works were not disseminated because they were not compatible with the educational policy makers' beliefs and prejudices. Because policy makers refused to face facts and instead used their power to distort them, millions of children who could have succeeded in school became school failures and failures as citizens. The tactics that policy makers used are illustrated most emphatically by the history of compensatory education for disadvantaged children.

The Problem

Before any problem can be solved, there must be a clear identification of the problem. The problem with at-risk African Americans was outlined in great detail by the Coleman Report of 1966. Based on achievement data on 600,000 students, the report concluded that a great performance disparity existed between the at-risk black students and whites.

Part of the report compared schools of equal physical characteristics serving only African Americans with those serving whites. The finding was that money spent on smaller classes, laboratories, counseling, higher teacher salaries, and higher teacher qualifications had no effect on academic achievement. The irony of the finding is that proposals for higher teacher salaries, higher qualifications, and the like persist today.

Also, if the physical characteristics of the schools and all the other factors made no difference in performance, only two possibilities remain. One is that black students are natively inferior to whites. The other is that the instruction blacks received was completely inadequate. Because there is no evidence of native inferiority, the second possibility should stand as a clear indictment of the educational system and a clear premise for resolutions that instruction had to improve. The most obvious and direct course of action would therefore be to search for instructional approaches and teaching techniques that had substantial evidence of increasing the achievement of blacks. If none are identified, a secondary tactic would be to commission various instructional designers and exponents of different educational strategies to create approaches and document their effectiveness. The test of effectiveness would be straightforward: an increase in achievement rates of blacks. If there is a lack of agreement about which specific instructional approaches produce the best results, the most direct response would be to set up a controlled experiment that fairly compared the approaches. No circuity is implied by this description of the problem.

The facts of school failure imply that if there were effective practices, those responsible for the design of the instructional practices, the training of the teachers, and the management of the school lacked knowledge of them. Furthermore, all instructional programs, teacher trainers, and school management practices used in failed schools would be suspect until being vindicated by actual data. The ultimate implication is that those responsible for the design or implementation of these unsuccessful practices would not be considered experts. The outcomes they achieved (the performance level of the students) were inferior and unacceptable.

Devious Solutions to the Problem

Educational policy makers did not pursue the direct implications of the problem. Instead, they searched for circuitous ways to solve it, approaches that would not address the technical details of instruction and management. In effect, they searched for non-educational, magical solutions to their educational problems.

The trick that they used was to redefine the problem. Instead of facing it as an instructional problem, they framed it as a social problem, rooted in history and caused by discrimination against African Americans. This solution did not require technical knowledge about teaching students or managing schools, and it completely avoided the problem of designing instruction that would accelerate the rate at which students master academic content.

By framing the students' performance as a social issue rooted in historical inequities and segregation, policy makers ensured that the solution would now involve some form of desegregation. Actually it would involve changing history, which is impossible. The next best thing would be to make amends.

The problem with this logic is that the issues of history beg the question about what should be done now. At the time the redefinition of the problem occurred, there were millions of black children in kindergarten, first, and second grade. How could any social agency change them without addressing their deficit in skill and knowledge and accelerating their learning of academic content? Changing society would certainly be effective over a long period of time. Income patterns would change, followed by changes in child-rearing practices. Within a few generations, the effects of the inequities would be erased. However, a third-grade student who reads on the level of a beginning reader needs a remedy that recognizes his problem and addresses it systematically while he is still in the third grade, not years after he is a grandfather.

Once the problem of poor performance was categorized as a social inequity, policy makers faced the practical task of how to achieve "equity." The vehicle for reform came from Thomas Pettigrew, who re-analyzed the Coleman data and discovered that black students attending mostly white schools had achievement levels much higher than those in segregated schools. Also, in these schools, the white students' performance was no worse than that of whites in segregated schools.

Policy makers used these correlations to solve the problem of poor performance without ever facing it. They confused correlation with causation and reasoned that if the black children in integrated schools performed higher, putting black children into white schools would cause them to perform as well as the blacks in integrated schools. The romantic transformation would occur because black children would bond with their white classmates, which would result in a kind of cultural exchange.

Discrimination Through Bussing

Ironically, the solution was based on blatant prejudice. Policy makers apparently thought all blacks were the same, whether they lived in integrated neighborhoods or in the inner city. The analysis did not take into account the difference in the homes, the language models, and the other differences in child-rearing practices between integrated blacks and urban blacks. The relevant difference was that when integrated blacks entered school they were far advanced over inner-city blacks in skill, knowledge, and language proficiency.

To integrate blacks and whites, decision makers turned to a completely non-educational vehicle—the bus. Inner-city children would be bussed to predominantly white schools. The plan was costly, both in dollars and time. Some students would spend more than two hours a day on a bus. Even if a possibility existed that the integrated school would cause higher performance in blacks, rational decision makers would have tested the school-integration formula before committing thousands of black children to what proved to be certain failure. After all, delaying wholesale Bussing three years is less damaging than gambling everything on a plan that has never been demonstrated to work.

A small-scale experiment with careful observations of the students would have revealed the inhumanity of this plan. The greatest irony of Bussing is that it was conducted in the name of "equal opportunity." The contorted logic used to arrive at the conclusion that opportunities for all students were equal assumed that most of the grades, like second and third, were really not necessary and that white children who were beginning second grade could skip both second and third grade and function in a fourth-grade classroom with no problem. It apparently didn't occur to the decision makers that many of the black fourth graders performed at the beginning-second-grade level in reading, math, and language. To place them in the fourth grade of a well-performing white school would be to assume that they were able to skip second and third grades, and that the power of the integrated school would somehow provide the skills these students missed.

The fact that the policy makers and bureaucrats recognize that these grades are necessary for white children shows just how unequal the learning opportunities were for black students. Children who are placed far beyond their skill level will fail because the work is too difficult for them. Furthermore, they will be turned off to instruction. The placement based on age, not ability, was both thoughtless and highly discriminatory. The blacks did not receive instruction that was proven to work well with lower performers.

The most tragic aspect of the decision to bus was that it would have taken decision makers only a couple of hours to secure dramatic data on just how outrageous the plan was. All they had to do was to go into a classroom of an at-risk black school, hand some of the fourth-graders fourth-grade reading material and tell them, "Read this out loud." I've seen such demonstrations provided for community workers. And I've seen tough male adults get tears in their eyes after observing the painful performance of the students and repeat, "I had no idea. I had no idea."

In 1970, a principal of a white school to which blacks were bussed succinctly told me the reason for "white flight" from integrated schools. "We're supposed to have standards here. We don't do social promotions. So if I place black kids where they belong, more than 75% of them would be in special ed. If I put them in special ed, I'm a racist. If I leave them in the regular classrooms and flunk them, I'm either a racist or an ogre who doesn't understand affirmative action. So what do I do, close my eyes, sell out our standards and socially promote them, or go to another school?"

Direct Instruction

The equal-opportunity plan had addressed instructional issues principally through Head Start. Title 1 provided additional funds for schools to accommodate failed students, but there was no systematic compensatory education. Head Start was violently opposed to systematic instruction. It was designed to provide nutrition, a happy atmosphere, and "stimulation" for low-income preschoolers. Head Start was modeled after the traditional nursery school. There was no hard data that this format resulted in gains for inner-city blacks or even for affluent whites. Again, the policy makers did not search for programs that had data of effectiveness and fashion Head Start after these. Instead the Office of Economic Opportunity installed a playschool, even though there were other programs that had data of effectiveness.

The most effective preschool was at the University of Illinois, which taught children through a method called Direct Instruction. The approach had significantly raised the IQ of black preschoolers and showed that disadvantaged four- and five-year-olds could learn beginning reading, mathematics, and a host of language skills.

This approach was not welcomed by early childhood educators or sociolinguists, who labeled it everything from an inhumane pressure cooker to a thoughtless approach that did not recognize "Black English" and tried to change healthy language patterns. Years later, in Meaningful Differences in the Everyday Experience of Young American Children, Hart and Risley vindicated the Direct Instruction practice of teaching language concepts directly. The investigators documented the differences in the exposure to concepts and language between affluent homes and those of at-risk children. The differences are enormous, amounting to hundreds of thousands of exposures per year for various language concepts. Even in 1966, however, anybody who seriously worked with at-risk children knew that they had serious language concept deficits and that any instructional effort would have to start here. If the policy makers had done something as simple as present three tasks to inner-city preschoolers, they would have quickly discovered how far behind some four-year-olds were.

The three tasks:

"Take this ball and put it on that table."
"Now take this ball and put it under the table."
"Now take this ball and hold it over the table."

For a lot of lower-performing children, the ball would end up in the same place: on the table. Investigations of other language concepts would have disclosed the same order of deficiency.

Investigations later confirmed that both Bussing and Head Start were disasters. A 1972 study, "The Evidence on Busing," showed that black students in Boston who were bussed to white schools did not improve in performance over black students who were not bussed. Head Start earned the same score as Bussing. In 1968, an extensive evaluation of Head Start, the Ohio-Westinghouse Study, concluded that children going to Head Start showed no long-term cognitive gains over children who did not go to Head Start.

Project Follow Through

Not all the policy making in the 60s was naïve. The Office of Education and the Office of Economic Opportunity performed a landmark study that showed that one approach significantly outperformed others in grades K through 3. The study, Project Follow Through, began in 1968 as the largest educational experiment ever conducted, although it is all but unknown among educators. It involved over 500,000 students in more than 180 communities. Originally conceived as part of President Lyndon Johnson's War on Poverty, Follow Through was intended to maintain the gains that were achieved in Head Start (although there were no real gains). It was designed as a "horserace" among various models of instruction and was billed as the definitive experiment of what works best in teaching disadvantaged children in the primary grades. Eighteen different sponsors of educational approaches were selected by the Office of Education. Local community parent groups each selected one of these models to be implemented in their neighborhood schools.

The evaluation of Follow Through occurred in 1976, after the various sponsors had enough time to implement their models in participating sites. The project was evaluated by two independent agencies, Stanford Research Institute and Abt Associates.

One model, Direct Instruction, was the overwhelming winner. It not only taught a larger number of students than any other model (over 100,000), it also served the largest number of communities (20). Of the 18 models, DI achieved the highest scores in all academic areas and resulted in children who had the most positive self-images. The DI model's third graders achieved first place in reading, math, spelling, and language. DI placed first in urban communities and first in rural areas, first with blacks, non-English speaking children, and Native Americans. DI was also first with non-poverty students who were included in Follow Through.

Data Suppressed

With such data support, the outcome of the "horserace" would seem to be unquestioned. But not in education, as Bussing and Head Start illustrate. Policy makers applied the same data-be-damned approach to the Follow Through results that they did to Head Start and Bussing. They ignore or reject data that is not consistent with their prejudices.

In 1976 the final report on Follow Through was released. It contained no information about individual models. Instead, it concluded that Follow Through failed, and therefore that compensatory education failed. No winners were recognized and no losers were identified. The reason was that all the approaches that had been strongly endorsed by districts, foundations, and the educational press had failed.

The suppression of the Follow Through data on different models was spearheaded by the Ford Foundation. Follow Through models based on the Ford Foundation's philosophy performed below the level of the children who received no Follow Through. The foundation hired Ernest House and Gene Glass to critique the embarrassing results, which were scheduled to be disseminated by the National Institute of Education (NIE). The NIE report showed performance by model, leaving little doubt about the poor performance of sponsors who focused on discovery learning, child-centered practices, and programs that followed Piaget's logic of lavishly using manipulatives to progress from the concrete to the abstract.

More Poor Reasoning

The Glass-House critique was published in 1978 in the Harvard Educational Review and was widely read. The main argument for discrediting the Office of Education evaluation was what amounted to a simple philosophical assertion that sponsors should not be compared. So in effect, the ultimate implication was that Follow Through should not have occurred because its goal was to provide comparative data about what works best.

In addition to the Glass-House critique of Follow Through, Gene Glass wrote an appeal to NIE, indicating why the results of Follow Through should not be disseminated. He actually argued against evaluations that presented empirical evidence. He urged NIE to replace such studies with "those emphasizing an ethnographic, principally descriptive case-study approach to enable informed choice by those involved in the program."

NIE accepted the arguments for not presenting data by model and released no comparative data, simply a statement about the aggregate performance of the models. The aggregate performance was terrible; Follow Through failed. On the whole, Follow Through students did not perform better than (or in some cases, as well as) those who did not participate in Follow Through but went through traditional Title 1 programs.

In the end, all the Follow Through models were "validated," so the status quo was maintained. If policy makers wanted to believe that inducing positive self-images would make children feel more capable of learning, nothing could contradict this fantasy. In the same way, if policy makers wanted to continue to believe in instruction based on student choice, extensive parent involvement, discovery learning, and reading through sight-word methods, they could do so with a clear conscience. The data barrier had been removed.

Today's Myths

Thirty years later, models that were egregious failures in Follow Through are popular, particularly High/Scope, an early-childhood program that had achievement levels in reading, math, and language significantly lower than those of comparable children who did not participate in Follow Through. The average third grader in this program read at the first grade level.

The suppression of Follow Through data underscores the extent to which educators redefine problems and eschew data. Head Start, Title 1, and Follow Through were prompted by facts about performance in the form of empirical data. If empirical evidence is used to identify the problem, empirical evidence is needed to show the extent to which the problem has been solved. Also, if empirical data is used to determine that the aggregate of the Follow Through models failed, why wouldn't it be used to identify the performance of the individual sponsors that contributed to the overall failure?

What happened in Follow Through provides evidence that when facts are pitted against educational prejudices about what should work, prejudices prevail. Political power proved to be a lot more powerful than black power. In 2002, an informal survey of school and district administrators disclosed that fewer than half of them had ever heard of the Direct Instruction model, and fewer than one tenth of them had ever heard of Project Follow Through.

College classes on public policy, such as that conducted by Gary Klass at Illinois State University, address American Education Policy and the history of attempts to identify what works with at-risk students. But history has been altered by educational policy makers. All the high-profile studies are there, but not Follow Through or Direct Instruction. For instance, Klass's unit provides a good synopsis of the Coleman Report, the failure of Bussing and Head Start, and the other major developments that led to policy change. However, Follow Through is completely missing from the outline. The only thing it has to say about compensatory education (which is what Follow Through addressed) is:

"Compensatory education program show no effect"
(lilt.ilstu.edu/gmklass/pos232 [class notes, American Education: "What Works"]).

Interestingly, the outline contains some information about what Klass believes works. It lists questionable correlates like wearing uniforms. It also identifies one study that followed up 66 students who went through a preschool program. The suggested benefit of this program is that it resulted in less crime and fewer special-education assignments. The preschool program was High/Scope, a complete failure in Follow Through. The follow-up study Klass cites was not rigorous in design and the claims have been extensively contradicted by more sophisticated studies. Even more outrageous is that the Follow Through study had over 500,000 students, which is roughly 7,576 times the number in the High/Scope study.

So political forces in education are able to change history and shape reality not only of the unsophisticated public, but of historians as well. These political forces distort both sides of the truth by discrediting success and making failure look like success. Rather than using data that addresses the problem, the political forces in education prefer what they call ethnographic, case-history data, which amounts to little more than anecdotal accounts, with no clear rules about how to use them. Do we judge a program to be better than another if it has more anecdotes or if it has "more convincing" anecdotes? Or do we simply count them? It certainly would not be very fair for a model that serves 100,000 children to be judged better than a model serving 5,000 children because the larger model has more "good anecdotes." And wouldn't it be reasonable to secure anecdotes from all the students? (How much do you like math? How good are you at math? Etc.)

Current Trends

The disregard for data illustrated by the history of compensatory education is the rule, not the exception, in education. The National Council of Teachers of Mathematics (NCTM) may hold the title of promoting the most paradoxical anti-scientific and anti-intellectual practices. The NCTM formally rejected empirical studies that showed the effects of different teaching methods. Its basis for this rejection: "The results were disappointing." Imagine an organization that is supposed to understand math, which includes statistics, asserting in effect that mathematical truth is falsity.

The NCTM has supported a long list of failed practices. At the top of the list is "discovery learning." It doesn't seem to matter how many times or how thoroughly empirical evidence shows that this practice is ineffective.

The NCTM is not the only professional organization of educators that promotes unscientific notions. Its cousin, the National Council of Teachers of English (NCTE), has staunchly supported failed practices such as the Whole Language approach to teaching reading, an unsystematic approach that liberally uses "literature" to teach reading in the beginning grades. These practices and others the NCTE has promoted have been shown to be ineffective, particularly with at-risk populations. Several Follow Through models used a version of Whole Language. They completely failed.

The International Reading Association (IRA) has an almost unblemished record of promoting approaches that have no evidential base. In general, the IRA supports the mottoes of progressive education, such as John Dewey's notion of "Learn by doing and do by doing," which disdain systematic preparation in subjects like reading and instead simply introduce reading, with the idea that children would learn to read if they read.

The IRA endorses Whole Language. It argues that language is learned naturally through interactions that are not highly structured. Reading is language. Therefore, reading should be learned through the same kind of casual interactions that succeed with language learning. It doesn't work.

Science Organizations Opposed to Science

Some of the more ironic rejections of data and science come from organizations that serve science teachers. An example is the National Science Teachers Association, which has a membership of 55,000 science professionals including science and math teachers. Although the association recognizes science in other fields, it does not apply scientific principles or logic to the teaching of science. This unusual prejudice was revealed in 2004 when California delivered a serious blow to one of the organization's sacred cows: teaching science through heavy doses of hands-on experiments. One of the criteria California proposed for evaluating K-6 science instructional materials limited "hands-on" activities to those "comprising no more than 20 to 25 percent of science instructional time."

This limitation is reasonable because there is no empirical evidence that a heavy diet of hands-on activities is a worthwhile use of time, particularly for at-risk populations. The use of hands-on activities is based on problematic theories about how children learn. According to the theories, work with manipulative or hands-on material is supposed to be the basis for children internalizing the content and formulating concepts. The primary exponent of this philosophy was Jean Piaget. In Follow Through, four models applied Piaget's principles in their design. All of these models failed.

A curious response to the proposed California criteria came in the form of a letter to each state board member, with carbons to everyone from the Secretary of Education to Governor Arnold Schwarzenegger, pleading the case for hands-on material. The response was curious because it was signed not only by the executive director of the National Science Teachers Association, but also by the president of the National Academy of Sciences, Dr. Bruce Alberts. Understand that the Academy is among the most prestigious organizations in the world, composed exclusively of the most distinguished scientists in each field. A scientist doesn't simply "join" the academy. Membership is through invitation only.

So this response from such a prestigious group should carry the emphatic sanction of the science community and should be based on carefully reasoned research evidence and sound logic. In fact, the response was not only naïve, but was largely based on anti-scientific reasoning.

In response to the criterion that limited use of hands-on approaches, the petitioners stated, "teachers need to be able to make the decision about which instructional strategy will best teach a particular concept if a teacher needs to present the concept through instruction that is 50 or 75 percent hands-on, the teacher must have the flexibility, and the resources to do so." The idea that teachers know which instructional strategy will best teach a particular concept is richly contradicted by research. It is also scientifically illogical. If teachers make independent decisions about what's best, the effectiveness of the outcomes will tend to follow a normal distribution curve. On such a curve only a small but arbitrary percentage (possibly 5%) will identify or create what is best (based on performance of children). The rest will range from second best to Nth best. So the slogan, and the assumption that the "average teacher" will identify what's best, are fanciful. Consider the unaddressed question: If a teacher fails to teach content that has been demonstrated to be teachable, how can the teacher "know best"?

The petitioners used the "teacher knows best" argument to discredit another California criterion. They wrote, "Teachers are the experts in how and when to teach particular materials, not textbook publishers. The criterion denigrates the ability of teachers to exercise their judgment about how best to meet the needs of their students." The rhetoric may be appealing but is quite antiscientific. Of the Follow Through models, the Direct Instruction model provided the greatest amount of control over what the teachers did and exactly how they did it. This model produced the best results. So the same teacher who would fail with "teacher-choice practices" would be able to succeed with the practices and sequences specified by Direct Instruction programs, even though the decisions about what to do or how to do it were dictated by the program, not the teacher's intuition.

The petitioners' final attack challenges whether Direct Instruction is superior to other approaches. "There is no research to suggest that Direct Instruction is superior to any other instructional strategies." According to this assessment, not only had Follow Through been erased from the record; so had more than 50 other studies involving Direct Instruction in a variety of content areas. For example, a school in Baltimore was the lowest and most notorious school in the district when it initiated the Direct Instruction model in 1997. The school was City Springs Elementary, and its ranking was 117th of 117 elementary schools. In 2003, City Springs ranked overall first in the district. A transformation of this magnitude has never been recorded by any other approach.

More relevant to science instruction was a comparison of middle school students who were taught the basic principles of chemistry and energy. One group was composed of failed students in a special class. The other group consisted of advanced-placement students. The students in a special class went through a Direct Instruction program. The advanced-placement students went through a more traditional, experimental approach. On a posttest that presented items and problems involving basic principles of chemistry and energy, the failed students performed as well as the AP students.

Although Alberts's motives in protesting the limit on hands-on instruction are not known, a possibility is that the National Science Teachers Association promotes a program that Alberts originated, City Science. The program is expressly designed for teaching science to "urban students" in elementary school and beyond, and the program is clearly based on the philosophy of hands-on manipulations. It is apparently well intentioned but presents no empirical data to suggest that this approach is successful in improving student performance, or that it would perform as well as a Direct Instruction approach.

Conventional Wisdom Prevails

The fact that non-scientific reasoning and rejection of data extend to science organizations implies that education is fundamentally different from other enterprises: it is one in which decisions are not circumscribed by data or shaped by logic. The fact that policy makers hold degrees and show all the signs of being well educated and well intentioned provides absolutely no guarantee that what they recommend will be based on evidence of effectiveness or any knowledge of what works. Certainly not every decision maker in education is a bozo. The problem is that there are few clues from what they argue or how they argue that suggest whether what they embrace is ethereal or solid.

According to Grover Whitehurst, Director of the Institute of Education Sciences, Assistant Secretary of Education, only about 10% of current educational decisions are based on evidence of effectiveness. That means that the odds are about 9 to 1 that the opinions of randomly selected educational decision makers, whether they are in charitable foundations, schools, or state governments, are based on folk psychology and "traditional wisdom." Opinions of these people are not the stuff from which productive educational reform will emerge. If we are committed to serious educational reform, particularly for at-risk populations, the first step we must take is to recognize that educational decision makers lack the skill, knowledge, and respect for data that the task demands. They are probably uninformed about what actually occurs in classrooms of urban schools, and they probably hold strong antiscientific beliefs that lead to poor judgments about what is "best for the children." They are not the sole cause of the problems, but they certainly are not the hope for the solution, so long as they remain uninformed.

In summary, the problem is this:

The educational system fails because it has a disregard for data. This disregard is nearly universal, even among those who cite data. The field's nonscientific stance prevents it from shaping educational practices by using the techniques that characterize scientific or systematic endeavors.

The solution is implied by the problem:

Install people who respect and understand data.

In other words, put the kids first and use data on their performance as the ultimate yardstick of what actually works.

References

Alberts, B. & Wheeler, G. (2004, March 4). Letter to California State Board of Education members. Retrieved January 14, 2005 from http://science.nsta.org/nstaexpress/lettertocaliffromgerry.htm

Armor, D. J. (1972). The Evidence on Busing. Public Interest, 28, 90-128.

Coleman, J. (1966). Equality of educational opportunity. Washington, DC: United States Government Printing Office.

Dewey, J. (1916). Democracy and education. An introduction to the philosophy of education (1966 edn.), New York: Free Press.

Ed.gov Education Innovator (2003). City Springs Elementary School: Fulfilling the No Child Left Behind promise. Author. Aug. 18, 2003, Number 25.

Hart, B., & Risley, T. R. (1995). Meaningful differences in the everyday experience of young American children. Baltimore, MD: Brookes Publishing.

House, E. R., Glass, G. V., McLean, L. F., & Walker, D. F. (1978a). No Simple Answer: Critique of the "Follow Through" evaluation. Harvard Educational Review, 48 (2), 128-160.

Klass, G. M., Political Science 232 Course information. Retrieved December, 2004 from http://lilt.ilstu.edu/gmklass/pos232

Pettigrew, T. F. (1975). Racial discrimination in the United States. New York: Harper & Row.

Research Advisory Committee of the National Council of Teachers of Mathematics (1995). Research and practice. Journal for Research in Mathematics Education, 26 (4), 300-303.

Stallings, J. (1975). Implementation and child effects of teaching practices in Follow Through classrooms. Monographs of the Society for Research in Child Development, 40 (7-8, Serial No. 163).

Stebbins, L. B., St. Pierre, R. G., Proper, E. C., Anderson, R. B., & Cerva, T. R. (1977). Education as experimentation: A planned variation model (Vol. IV-A: An evaluation of Follow Through). Cambridge, MA: Abt Associates.

Westinghouse Learning Corporation (1969). The impact of Head Start: An evaluation of the effects of Head Start on children's cognitive and affective development. Athens, OH: Ohio University.

Whitehurst, G. J. (2002). Address given to the Council of Scientific Society Presidents December 9, 2002. Washington, DC.