By Bruce Kimmel
Norwalk school officials are in the midst of a detailed analysis of this past year’s CMT and CAPT tests. So far, based on preliminary discussions, it seems the district has been treading water: improvement in some areas, regression in others. It does appear, however, that there has been some progress in closing the achievement gap.
Now that we have a new Superintendent, I believe it would be worthwhile to look at these test results differently than in the past. In recent years, most discussions of CMT scores have focused on the percentages of students in different categories and compared those percentages to past years' results. For instance, if the percentage of fourth graders at the proficiency or goal levels increased, the district would conclude that there had been improvement.
But this type of conclusion is superficial and can mask trends and accomplishments among students. While it is a necessary first step, it barely scratches the surface of what’s going on in our schools. In fact, relying on this type of comparative analysis — which is, in part, a consequence of federal law — can create a variety of problems. New York State provides an excellent example of what can happen if test scores are viewed exclusively through the framework imposed on the nation by the federal No Child Left Behind legislation.
According to federal law, districts are judged by the percentages of students moving from one category to another, such as basic to proficient, or proficient to goal, and by very little else. The key for states, districts and individual schools, if they are to remain in good standing with the federal government, is to have a certain percentage of students at or above proficiency. Measuring how much students actually improve over the course of a year (an approach generally called a growth model) has not yet replaced the federal framework, even though New York City has been piloting such a model for a few years, with excellent results.
But back to New York State: Several years ago, state education officials noticed there were extremely large clusters of students just below the proficiency level on the reading and math exams. So they adjusted the cut score downward; that is, they made it easier to reach the proficiency level. Scores, of course, improved drastically across the state, and many schools were able to avoid federal sanctions, even though students were not learning more.
But the problem didn’t stop there. New York, like Connecticut and other states, does not change the types of questions on its standardized tests from year to year. This is done to facilitate valid comparisons over time. However, as districts, schools and individual teachers inevitably become familiar with certain types of questions, they are better able to prepare their students and scores go up. The tests administered last year in New York and Connecticut have not changed much in the last five years.
As information leaked out regarding New York’s testing procedures, outside experts began to examine what was going on. One examination concluded that because of the decision to lower the cuts (which, by the way, also happened in other states), New York’s proficiency level was actually lower than the basic level on the federal government’s main test, the National Assessment of Educational Progress. And recently, a Harvard study commissioned by New York found that the recent improvements were “illusory.” As a result, New York officials have announced they will recalibrate how they score their tests.
(Two or three years ago, thousands of New York students answered the same number of questions correctly, or a few fewer, compared to the previous year’s test, yet still managed to move from basic to proficient. To put all this in perspective: Twenty years ago, New York elementary students who scored below the 20th percentile nationally in math were usually held over. In 2009, according to Harvard researchers, students who scored in roughly that same percentile were counted as proficient.)
Back to Norwalk: I do not believe state officials adjusted the cuts to artificially raise scores. Nonetheless, relying on the percentages of students in different categories sheds little light on student achievement. What the district needs to do is perform detailed analyses of the raw scores (sometimes referred to as scale scores) of our students.
By focusing on this type of data, which I assume is available, the BOE can examine the progress of individual students, cohorts or subgroups as they move through the system. It also provides information about the effectiveness of schools and teachers. Put in more traditional terms, the current scoring system does not fairly distinguish between the student who goes from 69 to 70 and the student who goes from 40 to 69: the former is applauded for reaching proficiency, while the latter is, statistically speaking, deemed a failure.
The BOE spends months on the operating and capital budgets. In contrast, the academic achievement of our students, as measured by standardized tests, is discussed once or twice during a school year. It would probably prove productive, not only for Board members but also for the general public, to examine these test scores with the same degree of diligence that is reserved for the budget.