Is Using Test Results a Good Way to Compare Schools?
In this age of accountability, test results, and cool comparison tools, it is all too easy to reduce schools to test scores. But is this really a good way to compare schools?
After all, that test score is only the result of one test on one day.
We intuitively know that schools are made up of more than test scores, but we don’t yet have measures available to complete the picture of achievement.
And if we know that, then why do we fall prey to the easy comparisons? Precisely because they’re easy.
Test results certainly have their place. Test results should start conversations. But before we enter the new round of accountability measures, we need to ask ourselves how best to use test results and whether they are a good way to compare schools.
Rankings and Ratings and Reports
Rankings and ratings typically use test results as a sorting tool. I published this a few months ago, covering the whole range of comparison tools available, complete with the warning that all of those rankings and ratings capture only pieces of what happens in a school.
Where test results are concerned, the two national biggies are SchoolDigger and GreatSchools.
SchoolDigger.com uses only test scores for its ranking system. And it literally ranks schools one against the other, ending up with a list numbered 1 to whatever.
GreatSchools.org uses only test scores for its 1-to-10 rating system for Alabama (because that’s all we have available), so while it isn’t ranking and comparing schools one against the other, it is making a statement as to whether a school is below average (1 to 3), average (4 to 7), or above average. Additionally, GreatSchools allows community members to rate their schools. Anonymously.
The problem with all of these, of course, is that rankings and ratings are just too easy to access, and if you don’t understand what is being rated and ranked, you can easily misapply the information.
The Public Affairs Research Council of Alabama (PARCA) created reports using an excellent color-coded system that lets you quickly determine which schools’ test results were above or below both system and state averages, by subgroup. The subgroups used were White, Black, Poverty, and Non-Poverty students. The purpose was to give communities a way to start conversations about improving student performance.
PARCA didn’t rank or rate schools. PARCA reported results using a benchmark they reasoned to be equivalent to the National Assessment of Educational Progress (NAEP) definition of proficient.
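To make that comparison concrete, here’s a minimal Python sketch of the kind of above/below check the color coding performs. The three-color scheme, function name, and numbers are my own assumptions for illustration, not PARCA’s actual methodology.

```python
# Hypothetical sketch of a PARCA-style color code: compare one
# school/subgroup result against the system and state averages.
# The green/yellow/red scheme here is an assumption for illustration.

def color_code(school_pct: float, system_pct: float, state_pct: float) -> str:
    """Return a rough color for one school/subgroup proficiency rate."""
    above_system = school_pct > system_pct
    above_state = school_pct > state_pct
    if above_system and above_state:
        return "green"   # above both averages
    if above_system or above_state:
        return "yellow"  # above one average, below the other
    return "red"         # below both averages

# Example: a school where 72% of a subgroup met the benchmark,
# versus a 68% system average and a 75% state average.
print(color_code(72.0, 68.0, 75.0))  # -> "yellow"
```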
What Test Is Being Used Matters, Too
If the test isn’t measuring what you want to know about student achievement, then it’s all a waste of time.
The Alabama Reading and Mathematics Test (ARMT) was the standardized test previously used to determine whether a school made Adequate Yearly Progress (AYP) each year. The combined percentage of students scoring at Level III or IV determined whether a school made AYP.
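As a rough illustration of that core arithmetic, here’s a minimal Python sketch. The 70% target and the student counts are hypothetical; real AYP determinations also involved annual targets, participation rates, and subgroup rules this sketch ignores.

```python
# Hypothetical sketch of the core AYP arithmetic: the combined
# percentage of students at Level III or IV, compared to a target.

def made_ayp(level_counts: dict, target_pct: float) -> bool:
    """level_counts maps ARMT levels ('I'..'IV') to student counts."""
    total = sum(level_counts.values())
    proficient = level_counts.get("III", 0) + level_counts.get("IV", 0)
    return 100.0 * proficient / total >= target_pct

# Example: 5 students at Level I, 20 at II, 40 at III, 35 at IV,
# against a hypothetical 70% proficiency target.
print(made_ayp({"I": 5, "II": 20, "III": 40, "IV": 35}, 70.0))  # True: 75%
```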
The ACT Aspire replaced the ARMT beginning with the 2013-2014 school year, and those results are due out sometime this fall.
We’ve been warned that test scores may drop due to the change in tests and the change in rigor of the tests.
Will we be able to resist the urge to point fingers and flail around if numbers do, in fact, come in low?
How Test Scores Are Being Used – And Why We Need to Ask Ourselves If We Should Use Results This Way
Here in Hoover, the community has been shocked to learn that the superintendent has a plan to shift batches of students from one elementary school zone into another elementary school zone. And we’re not just talking one elementary school…we’re talking lots of shifting and reshuffling.
One of the initial concerns raised was the perception that some elementary schools in Hoover are more successful than others, achievement-wise. Test scores were mentioned. SchoolDigger and GreatSchools were mentioned. In the spirit of full disclosure, more than a decade ago, low test scores at my children’s neighborhood elementary school brought me to the table.
Knowing Hoover’s test data the way I do, I began wondering how much of a difference there really is between elementary schools in Hoover. Where are these perceptions coming from? What would the numbers actually show?
So I created the following data visualizations of ARMT results for Hoover’s elementary and middle schools for testing years 2008 through 2013. Each year’s results are from the spring testing period; for example, “Testing Year 2012” shows the results for students tested in the spring of 2012, which corresponds to the 2011-2012 school year.
Look through the various visualizations. There are maps with test results. There are stacked bars showing what percentage of students scored at each level of the ARMT, broken down by subgroup, subject, grade, and school. The scatter plot lets you see the range of results, plotted alongside the percentage of students enrolled at each school who were eligible for free or reduced-price meals that year.
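The originals are Tableau visualizations; for a rough sense of how charts like these are assembled, here’s a minimal matplotlib sketch using entirely made-up school names and percentages.

```python
# Minimal matplotlib sketch of two of the chart types described above,
# using entirely hypothetical data. The real visualizations were built
# in Tableau from actual ARMT results.
import matplotlib.pyplot as plt

schools = ["School A", "School B", "School C"]
# Hypothetical percentage of students at each ARMT level, per school.
levels = {
    "Level I": [5, 10, 3],
    "Level II": [15, 25, 12],
    "Level III": [45, 40, 40],
    "Level IV": [35, 25, 45],
}
# Hypothetical percent eligible for free/reduced-price meals,
# and percent scoring at Level III or IV, for the scatter plot.
meals_pct = [20, 55, 10]
proficient_pct = [80, 65, 85]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Stacked bars: what share of students scored at each ARMT level.
bottoms = [0, 0, 0]
for level, pcts in levels.items():
    ax1.bar(schools, pcts, bottom=bottoms, label=level)
    bottoms = [b + p for b, p in zip(bottoms, pcts)]
ax1.set_ylabel("Percent of students")
ax1.set_title("ARMT results by level")
ax1.legend()

# Scatter: proficiency against free/reduced-price meal eligibility.
ax2.scatter(meals_pct, proficient_pct)
for name, x, y in zip(schools, meals_pct, proficient_pct):
    ax2.annotate(name, (x, y))
ax2.set_xlabel("Percent eligible for free/reduced-price meals")
ax2.set_ylabel("Percent at Level III or IV")
ax2.set_title("Proficiency vs. meal eligibility")

plt.tight_layout()
plt.show()
```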
What do these numbers tell you? What questions are you asking yourself as you page through these visualizations? What do the numbers not tell you?
So, after looking at the numbers from various visual perspectives, what do you think? Is using test results really a good way to compare schools? Tell me what you think.