Recently the Dallas ISD Administration presented a table of STAAR accomplishments to the Board of Trustees, and several Trustees took to social media to promote it. That same week I sat at my daughter's elementary school while the Executive Director presented our STAAR accomplishments.
Although the state reports the data cumulatively, so that Approaches includes Meets and Masters, and Meets includes Masters, I prefer the categories separated and present them that way in the second chart.
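To make the separation concrete, here is a minimal sketch of the arithmetic. The percentages in the example are made up for illustration and are not DISD results, and the "Did Not Meet" label for the remainder is my assumption about how to name whatever falls below Approaches.

```python
def separate_levels(approaches_cum, meets_cum, masters_cum):
    """Turn cumulative STAAR percentages into non-overlapping bands.

    Inputs are cumulative: Approaches includes Meets and Masters,
    and Meets includes Masters.
    """
    if not (100 >= approaches_cum >= meets_cum >= masters_cum >= 0):
        raise ValueError("cumulative percentages must be nested and within 0-100")
    return {
        "Did Not Meet": 100 - approaches_cum,          # assumed label for the remainder
        "Approaches only": approaches_cum - meets_cum,  # reached Approaches but not Meets
        "Meets only": meets_cum - masters_cum,          # reached Meets but not Masters
        "Masters": masters_cum,
    }


# Hypothetical figures, for illustration only.
example = separate_levels(70, 40, 15)
for level, pct in example.items():
    print(f"{level}: {pct}%")
```

Separated this way, the bands add up to 100%, which makes it easier to see where students actually moved from one year to the next.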
This looks like progress. If so, is it meaningful?
The first thing to know is that the STAAR exam is based on a roughly 2,200-point scale. I attended a presentation by testing expert Dr. Walter Stroup in which he explained the issue with expanding point scales. A 40-question test scored on a 100-point scale is fairly transparent. But with STAAR, the same 40-question test is spread over a huge scale, so small differences in the number of items correct are stretched out to make them seem significant. He referred to this as making mountains out of molehills.
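A back-of-the-envelope sketch shows the stretching effect. It assumes the points are spread evenly across the 40 questions, which is a deliberate simplification: the state's actual raw-to-scale conversion is not described here and may not be linear. The function names are hypothetical.

```python
def points_per_item(scale_span, num_items):
    """Scale points one question is worth under an assumed even (linear) spread."""
    return scale_span / num_items


def items_behind_gain(scale_gain, scale_span=2200, num_items=40):
    """Roughly how many additional correct answers a scale-score gain represents."""
    return scale_gain / points_per_item(scale_span, num_items)


per_item_100 = points_per_item(100, 40)     # 2.5 points per question
per_item_2200 = points_per_item(2200, 40)   # 55.0 points per question

print(f"100-point scale: {per_item_100} points per question")
print(f"2200-point scale: {per_item_2200} points per question")
print(f"A 55-point gain is about {items_behind_gain(55)} more question(s) correct")
```

Under that assumption, a gain that sounds large in scale points can come down to one or two more questions answered correctly, which is why the number of items correct matters.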
The key questions to DISD are these:
In terms of number of items correct, what does this difference represent?
What fraction of the variance is sensitive to differences in instruction?
I've asked the District these questions and will let you know when I get a reply. It might be that there have been significant gains. Again, if so, what does this really mean?
The District's STAAR chart insinuates that standardized tests are sensitive to school inputs such as teacher quality. They are not. On the contrary, an article published in the journal Statistics and Public Policy stated:
[The] teacher-explained share of overall variability in standardized test score gains is estimated as low as 3%...with the most recent published estimates being in the range of 1-14%. (Pivovarova et al., 2016)
So that means 86-99% of the variability in test score gains is due to factors OTHER THAN the teacher. Unfortunately, the dominant other factor is test prep.
In this era of test-based accountability, I can't help but wonder: is DISD pressuring its teachers to improve STAAR outcomes?
That would be a travesty, because clearly STAAR does not measure what matters.