2021–22 Program Summary

Mike Preiner
Jul 28, 2022

The goal of The Math Agency is to close educational gaps in elementary school math for students from disadvantaged backgrounds. We do that via small-group coaching and family engagement. If you are looking for some more background, this Seattle Times article is a reasonable place to start.

We recently wrapped up our pilot programs for the 2021–22 academic year. When we ended in June, we were working with over 100 students across three different public schools. On average, students in our program doubled their previous academic growth rates. Students entered our program having learned an average of 0.6 grade levels worth of math per year. While they were enrolled, they learned an average of 1.2 grade levels of math per year.
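The growth numbers above are annualized: we compare a student's assessed grade level at two points in time and scale by the elapsed fraction of a year. A minimal sketch of that calculation (the function name and the example student are hypothetical, not our actual data):

```python
from datetime import date

def annual_growth_rate(start_level, end_level, start_date, end_date):
    """Annualized growth in grade levels between two assessments."""
    years = (end_date - start_date).days / 365.25
    return (end_level - start_level) / years

# Hypothetical student: assessed at grade level 2.4 in October,
# then at grade level 3.0 the following June.
rate = annual_growth_rate(2.4, 3.0, date(2021, 10, 1), date(2022, 6, 1))
# rate is roughly 0.9 grade levels per year
```

A student entering at the historical 0.6 rate and leaving at this pace or better is what drives the doubling in the averages.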

To better understand what this means, let’s dive into the details.

This year we operated at three different Seattle public schools: Northgate, Lowell, and Olympic Hills. We also started a pilot at Cedar Park but paused it after concluding it wasn’t a good fit. While many of the basic components were similar across the three schools, there were also key differences. The table below summarizes the basic pieces of each, along with important program parameters. It is worth noting that most of the students entered significantly behind grade level; this can be seen from their historical growth rates.

Program summary for our 2021–22 academic year program at three public elementary schools. Our growth calculations (and student counts) only include students who were in our program and fully assessed for at least 60 days.

There are a few things that stand out when looking at the programs across schools:

Both in-school and after-school formats can be effective. There are some natural trade-offs between these two formats: in-school allows us to reach every student in the school, while the after-school format seems to be more efficient (which isn’t surprising, given that it is all “extra” math). We plan to continue experimenting with both formats.

Remote coaching can work under the right conditions. The results from pandemic-related school closures make it pretty clear that fully remote learning is not a substitute for in-person school. However, our data shows that remote coaching can be quite effective in increasing academic growth rates. We suspect that low student/coach ratios and strong student-coach relationships are particularly important to an effective remote format. It is worth noting that there may have been some self-selection effects in our remote pilot: students who enrolled in the program may have been more likely to be those who would stay engaged. Again, we plan to continue experimenting with this format.

Dosage matters. We’ll be discussing this more in an upcoming post, but we find a pretty strong relationship between how much a student practices and how much they grow academically. Increasing the dosage (the amount of time students spend in the program) is a clear way to increase growth. **Update: this post describes the relationship between practice and growth in our most recent cohort.**

Of course, this high-level data raises some natural questions about the details. Let’s look at some of the biggest.

Are the growth measurements reliable? This fall, we compared the results of our internal assessments (using the Diagnostic component of IXL) to standardized test results, and the results were encouraging. We did the same comparison this spring and saw similar results, which we show below.

Spring MAP RIT scores versus IXL Diagnostics for 1st and 2nd graders enrolled in our program at two different elementary schools. The median absolute error of the linear fit is ~0.2 grade levels.
Spring SBA scale scores versus IXL Diagnostics for 3rd, 4th, and 5th graders enrolled in our program at two different elementary schools. The median absolute error of the linear fit is ~0.6 grade levels.

We can see a few things from the data:

  1. The general correlation between standardized tests and our weekly assessments matches what we saw in the fall. This gives us confidence when using the IXL data to measure student growth.
  2. There is a better match between IXL and MAP than between IXL and SBA. We also saw this in the fall, and it matches our recent analysis that showed there is more noise in the SBA assessment than the MAP.
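The comparison behind these plots is a simple one: fit a line between the two assessments' scores and look at the typical size of the residuals, converted back into grade levels. A sketch with `numpy` (the score pairs below are illustrative placeholders, not our student data):

```python
import numpy as np

# Illustrative only: synthetic (IXL grade level, MAP RIT score) pairs.
ixl = np.array([1.2, 1.8, 2.1, 2.6, 3.0, 3.4])
rit = np.array([165, 175, 180, 188, 193, 200])

# Fit RIT as a linear function of IXL grade level.
slope, intercept = np.polyfit(ixl, rit, 1)
predicted = slope * ixl + intercept

# Median absolute error of the fit, converted back into grade levels
# by dividing by the slope (RIT points per grade level).
mae_rit = np.median(np.abs(rit - predicted))
mae_grade_levels = mae_rit / slope
```

The median absolute error is less sensitive to a few badly-mismatched students than the mean, which is why we quote it in the captions above.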

The discussion of noise naturally brings us to our next question:

How variable are the results? While the average growth rate of students in our program was 1.2 grades per year, there was a lot of variability at the student level: some students showed negative growth while others grew over 3 grade levels/year. The full distribution of the results is shown below.

Histogram of the annual growth rate of the 114 students in our program who had been assessed for at least 60 days.

A key question when studying variability is whether we know what is causing it. In other words, do we know why some students grew at 2 grade levels/year and others at less than 1 grade level/year? The short answer is we think we understand some of the key factors; we’ll be discussing them in a future post. **Update: the future post is here!**

Is it enough to close gaps? Doubling academic growth seems like a good start. However, students typically come into our program 1–2 grade levels below where they should be. This means that it will take them a while to catch up if they gained “only” 1.2 grade levels per year. If our goal is to get them above grade level in one or two years, we’ll either need to add in a summer program (more on that in this post), or further increase our growth rates. Or ideally, both.
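The arithmetic here is worth making explicit: peers advance roughly 1 grade level per year, so a student growing at 1.2 levels/year closes the gap by only about 0.2 levels per year. A minimal sketch of that model (the function and the example numbers are hypothetical, assuming constant rates):

```python
def years_to_catch_up(deficit, growth_rate, peer_rate=1.0, summer_boost=0.0):
    """Years until a student reaches grade level, assuming constant rates.

    deficit:      starting gap in grade levels (1.5 = 1.5 levels behind)
    growth_rate:  student's growth in grade levels per school year
    peer_rate:    grade levels peers advance per year (nominally 1.0)
    summer_boost: extra grade levels per year from a summer program
    """
    net_gain = growth_rate + summer_boost - peer_rate
    if net_gain <= 0:
        return float("inf")  # the gap never closes at these rates
    return deficit / net_gain

# Hypothetical student entering 1.5 grade levels behind:
years_to_catch_up(1.5, 1.2)                    # ~7.5 years, school year only
years_to_catch_up(1.5, 1.2, summer_boost=0.4)  # ~2.5 years with a summer gain
```

This is why the combination of a summer program and higher in-year growth rates matters so much more than either alone.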

What’s next? Given that we’ve now shown significant increases in student growth at a moderate scale (over 100 students), there seem to be a few natural next steps.

  1. Measure the causal relationship between our program and student growth: we want to separate the impact of our program from other factors that could influence student growth. These factors could include a general “post-COVID bounce”, changes in curriculum, effects from individual teachers, or the selection bias discussed above. In a previous post we discussed an experimental format that should allow us to isolate our causal impact on students.
  2. If we can sustain our current growth rates and add in a summer program, we should be able to get a significant fraction of our students to grade level within two years (you can see some modeling around that in this post). This would make our impact crystal clear in public assessment data.
  3. Keep improving the program! We have a lot of improvements in the works that we think will materially increase our impact on students. Of course, we won’t know for sure until we measure it. :)

To recap, during the 2021–22 school year students in our program doubled their academic growth rates.

This included over 100 students in three different elementary schools. We have a lot of evidence that our measurements match up to other assessments of academic performance, and based on these numbers, we think it is possible to make significant progress in closing educational gaps at our partner schools going forward.



Mike Preiner

PhD in Applied Physics from Stanford. Data scientist and entrepreneur. Working to close education gaps in public schools.