Attempting to close educational gaps! Part 2: The Program

Mike Preiner
Nov 20, 2020

Update: since I first wrote this post, I came across a great book by Zaretta Hammond: Culturally Responsive Teaching. It covers in detail many of the concepts I outline below.

In Part 1 of this series we described Lowell Elementary, a public K-5 school in Seattle where we are attempting to fully close educational gaps in math. In this post I’ll describe the actual program we are using and why we think it may work.

There are a surprisingly small number of programs and interventions that have been rigorously shown to improve math learning for K-12 students. We are trying to craft a program that combines three of the most successful methods in a way that uses the strengths of each to reinforce the others. We’ll start with a brief outline of the three individual components.

Metric alert: for the rest of the post I’ll talk about effect sizes in terms of standard deviations (SD). A gain of 1 SD in test scores corresponds to moving a student from the 50th percentile to roughly the 84th percentile, a gain of about 34 percentile points.
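To make the conversion concrete, here’s a minimal sketch (my own illustration, not from the cited studies) assuming normally distributed scores; note that a fixed SD gain moves students near the middle of the distribution the furthest:

```python
from scipy.stats import norm

# How far a 1 SD gain moves a student, from several starting
# percentiles, assuming normally distributed test scores.
for start in (0.10, 0.25, 0.50, 0.75):
    new = norm.cdf(norm.ppf(start) + 1.0)
    print(f"percentile {100 * start:2.0f} -> {100 * new:2.0f} "
          f"(+{100 * (new - start):.0f} points)")
```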

Program Components

1. Regular Math Practice

This one is pretty easy to understand: the more kids practice math, the more they tend to learn. Many recent studies have looked at this in the context of personalized learning software (such as IXL, Khan Academy, or DreamBox), and they all show a pretty common-sense trend: growth in math skills is directly proportional to the time spent practicing. However, every study I’ve seen also highlights a huge problem: most kids don’t practice math all that much. Our next two components help address that issue.

High-level effect summary: usage of ~30 min/week corresponds to gains of ~0.15 SD on standardized math tests [1,2].
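Taken at face value, that’s a rate of about 0.005 SD per weekly minute of practice. Here’s a back-of-the-envelope sketch of that relationship; extrapolating linearly beyond the dosages actually studied is my assumption, not a result from [1,2]:

```python
# Implied dose-response rate from the cited studies:
# ~0.15 SD of gain at ~30 min/week of practice [1,2].
SD_PER_WEEKLY_MINUTE = 0.15 / 30  # 0.005 SD per (min/week)

# Linear extrapolation to other dosages (an assumption).
for minutes_per_week in (15, 30, 60, 90):
    gain = SD_PER_WEEKLY_MINUTE * minutes_per_week
    print(f"{minutes_per_week:>2} min/week -> ~{gain:.2f} SD")
```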

2. Math Skill Development

You can’t just assign a 2nd grader calculus and expect them to learn it. One of the barriers to productive practice is the need to first learn the content. Many of the personalized learning tools provide some ways to learn the content, but studies have repeatedly shown that one of the best ways to learn math is via small-group tutoring. Our current program at Lowell is built around 1:1 tutoring to ensure that the students are learning the content.

High-level effect summary: 16-week tutoring programs of 90 min/week have been shown to produce gains of ~0.27 SD on standardized tests [3].

3. Relationship Building

You can lead a horse to water, but you can’t make him drink. Practice, and even tutoring, won’t work unless the student is motivated to learn. This is a reason for the recent interest in “mindset interventions” and many aspects of social-emotional learning (SEL). More generally, relationships are a core motivator of a lot of behavior, and having a strong relationship with a student will have a big impact on how much they learn. This is a pretty new area of research and there is not as much established data as for the other two components, but it is clearly important. We are trying to intentionally build strong relationships to help motivate student learning.

High-level effect summary: interventions that give students more of a “growth mindset” have shown gains of 0.05 SD on the GPAs of low-performing students [4].

Now that we understand how the interventions work individually, let’s look at how we’ll use them together!

Assembling the Pieces

Below is a diagram showing how the different components feed into each other.

Building strong relationships with the students is important for developing the motivation needed for both skill development and independent practice, while 1:1 tutoring makes sure the skills are in place for the students to work independently.

What does this mean in terms of outcomes? If we assume the effects of each component simply add together, we’d expect a total improvement of 0.47 SD (0.15 + 0.27 + 0.05 = 0.47). To put this in context for Lowell, the chart below shows Lowell’s actual average math scores alongside what they’d look like with a 0.47 SD improvement. An improvement of that size would put Lowell near the top of the entire state when it comes to math scores.

Fraction of students meeting standard in the math portion of the SBA for every 5th grade class in Washington state, plotted as a function of the fraction of low-income students. Dot size is proportional to the number of students in each 5th grade class. Lowell is shown in orange. The green dot shows what Lowell’s proficiency rate would look like with an improvement of 0.47 SD.
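As a rough sanity check on that additive estimate (the chart is based on Lowell’s actual scores; this toy normal model is mine and is illustrative only), here’s where a 0.47 SD shift would take students from a couple of starting points:

```python
from scipy.stats import norm

# Stack the three effect sizes additively (an assumption, revisited
# below) and shift students by the total, assuming normal scores.
total = 0.15 + 0.27 + 0.05  # practice + tutoring + mindset = 0.47 SD

for start in (0.25, 0.50):
    new = norm.cdf(norm.ppf(start) + total)
    print(f"percentile {100 * start:.0f} -> {100 * new:.0f}")
```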

Time for the Bad News

However, this may be a pretty optimistic scenario for a few reasons. The most fundamental is that we are assuming all students in the school see the same effects from the program, but clearly students who already have a very high chance of passing the SBA won’t see a big increase in their pass rate (a ceiling effect). To eliminate this issue, we’ll need to analyze student-level data, which we don’t have at the moment. However, even after accounting for differences at the student level, there are still at least two other big risks to seeing our estimated improvements in math scores.

What are the risks?

The biggest risk is one inherent to all educational interventions: it seems really challenging for “real world” implementations to achieve the same effect sizes seen in academic studies. To quote a passage from a 2017 NBER paper on “The Economics of Scale-Up”:

Most randomized controlled trials (RCT) of social programs test interventions at modest scale. While the hope is that promising programs will be scaled up, we have few successful examples of this scale-up process in practice.

Another risk is that “stacking” multiple interventions hasn’t really been studied before; it could be that the whole is less than the sum of its parts. At this point, we really don’t know for sure. In other words, that theoretical gain of 0.47 SD may end up being quite a bit less.

Ok, a lot of things seem stacked against us…what makes us think that it may be possible to actually exceed our initial estimates of progress? Well, it turns out we have a secret weapon.

Our Secret Weapon: Academic Intensity

Ok, maybe this is less of a “secret weapon” and more of an old-fashioned “lots of hard work” strategy. The research shows that increasing academic intensity is the most reliable way to improve outcomes. A landmark 2006 Department of Education research paper on postsecondary outcomes puts it this way:

The academic intensity of the student’s high school curriculum still counts more than anything else in precollegiate history in providing momentum toward completing a bachelor’s degree.

I haven’t seen anything to suggest that academic intensity shouldn’t also be the most important predictor of success in elementary school. How does this apply to us? There are two ways in which we could dramatically increase the academic intensity of our components compared to the previous research:

  1. We can increase the amount of independent practice. The studies on personalized learning software all show that gains increase with the amount of practice. What if we could get our students to independently practice for 60 min/week or more? That would significantly increase the amount of benefit they get from this piece of the program.
  2. We can run our intervention for a longer period of time. The studies on tutoring mentioned earlier typically involve relatively short (16-week) programs. If the gains increase with the length of the program, we could simply run it longer. If we are willing to use summer months, or run the program across multiple years, it could translate into gains of 4x or more compared to previous studies (see the sketch after this list for how these knobs might combine).
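Here’s a sketch of how those two knobs might combine, reusing the effect sizes from above; linear scaling in both practice time and program duration is an assumption, not something the cited studies tested:

```python
# Back-of-the-envelope scenario planning (all scaling assumptions mine).
BASELINE_PRACTICE_SD = 0.15  # ~30 min/week of practice [1,2]
BASELINE_TUTORING_SD = 0.27  # one 16-week tutoring program [3]
MINDSET_SD = 0.05            # growth-mindset intervention [4]

def scenario(practice_mult: float, duration_mult: float) -> float:
    """Total gain if practice time and tutoring duration scale linearly."""
    return (BASELINE_PRACTICE_SD * practice_mult
            + BASELINE_TUTORING_SD * duration_mult
            + MINDSET_SD)

print(f"baseline:                 {scenario(1, 1):.2f} SD")
print(f"2x practice time:         {scenario(2, 1):.2f} SD")
print(f"2x practice, 4x duration: {scenario(2, 4):.2f} SD")
```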

What this means is that even if each individual program doesn’t have the effects we’d hope to see, there are knobs we can turn to make the pieces more effective. However, at this point we should step back and see if it makes sense from another viewpoint…

Is it realistic from an individual student perspective?

Instead of talking about average effect sizes and standard deviations, let’s just look at how many students are learning the math skills they need (or not). In the 2018–19 5th grade class at Lowell, there were 13 students who were proficient in math and 26 who were not. Of those 26 students, our theoretical gain of 0.47 SD is equivalent to making 19 of them (~75%) proficient, which would bring the class to 32 of 39 students (~82%) meeting standard. In other words: is it possible to take 75% of the students who aren’t meeting standards and help them become proficient? We’ll dig deeper into this in subsequent posts, but initial results with our first cohort at Lowell suggest that this isn’t crazy.

What does it all mean?

Even using the best studied programs and interventions, there are a lot of unknowns in what we should expect to see with student outcomes. However, we have some reasons to believe that by stacking multiple programs together, and by increasing the academic intensity (relative to previous work), we may be able to grow students’ math skills at Lowell Elementary enough to make it one of the best performing schools in the state. We think this could help demonstrate that it is possible (and maybe even practical) to fully close educational gaps in schools, even those with the most disadvantaged students.

Interested in how things look after the first 6 weeks? You can read about it here.
