Why Big Ideas Fail To Scale—And How To Fix It with John List: Big Brains podcast

Show Notes

Solving problems like poverty, education inequality or discrimination requires policy interventions that can scale, but they rarely do. Why do some scale, while others have little success? It's not luck, it's not skill, it's actually a scientific method, or at least that's how Prof. John List describes it.

A world-renowned economist at the University of Chicago, List has helped scale some big policies and technologies as a former White House chief economist and the chief economist for both Uber and Lyft. Through his experience, he's observed a thing or two about what not to do.

In his new book, The Voltage Effect: How To Make Good Ideas Great And Great Ideas Scale, List lays out five key factors that people should consider when thinking through any idea, policy or product. He explains how shifting from evidence-based policy to policy-based evidence can solve some of the world’s more pressing issues.

(Episode published February 17, 2022)

Transcript:

Paul Rand: Poverty, education inequality, public health disparities. These are just some of the problems that social science has been trying to solve for decades.

John List: We’ve been attacking poverty for 50 years.

Paul Rand: And yet after all that time, they’re still with us.

John List: And it’s not for lack of ideas because in the academy we are replete with ideas that work in the Petri dish.

Paul Rand: That’s the University of Chicago economist, former White House chief economist, and chief economist at Lyft, John List. The last time John was on our podcast, he ended with a cliffhanger. His next project was solving the most maddening mystery in social science: most of our great ideas, proven research, and policy interventions completely fall apart when we try to scale them.

John List: For me as an economist, I choose one community and I figure out what works in that community. I want to scale to all of the communities around Chicago, or all of the communities around Illinois, or the Midwest, or America. How confident can we be in the initial results actually manifesting at a larger scale when we go bigger and bigger and bigger? If we don’t explore whether those great ideas from the beginning will actually scale, what do we have?

Paul Rand: After years of research into this problem, List is back with a solution in the form of a book, The Voltage Effect: How To Make Good Ideas Great And Great Ideas Scale.

John List: Typically we have a voltage drop. These results look great. In the initial study, it looks like a mountain. But then when I scale it, it turns into a molehill. That’s the voltage effect.

Paul Rand: List says we need to revolutionize our entire system of research and policy making to focus on voltage, or scalability.

John List: What happens in the medical community is you have phase one, phase two, phase three. We don’t do that in the social sciences. We publish an idea, and when it gets published in a journal, it’s almost as if it’s magical. Policy makers and business people say, “It’s published in a peer-reviewed journal, it must be the truth.” What I’m saying is it might be the truth. It might not be. But it’s certainly not telling us, when we scale it, whether it’s the truth or not.

Paul Rand: From the University of Chicago podcast network, this is Big Brains, a podcast about the pioneering research and the pivotal breakthroughs that are reshaping our world. On this episode, the science of scaling social policy. I’m your host, Paul Rand.

Paul Rand: List has been using outside-the-lab economic studies to identify reliable and replicable interventions to social ills for decades. But it wasn’t until after one of his most ambitious projects in 2008, that he started to realize exactly how acute the problem of scaling these ideas actually is.

John List: There’s a community called Chicago Heights. And it’s a community that the world’s economy has left behind. This is a manufacturing community, and the manufacturing jobs have left. You have many buildings that are boarded up. You have homes that are in shatters. So they reached out and said, “John, can you help us?”

John List: The first shocking thing to me about the community is that for every thousand kids who start high school, only about 480 of those kids will graduate from high school. So we started doing some high school experiments, and we had some luck. A little bit here, a little bit there. But what we realized was that you can’t be a big change in a child’s life if you approach them when they’re 15 and they’re reading like they’re seven years old. Or they’re doing math like they’re eight years old. Their potential is long gone. And it’s a shame. It’s a real problem that we have.

Paul Rand: List and his collaborators wanted to see if they could change the trajectory of these left-behind kids through unique educational methods early in their lives.

John List: And this is a program that I started with Roland Fryer from Harvard University, and Steven Levitt of Freakonomics fame.

Paul Rand: So naturally they decided to create their own preschool.

John List: So Roland and Steve and I start this preschool from scratch. So as you can imagine, I’m the one on the ground, and I’m doing all of this work.

Paul Rand: Probably literally in some cases.

John List: Literally, exactly. Sometimes I was passing out on the ground. So we opened our doors in 2010. 2014 rolls around, and the results are great. They just look fantastic. I mean, these kids are killing it.

Paul Rand: Their intervention worked.

John List: Here comes the slap in the face, Paul. So I now go and start to talk to policymakers. The first response is, “Professor, your results look great, but they’re not going to happen at scale.” I’m like, “What?! I’ve been doing field experiments for 20 years, 25 years by this time. What are you talking about?” “You know, it just doesn’t have the silver bullet.” I’m like, “Well, whoa, whoa, whoa, whoa, whoa.” What is this silver bullet? And where do I find a few dozen of them? Because I want to change the world.

Paul Rand: Right. Right.

John List: But where they were exactly wrong is that it’s not a silver bullet problem. In fact, it’s quite the opposite. It’s what I call an Anna Karenina problem.

Paul Rand: You don’t get many economists quoting Tolstoy here.

John List: Look. The best, the very best, opening line ever in a novel: “Happy families are all alike. Each unhappy family is unhappy in its own way.” So you can think about scalable policies are all alike. Each unscalable policy is unscalable in its own way. But my book unpacks the five major reasons why a policy is unscalable. And in that form, this is not a silver bullet problem. It’s a weakest link problem.

Paul Rand: So List went to the academic grindstone to see if he could uncover the science of scale. He pored over research and tore apart successfully scaled ideas, filling chalkboards with opaque equations, and graphs, and tables that explain the secrets of scalability. But he’s managed now to boil it all down to five understandable weak links. If research or a policy can avoid these, it’s likely to scale.

John List: Now the first one is false positives. There are cases where there was never any voltage in the first place, though it appeared otherwise.

Paul Rand: The best way to understand how this can happen is to look to history.

John List: Turn back the clock. September 14th, 1986. Nancy Reagan decides that she wants to take on the problem of drug use amongst teens.

Nancy Reagan: Not long ago in Oakland, California, I was asked by a group of children what to do if they were offered drugs. And I answered, “Just say no.”

John List: Now this is really the birth of the DARE program.

DARE Speaker 1: Yeah. To keep a kid off drugs, yeah.

DARE Speaker 2: Drug Abuse Resistance Education is a prevention program that works.

John List: I can still remember in high school, an agent came in and told us about how bad drugs were, et cetera.

DARE Speaker 2: DARE is based on the belief that a young person will choose to avoid drugs and alcohol abuse if they are given both the right information and the skills they need to resist the pressure to use illegal drugs.

John List: I looked at my school teacher and said, “I don’t use drugs, but I have a lot of friends who do. No way this will work.” Teacher looked back and said, “John, I hear you. But they say they have data.”

John List: They had this really nice study from Honolulu that had like roughly 1,777 kids. It showed signs of working, but the problem was it never replicated, and they never tried to replicate it before they rolled it out. So this is a program that sounds great, looks great, but the data were lying.

Paul Rand: And this really illustrates the huge costs we pay when we don’t confirm voltage before scaling.

John List: We spent millions, hundreds of millions of dollars on this program. We spent many, many people hours.

Paul Rand: The cost can be especially dramatic when the government scales false positives.

John List: Once you scale an idea in government, it’s really hard to take it back. By the time you figure out, “Well, you know what? This thing isn’t really working,” that’s five or 10 years in. There are entrenched interest groups. Some of them are getting paid to do it. And then they lobby and fight not to take back that program.

John List: Second. Another reason why the stakes are so big is because if we roll out a program that doesn’t scale, that means that we’re forgoing rolling out a program that could have scaled because we tend to prioritize projects and we can only roll out a finite number of ideas at a time.

Paul Rand: So the data. In this case, they really didn’t check the data. They expand it out further. You’ve also talked in this instance that sometimes the data’s not lying, but maybe the researcher is. Is that actually a thing?

John List: Yeah. You know, it is a thing. A paper that I wrote 20 years ago was a little bit prescient. It’s called “Academic Economists Behaving Badly.” So back then I did some work on what fraction of the survey respondents’ work was fabricated. And then I asked them what fraction of papers in journals do you think have data that are fabricated? And I received really high numbers.

Paul Rand: Oh my gosh.

John List: Yeah. For their own research I received between 4 and 5%. And for research in the academic community, they said 7 or 8%. So to me, that was alarming. And now with the replication crisis upon us...

Paul Rand: For all the nonacademics out there, the replication crisis is one of the most concerning trends in social science today. Basically, we’ve begun to discover that many findings that have become foundational truths turn out not to replicate when the experiments are tried again.

John List: The replication crisis is part of my first vital sign, because that’s really what we’re talking about here, is if you have an initial result and it’s not replicable, whether at scale or in a different Petri dish, that’s a problem. Because we’re not generating solutions that can change the world. If we would’ve had in place a system like medical trials, we wouldn’t have that problem. So we have to fend off both, let’s say, the nefarious types that I talk about, the dupers. But also the non-nefarious types. And they both happen.

Paul Rand: Okay. So not replicating results to look for false positives, either from lying data or lying researchers, that’s weak link number one. What’s weak link number two?

John List: Typically we start out our ideas by exploring one kind of population. A lot of times scientists won’t tell you this, but they do an efficacy test. So they’re really giving their idea its best shot. So they’re choosing people who it might work for. They’re choosing situations that it might work in, and then they write their academic paper. But then they forget to tell us that it was an efficacy test.

Paul Rand: There was a story that you talked about, Joseph Henrich and clinical WEIRD people that illustrates this a little bit.

John List: No, absolutely. So in the mid-1990s, Joseph Henrich, he’s a bright PhD student in anthropology. And for his field research, he goes out to Peru to conduct some work with the Amazonian community.

John List: So he decides to run some behavioral economics experiments to explore whether the results that people were finding in the lab hold up. And what I mean by in the lab is in the Western society schools, the sophomores in college. So he starts to find results that are at odds with these college sophomore results that people had been talking about for decades.

Paul Rand: We often underestimate how our differences can shift scientific findings that seem rock solid, especially when we get excited about interventions that could save lives. An anti-poverty program that may have positive results in Appalachia shouldn’t be scaled to villages in Africa.

John List: Are you choosing a WEIRD population that will give you the result that maybe you’re searching for? And it really does not generalize to the non-WEIRD populations. Because let’s face it, the non-WEIRD population, that’s the majority of the world. Right? So when we find a result with the WEIRD groups, who’s to say that result is going to work when we take it to sub-Saharan Africa, or take it to Asia or wherever. Right? And this is all about understanding your population, and whether it’s representative of where you want to target your insights.

Paul Rand: Okay. Well, this isn’t limited, of course, to policy issues like this. Because corporations with all of their money, i.e. McDonald’s, also have things like this they could run into, don’t they?

John List: Yeah. Yeah. So the McDonald’s story is, turn back the clock to the mid-nineties and McDonald’s is talking about putting out a new sandwich called the Arch Deluxe.

McDonald’s Ad Speaker 1: You really want to get to McDonald’s today? Two words.

McDonald’s Ad Speaker 2: Arch Deluxe.

McDonald’s Ad Speaker 3: Care to join us?

McDonald’s Ad Speaker 2: Introducing the burger with the grownup taste. McDonald’s Arch Deluxe.

John List: They bring in a focus group and that focus group tells them, “This is a great burger. We love it.” So they roll it out, and guess what? A huge flop. A huge flop, because their focus group... And a lot of firms do this. A lot of firms bring in a non-representative focus group and they end up thinking, they dupe themselves into believing that that slice of the pie will be representative of the broader population. And in this case, that cost the CEO of McDonald’s his job, because they lost millions of dollars. Yeah. People wanted a hamburger, or a Filet-O-Fish, or a Big Mac. They didn’t want the Arch Deluxe.

Paul Rand: You know, related to this is your third weak link, which is this idea of whether that success depends on a unique set of circumstances that can’t be replicated. And we talked about this in some of the earlier examples, but what does this mean, that can’t be replicated?

John List: So the example that I use is, “Is it the chef, or is it the ingredients?” Part of my research looked at restaurants because restaurants always try to scale.

Paul Rand: List uses the example of famous British chef, Jamie Oliver, and his chain of Italian restaurants.

British Speaker 1: At his peak, the total number of Jamie’s Italians in Britain had grown to 42 and it was turning over 108 million pounds a year.

John List: You set up one restaurant, the chef is there, the master chef, and you’re killing it. And then you try to go to 10 restaurants or 30 restaurants or 50 restaurants.

British Speaker 2: Today from the south coast through to Scotland, the closed signs went up. A chain carrying just one name now leaving a thousand out of work.

John List: If the original success was due to the chef and the chef was so unique that there is no way that you can replicate that chef at all of these other restaurants, you’re done. You’re done because humans don’t scale. And it’s very difficult to train humans to have a unique element.

Jamie Oliver: I was good at running one restaurant, but I wouldn’t call myself a businessman.

John List: Unique is important here. So think about Chicago Heights. What I did in Chicago Heights is I hired 30 teachers in the exact same way that Chicago Heights would hire them. Because I was thinking, “Look, I don’t want superstar teachers, because I don’t want to test a program with superstar teachers. I want to test a program and see if it will work with teachers that Chicago Heights could hire.”

John List: So that’s good if I’m primarily interested in horizontal scaling. And what I mean by that is, I find a great result in Chicago Heights, will it work in Dayton? Will it work in Atlanta? Will it work in New York City? So I’m going across markets.

John List: But the other part of scaling is vertical scaling. What if I want a lot of preschools around Chicago? So now instead of having to hire 30 teachers from this market, I have to hire 30,000 teachers. Now I’m dead in the water because my program will work with 30 school teachers that Chicago Heights can hire. But if you hire 30,000 and you want to keep the same budget, I can’t hire teachers that good. So really if I was interested in vertical scaling, I should have also explored, “Does my program work with marginal teachers?” With the types of teachers I would need to hire if I vertically scale?

Paul Rand: List says this is going to require a reversal from thinking about evidence-based policy to policy-based evidence.

John List: Look, evidence-based policy is great because what it means is we’re using data to help inform decisions. That’s super. I mean three decades ago, we weren’t doing that. So congratulations. Good for humanity. Good for America.

John List: But that’s not the only step you need to take to scale and effect change in the large. You need to reverse that and say, “What are the constraints that I will be facing when I scale this?”, and bring those back to the original research design and say, “Does my idea still work with those constraints in place?” If yes, I can scale it. If no, change your program.

Paul Rand: The last two weak links and whether List’s work is getting real-world traction with policy makers and researchers after the break.

Rustandy Center Ad: Are you a nonprofit board member seeking actionable steps to accelerate your impact? Chicago Booth’s Rustandy Center for Social Sector Innovation offers The First 90 Days, a free nonprofit board toolkit that provides a roadmap for new board members. Learn more and download Rustandy Center’s nonprofit board toolkit at bit.ly/nonprofitboardtoolkits.

Paul Rand: Link number four is this idea of spillovers. So we’re whittling off the things that you have to make sure don’t exist. What is spillover, and how does that affect scaling?

John List: Spillovers are so multidimensional. Let me start with the simplest one and that’s called the Peltzman effect. So Sam Peltzman is a Chicago professor, an old... When I mean old, I mean it in a very nice way. A very mature, wonderful human being, and brilliant. So Sam starts doing work back in the sixties where he looks at the effect of mandatory seatbelt laws. And those mandatory seatbelt laws are, of course, meant to save lives. Back then, motor vehicles didn’t have seatbelts. Some of them didn’t even have seatbelts in them.

Paul Rand: Because it would tell us the cars were dangerous.

John List: Exactly, exactly. So you drove a little bit safer. But when they put seatbelts in them, what Sam noticed is people started to drive more dangerously. That was an unintended consequence. That’s one kind of spillover.

John List: Here I’m going to go to my days as chief economist at Uber. And there was something really bad that happened to Uber back on January 27th, 2017.

Uber Speaker: For Uber 2017 has been a year filled with controversies and speed bumps. The ride hailing giant in January kicked off the year on a sour note, reportedly losing 200,000 customers after Uber refused to strike during the protest against President Trump’s proposed travel ban.

John List: What it resulted in was a #DeleteUber campaign. It went viral and it basically killed Uber in terms of a lot of drivers left, and a lot of riders left, for Lyft. Travis comes to me, the CEO and founder of Uber and says, “John, you and your team need to get the drivers back.” I come up with the solution. We need to add tipping into the app. We beta test it by going to some markets and giving 5% of drivers the right to receive tips in the app.

John List: We find great results. The drivers make more money and they work more. So it’s like win-win. Then we scale that up to not only 5% of the drivers, but all of the drivers in the market. And then what happened was something, according to economic theory, magnific. Because they worked more, so the labor supply curve shifts out [inaudible 00:21:12]. And it shifted out so much that drivers were driving around with empty cars more often, and that effect undid the entire positive wage effect of tipping.

Paul Rand: The spillover effect.

John List: The spillover effect. And it’s what economists call the general equilibrium effect.

Paul Rand: Of course, sometimes these spillovers can actually be positive. The important thing is to make sure you’re constantly vigilant for spillovers as you scale.

John List: At this point, all of the Chicago economists who are listening to your show, Paul, are pulling their hair out because they’re like, “Oh my God, where is the Chicago economics?” This is a market. You have demand. And you have supply.

Paul Rand: Which brings us to the final weak link. The supply side economics of scaling.

John List: What’s interesting is when I started working in the area of implementation science, which is a new area, and it tends to be psychologists who are working on the scaling of ideas. I said, you know, I want to add economics to this world. So we started writing economic models and doing economic data analysis. And when I introduced the supply side, people were really surprised, and they said, “Wow, we’ve just been focusing on benefits, and whether there was a voltage drop in benefits.”

John List: When you talk about public policy decision making, it’s benefits and costs, and it’s something that literature entirely ignored. So I go after it here and it’s really just unpacking Econ 101.

John List: Look at your idea. And then look at the supply curve and what the marginal cost curve looks like in terms of, as you grow, does it become cheaper and cheaper and cheaper to produce? If so, let’s check off that box, because that has great economies of scale. If not, now you have to consider something very important. Is there a way I can change the production process? Is there a way I can change the good itself? Is there a way I can change the program to where I’m not going to have this steep rising cost?

John List: An example that you can come back to is Chicago Heights. When I go from 30 great teachers to 30,000 great teachers, something has to give. If I want to maintain quality of the teacher, I’m going to be going up the supply curve. Why? Because I’m going to have to take people from a trading desk at the Merc. I’m going to have to take people from Wall Street, the Silicon Valley types. Why? Because that’s the only way I’m going to be able to keep the quality high. And what am I going to have to do to attract them? Raise their wage.

John List: That’s a supply side problem now. So if my idea relies on a wealth of human capital, or really great people, get ready because you have to spend more and more money, and that can undo your entire voltage effect. Because remember, the voltage effect is as much about the supply side as it is about the demand side.

Paul Rand: As we’ve shown, the scalability of policy has real changing implications. How far would we have come in eradicating poverty, public health inequality, or discrimination if we focused on scalability from the beginning? So the question is, will List’s work be taken to heart by policy makers?

John List: Boy, oh boy, too early to tell. What I can say is I’ve probably now given seventy talks since September around the book. The reception has been pretty warm amongst VC types, VC types that say, “Wow, I never thought about it like that. We’ve always thought about scaling as these kinds of numbers and then a heavy dosage of art.”

John List: I’ve presented it to firms and CEOs. So far so good. I presented it to places like the World Bank and various governments, governments like the Chilean government, and the Aussie government have been receptive so far. So it’s too early to tell, but I think it has a chance.

Paul Rand: And what about at the root of these issues, the people whose work policy makers use to identify programs? Does List expect his work to gain traction amongst researchers and academic circles?

John List: Yeah. I think where I’m going to have traction right away is that if you’re going to say my research has policy implications, I’m going to be holding people because I show you how you can, you need to envision the constraints at scale and bring those back in your design. So now I’ve given them the map to how you can be sure to say, “I have policy implications,” because you can test that now in your experiment.

John List: I’m going to need funders in particular to hold people, foundations, hold people to the fire and say, “If you really want to change the world, your idea has to check these boxes.” And then with journals, that’s starting to happen more and more in economics, is that replication is becoming more and more important.

John List: I’m starting a new journal called the Journal of Political Economy Micro, which will have a replication paper in it in every issue. As we more and more open up to the top outlets, the top academic outlets publishing it, I think then there are more incentives to do it.

John List: So I think you need funders. You need journal editors and you need the profession saying, “This is a good thing to do.” We’re moving in that direction, but it’s much, much slower than it should be.

John List: So the voltage effect, I believe, will be, in the end, something akin to an economic law. So in economics, we’re not as cool as the hard scientists. We don’t have quantitative laws. We tend to have qualitative ones. You know, we have law of demand, law of supply. I think the voltage effect law, which is, when you scale it, it will change, and I can give you predictions about how it will change based on the signatures of the idea itself.