Anarchy and Overregulation in American Education
A structural theory of America’s education dysfunction | Part 1
Theories of Progress is a series from Education Progress building the intellectual framework for durable education reform: why the system resists improvement, what excellence actually requires, and what history tells us about how to build it.
TODAY’S WORLD is a world of regulations. For better and worse, the rise of the administrative state — a “fourth branch” of government sitting uncomfortably between Congress and the Presidency — has been a defining transformation of the past century. For a country of America’s size, such a transformation would have been impossible without the technological and social progress that accompanied it. Modern, coordinated workforces cannot be built effectively from populations fractured by prejudice or poverty, and even the freest of societies would be worth little to their people if they lacked the power to resolve collective action problems, or synthesize the fruits of labor and innovation into what we now call progress.
But as our world grew faster-paced, our technology more powerful, and our communications more rapid, it also shrank. A kid from Oklahoma now might go to California for school, then to Washington to raise a family, and then finally to Florida to retire. Widespread techno-optimism ran into the toxic byproducts of its own wild successes, and we can now even cure the side-effects of being able to make more food than we could ever eat. We also went from a poverty of available information to drowning in algorithmic entertainment. These dynamics should be blamed neither solely on capitalism nor on the now-discredited system called communism. Modernity itself is just a double-edged sword.
So too is the administrative state. Depending on your political priors, though, you might see regulations and the administrative state as either the primary drivers or arch-nemeses of progress. Personally, I think the story is complicated, and the answer is probably some mixture of “both and,” depending on the context. Every industry expert can provide horror stories of regulatory failures impeding progress, but the vast majority of regulators exist because our legislators — our infamously dysfunctional, self-effacing, slow-moving Congress — felt the need to pass a law so that the Executive could do something: do something about pollution, do something about discrimination, do something about food and drug quality. And despite all the controversies, setbacks, and abuses of power administrationism brought with it, it also facilitated progress. Incentives were created so that populations and programs could be served by market forces and government agents that might have had no reason to go “there,” or felt no responsibility to serve “them.” The result is that we live in a society our grandparents hardly could have imagined.
THE STORY is remarkably different for education. The theory I will try to articulate over a few posts, starting with this one, is that education has been uniquely misregulated — so much so that we’ve landed in a paradoxical position where the educational landscape should be understood as a system that is simultaneously anarchic and overregulated. Education is drowning in regulations governing licensing, accreditation, accommodations, civil rights, and funding, and yet so few of these touch what arguably matters most in schools, like whether a reading or math curriculum actually works as described, or whether a student is being placed in classes that best reflect their ability and need.
This paradoxical situation, I argue, can help explain why durable education reform has proven so elusive, why reform efforts have been so ineffective, and why the diagnoses and solutions are so cyclical. To do so I’ll borrow a framework from one of my favorite international relations theorists, Kenneth Waltz, which he used to explain why wars happen. His key insight in Man, the State, and War was not merely that wars have multiple causes operating at different levels, but that reforms targeting only one level are structurally doomed to fail. The problem, he argues, is that the anarchic structure of international relations serves as a permissive cause of war: war happens because there isn’t a higher power that prevents it. War is not inevitable because human beings are inherently flawed, nor is it preventable by improving their individual characters. Neither can it be prevented by just making every state a democracy, or by making them all rich and interconnected through free trade. The problem is a structural feature of the system in which they operate.
Borrowing his frame, I sketch out three different “images” of education reform that map onto different parts of the landscape today:
First-image reforms identify practitioners, teachers, and students as the primary instruments of educational dysfunction, failure, or success, and thus frequently focus on constituting the right kinds of individual actors.
Second-image reforms view the problem in terms of how institutions are composed and arrange themselves, including questions over what types of schools exist (public, private, charter), or the kinds of resources schools, districts, students and families have at their disposal (funding, student populations, choice, etc.). Change the institutional arrangements and structures, and you can reform education.
Third-image reforms identify system-wide problems and propose correspondingly system-wide solutions, like top-down accountability and testing (No Child Left Behind being the quintessential failed example, and the Southern Literacy Surge reforms a promising effort that has emerged over the last decade).
There is something to each of these three images of reform. But meaningful progress in education will not materialize if reforms fail to affect core parts of all three images. The crucial piece that’s missing — a national architecture of educational quality-control — would be a kind of third-image reform that, crucially, also spans the other two images, as I’ll try to describe below. Thankfully the situation is not as bleak here as it is in international relations, because anarchy is a contingent feature of our education system, not a necessary one. The illegibility coursing throughout is the result of a series of policy choices that were made or not made, and so can be fixed.
The rest of Part 1 will be spent sketching out this three-image theory of education reform, identifying the promises and shortcomings of each one in isolation, and then presenting an account of what the current system lacks.
Part 2 will examine the shadow powers that have filled the void left by a non-existent national education quality-control architecture, drawing parallels to Renaissance Italy’s politics and power-struggles to illustrate why certain pathologies appear so difficult to dislodge.
The final post, Part 3, will offer some ideas for navigating what I call the Paradoxical American Renaissance that we find ourselves in, and examine what it might take to build the kind of architecture that our system currently lacks.
The risk, of course, will be creating yet another system vulnerable to regulatory or ideological capture. Part of the point of this first essay, though, is that the problem runs deeper than an absence of some “educational FDA.” Rather, it is education’s lack of common vocabularies, agreed-upon methods, and legible aims that would make such a regulatory agency imaginable in the first place. The hope is that telling the story as a series of regulatory and institutional failures will illuminate what a possible solution might be.
Image 1: Practitioners and the Information Void
Teachers and students are, intuitively, the two most important parts of the education ecosystem, and so most efforts to improve the schools naturally start in the classrooms. This means that first-image reforms get a great deal of attention in the education reform space.
Many of these reformers want to remake or reconstitute the practitioner in some significant way, whether that’s freeing teachers from the “tyrannies” of direct instruction, making them “guides on the side” instead of “sages on the stage,” or making them sufficiently “culturally proficient” educators. Other reformers try to remove the human variable as much as the idea of schooling can allow, whether through scripted curricula, ed-tech, or novel accountability systems. What limits both, however, is the information environment in which they operate: it is so generally degraded or hard to parse that, eventually, some incentive or another will start to produce and reinforce bad practice.
Practitioners work inside institutions that shape what reaches them (second-image), and inside a system lacking quality control and clear signals (third-image). And even if, from the first-image perspective, these infrastructural problems were fixed, a further, deeper problem would remain: one of the dominant pedagogical paradigms created by first-image reforms has immunized itself from the kinds of evidence such an infrastructure would even produce.
Compare the quality of the tools, certifications, and epistemic communities that your average practicing doctor has at their disposal with the tools available to a curriculum director at a run-of-the-mill public, private, or charter school. Doctors have meaningful board certification tests and standards; the FDA communicates the evidence base via product labels, reports, and guidelines; doctors have to carry malpractice liability insurance; and medical schools are a real grind. A curriculum director, in contrast, has a handful of disconnected curriculum reviewers, like EdReports; they get bombarded with vendor marketing materials, without any common language for efficacy, review standards, and so on; and the networking, conferencing, and professional development space is almost entirely unpoliced, and frequently dominated by education school celebrities.
First-image reforms run up against four different limits, which I’ll discuss in order. These are:
The evidence doesn’t reliably reach the practitioners,
What reaches them is often wrong,
When the right information exists locally, institutional dysfunction and gatekeeping block it, and
One of the most popular pedagogical orthodoxies actively degrades the information environment and immunizes itself from evidentiary critique
1. The evidence doesn’t reliably reach the practitioners
There is, despite everything, a robust and growing evidence base for how learning happens. (The best introduction to this work is, unsurprisingly, called How Learning Happens.) The evidence tells us that methods like retrieval practice, spaced repetition, and direct instruction reliably produce greater learning gains than the inquiry- or discovery-based alternatives. The problem is less that this evidence is missing, and more that a healthy pipeline from research to practitioner was never fully established.
The What Works Clearinghouse (WWC), housed in the Institute of Education Sciences, was one part of this pipeline that had been successfully built, though it never became what it truly needed to be. The WWC was responsible for producing the kind of independent evidence base which, in a system with functioning quality-control architecture, would be a constant point of contact for curriculum directors, teachers, and perhaps even the education schools. But what was built was not enough to pierce the ecosystem that, it turns out, largely ignored it. And now DOGE has been let loose on IES, although some in the administration are apparently realizing that the Education Department’s research and statistics wing might not be a wise part of the government to cut.
I think the WWC and IES needed a lot more resources, a lot more institutional teeth, and a much bigger PR campaign attached to them if they were ever going to do the things people expected them to have done by now. The main reason is a problem that lives upstream: the schools that credential the teachers are not doing a great job. An investigation by the National Council on Teacher Quality found that only about a quarter of the 700 teacher-prep programs taught all five components of evidence-based reading instruction. The general information environment for practitioners is fragmented, and as a result a lot of the best evidence never makes its way into the right hands.
2. What reaches them is often wrong
The absence of information is bad enough for practitioners, but what’s worse is that the information that does end up making its way through the pipeline is frequently bad, or unhelpful, or theoretically unsound (and potentially difficult to falsify). This discussion will be brief because we’ll return to it in the next two images.
Two recent news items in the education world show the problem well enough. First, consider EdReports, a quasi-independent curriculum review organization which, for years, has positioned itself as a kind of FDA for the K–12 curriculum market. The problem, as APM reported last year, is that EdReports’s ratings do not consistently track the actual evidence on how students best learn how to read. Other reviewers besides EdReports also suffer from key limitations or flaws. Karen Vaites calls the curriculum review landscape “frankly bananas,” which (besides being an awesome way to put it) is surely among the more polite things a New Yorker can say about the field.
State education agencies, another party you’d expect to be keeping a watchful eye over the first image, frequently embarrass themselves in ways that EdReports’s fiercest competitors could only dream of. The New York State Education Department (NYSED), for example, released a work of numero-arithmetic fiction as its official guidance for how teachers and schools in the state should practice and think about math instruction. They’ve been ignoring a devastating petition for a while now; I briefly covered the situation last year, and my colleague, David, recently painted an even richer picture of the dysfunction that you can read right below.
The upshot, though, is that the state endorsed a bunch of imaginary beliefs about student learning (that students don’t have different levels of mathematical talent [seriously]), math instruction (timed repetitive practice being maligned as a narrow tool for children who struggle), and assessment (that timed exams and quizzes create math anxiety, rather than merely revealing a lack of prior math instruction). Maybe you don’t think any teachers really take state materials like this seriously, and so this doesn’t matter. But perhaps it should.
3. Dysfunction and gatekeeping block the right information when it exists
Sometimes the information is there for first-image practitioners to use, but bizarre gatekeeping or boring old dysfunction makes it impossible to use that information well. First, I’ll share a story from my aunt, who taught for many years at a public elementary school in Kansas City (edited from a phone conversation):
When we first started out [in the mid-90’s], we could do our own thing. We knew what our kids needed, and we could make it more interesting. We had basals, but we could adapt them, until they got rid of those and it was kind of like the Wild West for a couple years. None of us knew how to make our own curriculum, obviously.
But then with No Child Left Behind, they started making every teacher read from the same manual and rigidly follow all these state standards, and that pretty quickly abandoned phonics. All of those new reading techniques or wacky exercises were of course a disaster, but we weren’t allowed to deviate from them. So I would put phonics in our morning work, even if it was videos, or a brief overview of a word I knew they were about to encounter and wouldn’t be able to just guess.
But what the school was telling all of us teachers about phonics was “Don’t do drills — that puts pressure on the kids.” If everyone gets an award, but no one can read, then what’s the point of that?
They would also judge us on any given day about the state standards we were using that day — they had to be written on the board — and whether the students could recite the standards of the day. And so the principal or somebody from the central office would come in, inevitably pick on the worst-behaving kid to recite the standard, and that was supposed to mean that the students knew what was going on.
I’ll never forget there was a third grader I tutored who could barely read his own name — and the school kept asking me, “Why isn’t he getting on track? He has to pass the standards so he can go on to fourth grade,” and that always baffled me. He literally struggled to read his own name, and his parents were no help at all. Nothing can make up for parents not reading to their kids, or just sticking them on an iPad.
Toward the end of my time teaching we were also commanded to do iReady every day, and you’re seeing them getting sued now for sharing student data or whatever. But it was a totally messed up way to teach them the whole time. Tests 3 times a year, if kids weren’t moving up it was on the teacher, and I would see kids focus on the games that would pop up.
It’s just people that aren’t in the trenches who are making these policies. And it’s been a disaster.
There are probably countless stories like my aunt’s, and likely many more where teachers were not able to go around their administrators, or principals, or curriculum directors.
But sometimes the right information gets blocked because of some complicated nexus of systemwide dysfunction. Last year we published an article called “The Algebra Gatekeepers,” which describes how tens of thousands of high-scoring students in North Carolina were denied access to advanced math courses which, by their objective metrics, they were prepared for. Several forces came together to create this dysfunction. It turns out that the results of the tests weren’t readily available to teachers or parents, and that teacher recommendations prevented students from advancing, but without the reporting requirements to learn exactly why.
Many high-scoring students, for example, could have attendance or behavioral problems that make the environment in an advanced math course inappropriate for them. But many more were also being denied just because teachers felt like the class wouldn’t be the right “fit” for them. Incredibly, when the state passed a law to get the most proficient students into advanced math, the state board of education muddied the measurement system to get in the way.
4. Self-immunizing pedagogical theories resist evidence
The first three limits describe problems that the right infrastructure could, in principle, fix. What such an infrastructure couldn’t fix, though, is a situation in which one of the dominant pedagogical theories on offer — in academia and in the curriculum landscape — actively degrades the information environment and immunizes itself from evidence-based critique.
This family of constructivist pedagogical theories has been known by many names, but today it runs rampant in discovery learning, guided discovery learning, inquiry-based learning, culturally proficient pedagogy, and many other academic and curricular movements. Constructivists can be traced at least back to Dewey in America — one of the ur-theorists of the educational romantics — but today their standard-bearers are academics like Jo Boaler, Lucy Calkins, and Deborah Ball. These are the academics that, in order, inspired San Francisco to abolish algebra in middle school, convinced everyone that balanced literacy produced literate students, and drafted pseudoscientific guidance on math instruction for the state of New York.
I know the above seems to imply that all constructivists think alike, and more broadly that constructivist pedagogies are all kinds of pseudoscientific shams. I should emphasize that both of these claims are false. Some constructivist methods likely work great for some students, depending on their specific abilities, motivations, and social contexts. The same is certainly true for some programs, theories, or curricula that aim to produce more culturally proficient educators, or that claim to produce certain qualitative learning outcomes — like engagement, motivation, or conceptual understanding — more than their alternatives.
But right now, all of that exists more in the realm of folk wisdom, and less in the realm of science and evidence. Part of the problem is that so much constructivist pedagogy attempts to perform a kind of double-reconstitution of the educational relationship. First, it reconstitutes the teacher: no longer a transmitter of content, but a facilitator of the student’s own construction of the knowledge. The evidence that this specific approach produces worse outcomes than explicit instruction has been, in my opinion, basically settled for over 20 years now. But the second reconstitution is where the dynamic gets even more intense: if knowledge is constructed by the learner, then who the learner is becomes, epistemically, the entire center of attention. The student’s culture, identity, and “lived experience” stop being background conditions and become part of what instruction is for, and in strong versions, they can become much of what instruction is about.
The double-reconstitution’s final act, in a logical sense, functions as a kind of self-inoculation against rigorous evidence. If the classroom is a site of identity constitution as much as it is instruction, then asking whether it is “working” in the ordinary sense not only looks confused, but frequently gets treated with moral suspicion. The demand for evidence isn’t refused on methodological grounds, and is instead positioned as hostile to the “true” task of education itself! And this is at least one part of the story of how constructivism exists today in the educational landscape.
The other part of this story, though, is the role institutions have played in it. Captured institutions are one reason why evidence-based practitioner reformers like Zach Groshell and Doug Lemov don’t control more of the teacher training pipeline. They are effectively swimming upstream against the tide of all the prestige academies, flashy curriculum providers, and lagging state standards.
But first a slight detour back to Waltz, to help explain how the second image is different.
Image 2: Institutions and their Vetoes
Second-image theories in international relations locate the causes of war not in individuals, but in the internal structures of states and the way they arrange themselves. If all states became liberal democracies, according to some theorists, war would end — this is what’s known as the democratic peace thesis. Another second-image view says if states became economically interdependent, the costs of war would become prohibitive — what we can call “commercial liberalism.” Yet another says that if states federated and pooled sovereignty, the structural incentives for conflict would dissolve over time — this is Kant’s cosmopolitanism (the subject of my master’s thesis, if you’re interested). The key conviction across all these variants is that institutional arrangements and compositions are the skeleton key to international peace.
Second-image education reformers also believe the problem (at least mostly) has to do with institutional structure. There’s a left- and right-wing version of this conviction, and from the perspective of the second image, they’re mirror images of each other. The left’s conviction is to concentrate power and resources at the implementation link (in unions, districts, credentialing institutions, struggling schools), focusing on things like labor conditions, funding, classroom sizes, and as we just discussed, romantic-constructivist pedagogical theories brewed inside the education schools. The right thinks the solution is to concentrate authority at the selection link: things like school/parental choice, charters, vouchers, and competitive or performance-based pay schemes for teachers, school officials, and school funding.
Both of these theories of education reform shift incentives, resources, and educational authority to particular actors that each thinks will tend to make the right kinds of decisions — if only they had enough autonomy! But both are doomed to fail for the same Waltzian reason. In a system with no quality-control infrastructure above the rearranged institutions, and no communicable standards available to those within or beneath them, any rearrangement merely becomes a temporary gain or loss, as institutions inevitably recreate old failures from new first principles. Doing so is hard to avoid because of three mechanisms endemic to this kind of system:
Supply chains lacking oversight. There aren’t meaningful mechanisms to verify the quality of a new curriculum, or the success of a particular pedagogical method — it reminds me of what advertising must have been like a century ago.
Veto points and institutional capture. The education system has an astounding number of veto points, whether that’s in the realm of regulation and oversight (federal government, state government, local boards, unions) or uncompetitive, parochial, and ideological research paradigms and theories (education schools). Any one of these veto points can stop a reform effort.
Metric gaming and signal degradation. This happens from the top down and the bottom up, i.e., in terms of the quality of the information flowing from the schools; or demanded from the schools by existing state oversight mechanisms; or advertised to the schools by researchers and curriculum providers. The rational response in such a disordered information environment is to degrade the signals, rather than address the systemic issues. This is the phenomenon of lowering cut scores, inflating grades, redefining proficiency, or pursuing abstract, confusing, and hard-to-measure results.
I’ll briefly discuss some cases where these mechanisms rear their heads.
Ed schools and the captured training pipeline
Part 2 will dive even deeper into the world of the education academy, but it’s important to mention it here, too. The academy is perhaps the hardest second-image institution to reform, and yet if there had to be one party responsible for the durability of the romantic-constructivist paradigms we discussed above, it’s the ed schools.
Their stickiness, in part, comes from the fact that this captured institution sits atop the credentialing link in the education pipeline, before any other institution (state, district, school) has a chance to act. The self-sealing loop, radically oversimplified, works like this:
Ed schools transmit the romantic-constructivist paradigm →
Credentialed teachers and administrators¹ staff districts, state education departments, curriculum providers, and education cultural engines →
State guidance and available curricula reflect paradigm assumptions (recall the NYSED math briefs!) →
Districts adopt aligned curricula →
When outcomes disappoint, everyone just moves back up two bullet points →
And so state guidance and popular curricula get rewritten by the same people, and we put another 50¢ in the pinball machines
This self-sealing loop ultimately persists because the credentialing institutions do not face meaningful external pressure to evolve into a mature, evidence-based profession.
I referenced that NCTQ study of the 700 teacher programs above to explain a limiting factor on first-image reforms. But the fact that only ~25% of the programs taught all five components of evidence-based reading instruction also evinces the capture and misdirection of the academic institutions as a whole. So does this study from 2020, which examined a flagship product from Columbia’s Teachers College, Units of Study, and found that it was not aligned with the best research on reading. This mismatch is reflected in the outcomes these graduate/credentialing programs generate in the trainees and the districts they teach in. Another NCTQ study, for example, found that budget-crunched districts frequently spend millions on master’s degree premiums that have no measurable impact on student outcomes (Brookings found similar results). Back in 2008 districts were spending ~$15 billion annually on these programs.
Expecting these institutions to reform themselves internally would be a mistake. To do so would require them to adopt a certain point of view — whether about evidence, or politics, or the point of education generally — that they preclude from the start by ideologically sequestering themselves. It starts with how different research paradigms get treated within the system. Many teachers even find that the academy’s obsession with equity crowds out topics more relevant to the profession, like classroom behavior or effective reading instruction. Scholars in other fields, like sociology, have already started sounding the alarm about how methodological stagnation can result from political and ideological echo-chambers. Such a reckoning for the education academy never seemed to stick, or really get going at all.
The ideological archipelago extends beyond the research, though, because education schools also produce tons of administrators and other kinds of quasi-public officials who control who enters the institutions and how they work on the inside. All of these forces combine to keep education in a state of immaturity compared to other fields. A mature field, as Douglas Carnine writes, is built on five pillars that education currently lacks: a shared knowledge base, research-aligned preparation, licensure rooted in competence, accreditation with teeth, and accountability for quality of practice. These pillars are what separate a field like medicine from education, and they are part of the third-image architecture we will discuss more in the next section.
Two recent news items, though, perfectly illustrate what resisting outside reform pressures looks like in practice.
New York’s $10m reading training boondoggle
In 2024, Kathy Hochul signed “Back to Basics” into law, claiming that New York was “turning the page on how we teach students how to read.” After years of falling scores and amidst a growing national literacy crisis, $10 million was appropriated to redesign reading instruction and train around 20,000 teachers. This funding was given to New York State United Teachers (NYSUT) to run the training through their “Education and Learning Trust.”
But what NYSUT produced was a course filled with balanced literacy content. Multiple literacy experts have offered critiques, and literacy researcher Isabel Beck said that the course rendered her work “backward.” The Hechinger Report article goes over all the details, but it’s a story rife with second-image dysfunction: supply chains lacking oversight, odd veto points and captured institutions, and poor information signals about what trainings or curricula are actually doing what. It was particularly amusing to read the delayed response from NYSUT (excerpted from the Hechinger article):
NYSUT advocates for structured literacy and science of reading-aligned instruction and practices. We do not advocate for balanced literacy in our course… [The course lets educators have] deep discussions around the shift from balanced literacy and why that’s no longer evidence-based.
Forgive me if all that seems hard to believe when read alongside a review of New York practices like the one offered by ExcelinEd. New York has only adopted 2 of the 18 fundamental principles that ExcelinEd uses to evaluate early reading programs across the states.
NYSUT’s advocacy apparently needs to step up its game. And the response from their official is disappointing, but it’s totally unsurprising given that the NYSUT will face no meaningful political pressure from all of this. So why would their officials be responsive to pressure or community concern when it goes against their ideological priors?
San Francisco’s decade-long algebra detracking disaster
If you want the full story on San Francisco’s terrible experiment in detracking middle school math, I wrote an article you can read here that goes over the full timeline, and the recent vote that only partially restored access to algebra for the city’s eighth-graders.
The story is filled with second-image dysfunction. From the outset the effort was backed by Jo Boaler, who brought her snake oil up from Silicon Valley (Stanford), right when the city started to reckon with how badly it was educating many of its students who were poor, or Black, or Latino, or just learning English. Boaler offered the perfect product at the perfect time — a different, romantic-constructivist approach to teaching math — promising that it would eliminate the gaps and oppressive sorting that traditional math instruction, by its nature, had created. She had Stanford’s imprimatur, after all.
But at the end of the day the policy proposal was to eliminate the public middle school option for algebra. Obviously this, by definition, eliminated some “gaps” that reformers had been pointing to — gaps like “how many more of these kids take 9th grade algebra than those other kids” — because that’s what not allowing anyone to take it in middle school is going to do no matter what. So the reformers sought other ways to show that the policy was working, and in the process the third mechanism — signal degradation and metric gaming — played a key role in their attempt to whitewash the experiment halfway through.
They reported increasing enrollment in advanced classes, but it turns out the boost went away when one class was properly categorized as, well, not advanced. AP enrollment also declined over the period, and just recently might be back to where it was before. And of course student proficiency gaps increased, which is the only gap that really matters at the end of the day. Tom Loveless and Kelsey Piper have both written excellent articles on the research manipulations and borderline malpractice that characterized the effort. (My words, not theirs.)
Last month longtime Superintendent Su and the SFUSD board finally voted to “bring algebra back to middle school,” but the new plan shows how captured institutions and a lack of oversight were also key mechanisms in the story. It took a decade for the reform to be reversed despite overwhelming public support for middle school algebra, and the historic, successful recall of several board members in 2022 was about a whole lot more than just detracking middle school math. Because the school board was, ideologically, an ally of Jo Boaler, the last institution that could have policed the curriculum pipeline — the board — failed to do so. It’s also why the board and other entrenched groups are so inexplicably stubborn about running a normal math course track in the district. All the surrounding districts do it, but the new plan brings a normal track back to just two of the 21 middle schools in the district.
D.C.’s charter decline
With all these captured institutions, the other major second-image reform camp — the school choice movement — looks to move around the implementation barriers in education (poorly performing schools and districts, ideologically misaligned teachers unions) by introducing market dynamics and choice/exit options for families. The theory is that by changing the institutions at the selection level, you can bypass the deep-rooted implementation barriers.
Truthfully, I’m ambivalent about charters. I want states that have voted for charter programs to have the best-run charters they can, and I find the weaponization of alternative schooling systems by both the left and the right to be a form of irresponsible governance. But I also believe, in the words of a colleague, that often the only way to get governments to listen to people without financial resources is to give them the option to leave. So if implemented thoughtfully, things like education savings accounts, voucher programs, and charter expansion can meaningfully improve many poor families’ educational opportunities.
This piece is already getting too long, and I don’t want to give charter advocates short shrift, so I’ll have more to say in a future post. A couple things are worth mentioning, though. First, as a second-image solution, the school choice movement will eventually be confronted with problems arising in the other two images. How to address the teacher credentialing pipeline, or the curriculum production pipeline, or ideologically opposing or incompetent state education agencies? Teacher shortages are already a national problem — there are 3.8 million teachers, by the way — and so it’s not like there’s a big teacher store where you can go replace the one you got if it starts quoting John Dewey.
Different charter systems will eventually start to be run by different groups of people. And so who gatekeeps the new curriculum directors — or the curricular menus they choose from — when personnel start to shift and turn over? Matt Yglesias has an article on the collapse of KIPP schools in D.C. that’s worth a read, especially if you have the three-image framing from this article in mind.
The mechanisms that charters are meant to use to bypass dysfunction in the public school system existed in D.C., but the end result was a collapse in proficiency anyway.
Image 3: Chaos in the System
This third image operates a bit differently than the other two.
The first and second images identify what we call in law “proximate” or “efficient” causes. These are the kind of answers you would get if you asked “Why did you eat my sandwich?” and I answered “Because I didn’t know it was yours” (first-image), or “Because there wasn’t one for me” (second-image). The third image, in contrast, is a structural one, and so the kinds of causes it will identify are not proximate or efficient, but permissive causes. That is, it answers with reference to features of the system we occupy, identifying why wars (or sandwich thefts) are just the kinds of things that will happen given the lack of structural constraints on the different actors and their arrangements in the system.
“Because no one was there to stop me!”
In international relations, that structure is known as anarchy, or the absence of a sovereign authority above states. But Waltz’s insight isn’t quite as pessimistic as it might first seem. He doesn’t argue that nothing exists to push against those forces producing war, and so anarchy does not mean that wars have to be constant or omnipresent. It’s just that no amount of purely first- or second-image reforms will be sufficient to prevent them completely.
So what do I mean by “educational anarchy” in the American system? The diagnosis at the systems-level here is a bit bleak, at the front end: there’s just no equivalent institution to, say, an FDA or NTSB that has merely been defunded or broken. Road safety, drug efficacy, and consumer technology look worlds apart today from what they were when our grandparents were alive. Educational progress has not caught up.
THAT INSTITUTION has never existed. 14,000 school districts are making relatively independent decisions about what to teach, how to teach it, and what success or failure looks like. A shared quality control infrastructure that is legible across districts isn’t anywhere on the horizon. Enforcement mechanisms to police failed and faulty curricula aren’t set up, nor are usable and widely available feedback loops mediating outcomes and practices.
This is a kind of educational state of nature. And just as Waltz’s anarchy doesn’t mean a complete lack of order — there are alliances, norms, balances of power, treaties, etc. — educational anarchy doesn’t mean a complete lack of regulations and guidance. As I said above, there’s an ocean of regulations governing licensing, accreditation, accommodations, civil rights compliance, funding formulas, and reporting requirements. But strikingly little of it touches the quality of the educational products, or improves the methods themselves.
Unlike the anarchy of international relations, though, educational anarchy is not an inevitable structural condition. (A world government seems unlikely.) Rather, it is more of a policy gap that we have, in some sense, chosen not to fill, and one that entrenched actors in the system have prevented us from filling, to different degrees. (More on this coming in Part 2.)
Consider these “seven unexcused absences” from the educational ecosystem that would be unimaginable in the context of food and drug testing, or automobile and transportation safety. I don’t mean to claim that each of these is totally or completely absent — it’s just that functionally they are:
No independent evidence base.2 There is no institution charged with generating rigorous, unbiased evidence about what works in education, the equivalent of NIH-funded clinical trials that establish a treatment’s efficacy before products enter the regulatory pipeline. The Institute of Education Sciences (IES) is the closest approximation, but it has been gutted to roughly a couple dozen staff during DOGE’s tenure. Even before the DOGE disaster, though, the material that IES produced in its What Works Clearinghouse often failed to reach teachers and school or district officials, and did little to change the flawed research paradigms found in education schools nationwide.
No pre-market testing.3 There is no requirement that a curriculum be tested for efficacy before it reaches students. Karen Vaites’ framing — that there’s no FDA for education — is the pithiest description of the problem. Jo Boaler’s Fluency Without Fear went from her website, to California’s Math Framework, and then into classrooms nationwide without an efficacy trial ever getting in the way. Even under Common Core, some of the top K–2 literacy programs in use diverged from evidence-based practices.
No standards or gatekeeping.4 There is no nationally determined efficacy threshold any given curriculum has to clear, and no institution with the authority and legitimacy to approve or reject it. A Johns Hopkins report from 2019 found that only 17 states exercised formal authority over curriculum decisions, and even 14 of those 17 ended up approving more weak curricula than strong. Common Core was the closest attempt, nationally speaking, but it was stunted politically and flawed in its execution. Institutions that arose to fill in these gaps, like EdReports, never had regulatory authority, and they seem to be having their own issues with evidence at the moment.
No post-market surveillance.5 Once a curriculum is adopted, it’s incredibly difficult to systematically track whether it’s really working, or where. Hardly half of the states publicly share any data on which curricula districts adopt, and that’s not to speak of the gap between what is formally adopted and what teachers actually end up using in the classroom.
No meaningful feedback loops.6 What happens when a curriculum fails? Not much, besides hopefully being replaced by the districts dealing with the fallout. The informational chain from research → curriculum → classroom → student outcomes is, for the most part, rather unidirectional. Moreover, parents are increasingly receiving report cards that don’t match test scores, and research shows parents put more weight on grades than on standardized tests. Even when districts offer their own numbers, they’re frequently incorrect, distorted, and unaccountable to any regulatory body — like when SFUSD removed middle school algebra and for years tried to pad the statistics to show that it was working.
No recall mechanism.7 Even when an instructional approach or a curriculum is contradicted by the evidence, there are strikingly few mechanisms or pathways for the curriculum to be easily removed. Even though it was contradicted by decades of cognitive science, balanced literacy took decades and a podcast to start being dislodged from American schools. And even though Columbia’s Teachers College shut down Lucy Calkins’ operation, she just relaunched under a different name: “Mossflower.” (Odd.)
No professional licensing tied to knowledge.8 Education schools routinely teach methods contradicted by the best available evidence. For example, a 2023 study from the National Council on Teacher Quality (NCTQ) found that only a quarter of the 700 teacher prep programs they investigated taught all five components of evidence-based reading instruction (phonemic awareness, phonics, fluency, vocabulary, and comprehension). A teacher can be licensed, fully credentialed, and trained in methods that, essentially, don’t really work. It’s sort of like if chiropractic made up a big chunk of what medical schools taught.
Taken together, these seven absences characterize a system that is, in the Waltzian sense, anarchic. Not because it’s completely lawless or without rules, but because it lacks any common standard for progress. These seven absences are, collectively, seven different dimensions of the system’s illegibility. And because of this, critically, there can be no overarching authority capable of making sure that what makes its way into classrooms is backed by evidence and standards.
No Child Left Behind, for all the criticism that it has gotten over the past few decades, did diagnose the correct kind of problem. That is, someone needs to actually be checking whether students are actually learning, and in this sense standards and accountability measures deserve more credit than they typically get. Under NCLB, NAEP scores did improve, and for a time the achievement gap started to shrink, though by 2010 that trend started to wither.
And yet NCLB failed, and the way it failed is this essay’s whole argument in miniature. The accountability pressure came from above — the third image — while the practitioner environment (first-image), the institutional incentives (second-image), and the informational environment (a three-image problem) were never aligned or reformed to support it. In the first image, teachers resented NCLB because they were being evaluated on metrics that were either bad or that they hadn’t been trained to produce, and parents and students resented it for a mix of good and bad reasons, resulting in an unmanageable political pressure bubbling underneath the entire reform effort.
In the second image, education schools and curriculum providers continued to push unworkable theories and bad products, and it seems like no coincidence that “balanced literacy” and constructivist pedagogy reached new modes of dominance around that same time. Finally, political power brokers like the national teachers unions also organized against NCLB, and schools frequently responded to this environment by gaming metrics and student outcome reports. Given the limits of the law as it was written and the education environment it was thrown into, I’m not sure there was any version of No Child Left Behind that would have been politically feasible or epistemically possible.
Karen Vaites (once again) puts it best:
An accountability theory of change assumes that schools know what to do to raise outcomes, and if we just put the right carrots or sticks in place, they will do it. I don’t actually believe that’s the case, writ large. We’re in an education ecosystem where educators receive (and often believe) loads of misguided signals about what works to improve outcomes.
Put another way: If states implemented new, tougher accountability schema today, schools would be just as (or more) likely to embrace the newest faddish tech-enabled solutions (“just like iReady, untested for efficacy, but now AI-enabled so it’ll work this time!”) as they would to embrace better curricula, like those in Louisiana and Tennessee.
And so after a decade of backlash from all sorts of stakeholders, in 2015 Congress relented and replaced NCLB with the Every Student Succeeds Act, handing accountability back to the 50 states. Most of them went on to loosen their standards, which is reflected in the growing disconnect between NAEP scores and state proficiency reports. For now, the national system has given up on meaningful third-image reform. Beyond the Southern Literacy Surge — which is not just one uniform story, but a family resemblance of ed reforms that seem to be playing out promisingly, more or less — a couple other states, like Virginia, have also recently experimented with third-image reforms. But education policy is more and more the focus of our increasingly polarized and partisan politics, and so now after new elections, many such reforms will hang in the balance.
The Southern Literacy Surge is a promising but fragile exception to what I’ve just been discussing. As I mentioned above, each state that’s a part of this reform — Mississippi, Louisiana, Tennessee, and Alabama — is doing things a bit differently, and so will see varying degrees and durations of success. Louisiana’s model, however, is the most ambitious. It spent less on reading reform than New York did, and got dramatically better results. The reason isn’t some magic dust in the Louisiana soil, but rather that it built state-level architecture that touched all three images simultaneously: a state curriculum review process with real teeth (third-image gatekeeping), aligned practitioner training (first-image), and accountability mechanisms with statewide buy-in and institutional levers (third-image).
While the limits are its state borders, Louisiana’s reform is the most ambitious three-image solution existing today. But without being institutionalized above the state level, it too can be reversed. And so can bad ideas spread within, if the right curriculum provider or education academic manages to convince state officials that their goals are aligned.
The Paradox
THE MAJOR education reform movements of the modern era can be understood as largely single-image fixes applied to a three-image problem. Teacher PD, science of reading mandates, and evidence-based instruction are all first-image fixes that are vulnerable to capture from the second-image institutions. School choice, charter expansion, or even fully democratizing the entire teacher workforce and school system into some federated united workers’ republics — these solutions don’t do anything about the first-image information void, or the second-image academic capture, or the lack of third-image architecture to ensure that, more or less, everyone is paddling up the right kind of creek. Can I make a stupid Bird Box reference? It’s like Bird Box!
The result is that education is simultaneously one of the most regulated yet least quality-controlled systems in American life. Deluges of compliance procedures and paperwork govern everything but the quality of the educational practices and materials. The compliance infrastructure, moreover, creates the illusion that someone is checking, or that progress is being made. But we’ve started going backwards faster than ever.
The costs of this backsliding haven’t been evenly distributed. As I’ve written about before, families with resources are the ones that can escape dysfunction in the public schools. (And I haven’t even mentioned any of the bad policies shaping disability accommodations, school discipline, and safety, which are the other key drivers of the flight.) Private schools, tutors, real estate in better districts, summer programs, limited vouchers, and an opaque college application process all work in favor of families with resources.
I cannot emphasize this point strongly enough. It is happening nearly everywhere. Forty-one states are experiencing enrollment declines, and 21 are going to see a 5% drop by 2030. Every major city will be affected. It’s happening in New York, Boston, Seattle, D.C., San Francisco, Portland, Detroit, Baltimore, Los Angeles, Chicago, and countless other smaller cities. A slightly different version of this story is playing out in places like Houston, Indianapolis, Salt Lake City, Cleveland, St. Louis, Baton Rouge, Montgomery, Little Rock, Columbus, and San Antonio, as families choose charters, private vouchers, and homeschooling as alternatives for their kids. I wish the trend were merely a result of declining birth rates, but that’s just another structural dynamic that’s going to start squeezing schools more and more.
If we keep ignoring the problem, then the policies that most affect families will increasingly be influenced by childless professionals with law degrees, like me. (Which is a bad thing.) What that means is that local and state governments will become more beholden to the kinds of people that don’t have skin in the game. What’s at stake, then, is the future political responsiveness of some of our most important local institutions.
This situation is completely unsustainable for a public education system, and more broadly it cannot sustain a literate, civic society. But the inequalities produced by education’s auto-liquidation also reach down to the instructional level. In the classrooms, two types of students will be able to navigate those recycled romantic education theories. Sometimes they’re the students whose parents are attentive enough to look beyond their kid’s 5th-grade report card, figure out that they can’t actually read, and then actually do something productive with the rage that I’m certain immediately takes hold.
The other type of student that succeeds in a learning environment that centers an adult’s idea of a child’s ego is, well, the student who likely should have some sort of academic ego. They’re the ones who likely have academically gifted parents, who are probably surrounded by books, and who can most reliably learn to read or multiply on their own. So who do we think is left falling through the cracks?
Tragically, we have allowed the reformers most responsible for the situation to parade the word “equity” around as if they’re the only ones that care about it. But it’s clear they hardly have any useful sense of the word, let alone what it should entail. Because if they did, they would realize that universal literacy and number fluency are absolutely essential building blocks for any society of equals, for any society that is fair, just, or free.
In Part 2…
If the institutional architecture for quality control doesn’t exist, then what fills the void? The answer is that three shadow powers take up the negative space: prestige, patronage, and politics. Reputations in the education academy, marketplace, and state ecosystems frequently substitute for evidence; funding relationships between the federal government, states, districts, academia, curriculum providers, taxpayers, and schools stand in for accountability mechanisms; and partisan politics and political leverage substitute for meaningful quality control. Together, these forces determine which curricula reach classrooms, which reforms survive implementation, and which voices end up shaping policy.
The parallel for next time, it turns out, is not to some story of modern regulatory failure. Rather, it’s to something older, a world before the institutions we take for granted existed at all.
Related Articles
See REIMAGINING THE INSTITUTE OF EDUCATION SCIENCES; U.S. Department of Education shares vision for federal research after DOGE cuts - Chalkbeat; The Federal Government Hasn’t Been Meeting Our Need for Unbiased Ed. Research (Opinion); A future for IES?; ‘Back to the Dark Ages’: Education Research Staggered by Trump Cuts – The 74; and A more expansive approach to studying what works in education | Brookings
See Grade Inflation Nation - by Joshua Dwyer; Parents trust report cards more than test scores — with consequences for kids; Nearly 60% of grades don’t match student test scores | K-12 Dive; Interpreting Performance: Evidence on Signal Weighting in Human Capital Investment | Becker Friedman Institute; Many Parents Value Grades Over Test Scores, Missing Signals to Intervene – The 74; and Can states do anything about grade inflation?
See Achievethecore.org :: Comparing Reading Research to Program Design: An Examination of Teachers College Units of Study; Why Education Experts Resist Effective Practices; What Happened When States Dropped Teacher Licensing Requirements?; Teach reading, not guessing: Connecting what teachers learn to what students need; A Commentary on the Misalignment of Teacher Education and the Need for Classroom Behavior Management Skills - PMC; and, perhaps most importantly, How Ed Schools Became a Menace













