The Cheapest Margin on the Board: AI and Voter-File Hygiene
The highest-leverage thing AI does for a turnout program isn't modeling who's persuadable — it's cleaning the list before you spend a dollar against it. The margin math on a dirty voter file.
Everybody buys the same voter file. Almost nobody fixes it. That’s the cheapest margin on the board.
Two campaigns in the same district buy the exact same voter file from the exact same vendor on the same Tuesday. By Election Day one of them has wasted fifteen percent of its mail budget and the other hasn’t. The file didn’t change. What changed is that one of them treated the file as a finished product and the other treated it as raw material.
This is the part of a turnout program nobody writes about. The conversation about AI in campaigns is all happening at the glamorous end — synthetic electorates, message testing, the poll that never sleeps. Meanwhile the single highest-leverage thing AI does for a turnout program is the least glamorous task in all of politics: cleaning the list. Not modeling who’s persuadable. Just making sure the name, the address, and the status on each record are true before you spend a dollar acting on them.
Here’s why that matters more than it sounds. A mail-and-chase program is a machine for spending a fixed budget against a list of human beings, on a deadline that does not move. Every record in that list is an instruction: send this person a ballot application, mail this person a reminder, route a knock to this door, point a phone call at this number. If the record is wrong, the instruction is wrong, and the money is gone — not misallocated, gone. A mail piece to someone who moved two years ago doesn’t underperform. It does nothing. A chase call to a landline that was disconnected in 2023 isn’t low-yield. It’s zero-yield. And here’s the trap: none of it shows up as failure. It shows up as “we ran the program.” The waste is invisible because the dead records sit quietly in the universe looking exactly like the live ones.
Now the scale of the rot, with real numbers. Americans move constantly: about 25.9 million people changed residences in 2024, roughly 11.8 percent of the population, and that’s at a record low mover rate. At that rate, about one in nine of the people on your file moved in the past year alone, and a meaningful share moved across a precinct, county, or state line that breaks their registration. Layer death on top: this April, North Carolina alone identified roughly 34,000 deceased individuals still sitting on its voter rolls after a single database comparison. The list-maintenance machinery that’s supposed to catch this (NCOA runs, Social Security death matches, the ERIC cross-state checks used by some two dozen states) is real, but it is monthly, lagged, and uneven from jurisdiction to jurisdiction. The file you buy is always stale at the edges. The only question is whether you find the stale records before you mail them or after.
This is where AI changes the economics, and it has nothing to do with chatbots. The hard part of file hygiene was never knowing the rules — everyone knows a deceased flag matters. The hard part is resolution at scale: deciding whether “Robert J. Smith, 412 Oak” and “Bob Smith, 412 Oak St Apt 2” are one voter or two, whether a forwarding address is a real move or a snowbird’s winter rental, whether two records that share a household are duplicates or a father and son. That’s entity resolution, and it used to require either an expensive enterprise data vendor or a junior staffer burning a week on a spreadsheet doing it by eye. Language models are unusually good at exactly this fuzzy-matching, normalize-the-mess, judge-the-edge-case work. The voter-file industry knows it — the 2026 file refresh added a forward-looking “Moved” flag specifically to catch voters matched to a new registration in another state, and match-rate quality (how cleanly your file maps to real, reachable people) is now the thing serious data shops compete on. The leverage isn’t that AI finds voters you didn’t have. It’s that AI lets a small team scrub a universe down to the records that are actually true, in hours, for the price of compute instead of the price of a data contract.
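To make the resolution problem concrete, here is a minimal sketch of the pairwise decision described above. It is not how any vendor or any particular tool does it: it swaps in plain string similarity from Python's standard library (difflib) where a language model or a commercial matcher would sit, and every name in it is hypothetical. The nickname and street tables, the 0.8 threshold, and the birth_year field are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical lookup tables for illustration; real ones would be far larger.
NICKNAMES = {"bob": "robert", "rob": "robert", "bill": "william", "liz": "elizabeth"}
STREET_ABBREVIATIONS = {"street": "st", "avenue": "ave", "road": "rd", "drive": "dr"}
UNIT_WORDS = {"apt", "unit", "ste", "suite"}


def normalize_name(name):
    """Lowercase, strip punctuation, drop single-letter initials, expand nicknames."""
    tokens = re.sub(r"[^a-z\s]", " ", name.lower()).split()
    tokens = [t for t in tokens if len(t) > 1]
    if tokens:
        tokens[0] = NICKNAMES.get(tokens[0], tokens[0])
    return " ".join(tokens)


def normalize_address(address):
    """Lowercase, abbreviate street types, and drop the unit designator."""
    tokens = re.sub(r"[^a-z0-9\s]", " ", address.lower()).split()
    cleaned = []
    for token in tokens:
        if token in UNIT_WORDS:
            break  # ignore everything from the unit marker onward
        cleaned.append(STREET_ABBREVIATIONS.get(token, token))
    return " ".join(cleaned)


def similarity(a, b):
    """Character-level similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def classify_pair(rec_a, rec_b, match_threshold=0.8):
    """Label a pair of records; the threshold is an assumed, untuned value."""
    name_score = similarity(normalize_name(rec_a["name"]), normalize_name(rec_b["name"]))
    addr_score = similarity(
        normalize_address(rec_a["address"]), normalize_address(rec_b["address"])
    )
    # Assumption: two known, different birth years mean two people
    # (the father-and-son case), however similar the rest looks.
    years = (rec_a.get("birth_year"), rec_b.get("birth_year"))
    if None not in years and years[0] != years[1]:
        return "different people"
    if name_score >= match_threshold and addr_score >= match_threshold:
        return "likely duplicate"
    if name_score >= match_threshold or addr_score >= match_threshold:
        return "needs review"
    return "different people"


robert = {"name": "Robert J. Smith", "address": "412 Oak", "birth_year": 1961}
bob = {"name": "Bob Smith", "address": "412 Oak St Apt 2", "birth_year": 1961}
junior = {"name": "Robert Smith Jr", "address": "412 Oak St", "birth_year": 1988}

print(classify_pair(robert, bob))
print(classify_pair(robert, junior))
```

The design choice worth noticing is the middle bucket. A matcher that only answers yes or no hides its uncertainty; one that routes the ambiguous pairs to "needs review" tells the operator exactly where a human or a stronger model should spend attention.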
I’ve built both sides of this. Running Georgia as a state director for America PAC, we were responsible for a turnout operation that ultimately delivered 650,000 voters for President Trump. At that scale the file underneath the program isn’t a detail — it’s the program. Every percentage point of dead, moved, or duplicate records is mail printed for nobody and field hours pointed at empty houses, multiplied across hundreds of thousands of targets. You feel the cost of a dirty file in your gut long before it shows up in a report, and you learn to fix the file first, because everything downstream inherits its errors.
That conviction is why I built Campaign Compass — a piece of software I shipped myself, as a non-engineer, by writing the build logs in plain English and letting AI handle the code. A lot of what Compass does is exactly this unglamorous resolution work: take the raw file, find the duplicates, flag the movers, normalize the addresses, and hand the operator a universe they can actually trust before they spend against it. The point of telling you a non-engineer built it is the point of this whole essay: the file-hygiene work that used to be gated behind a vendor’s enterprise contract is now something a disciplined operator can do directly, because the AI does the part that used to require a specialist. The moat around clean data is draining, and it’s draining toward whoever bothers to walk through it.
So here’s the “so what” for anyone setting a 2026 turnout budget. Before you spend a dollar on a clever AI persuasion tool, ask what you’re spending on the file the whole program runs on. The order of operations matters: a brilliant chase strategy executed against a fifteen-percent-rotten universe is worse than a plain strategy executed against a clean one, because the dirty file taxes every single touch you make. Run the math on your own program. If even ten percent of your file is dead, moved, or duplicated — a conservative read given that one in nine voters moves every year — then a tenth of your mail spend, your chase hours, and your field walk is being burned on records that can never produce a vote. In a race decided by a few hundred ballots, that is not housekeeping. That is the margin. We beat a $450 million bond by 319 votes by being disciplined about exactly this kind of thing. The buying question is simple: before this tool helps me reach voters, does anything in my stack make sure the voters on my list are real? If the answer is no, that’s the first check you should write, and it’s the cheapest one on the board.
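If you want to run that math on your own program, the arithmetic fits in a few lines. This is a back-of-the-envelope sketch, not a model: the function names are made up for illustration, and the budget, universe size, and bad-record rates in the example are hypothetical placeholders to swap for your own numbers. It also assumes spend is spread evenly across records, which a real program only approximates.

```python
def wasted_spend(budget, bad_record_rate):
    """Dollars spent on records that can never produce a vote."""
    if not 0 <= bad_record_rate < 1:
        raise ValueError("bad_record_rate must be in [0, 1)")
    return budget * bad_record_rate


def cost_per_live_record(budget, universe_size, bad_record_rate):
    """What each real, reachable voter actually costs once dead records are excluded."""
    if not 0 <= bad_record_rate < 1:
        raise ValueError("bad_record_rate must be in [0, 1)")
    live_records = universe_size * (1 - bad_record_rate)
    return budget / live_records


def recovered_by_cleaning(budget, rate_before, rate_after):
    """Dollars freed up when a scrub lowers the bad-record rate."""
    return wasted_spend(budget, rate_before) - wasted_spend(budget, rate_after)


# Hypothetical program: $200,000 of mail against a 100,000-record universe.
budget = 200_000
universe = 100_000

print(f"Wasted at 10% bad records: ${wasted_spend(budget, 0.10):,.0f}")
print(f"Nominal cost per record:   ${budget / universe:.2f}")
print(f"Cost per live record:      ${cost_per_live_record(budget, universe, 0.10):.2f}")
print(f"Recovered by 10% -> 2%:    ${recovered_by_cleaning(budget, 0.10, 0.02):,.0f}")
```

The number to watch is the gap between the nominal cost per record and the cost per live record. That gap is the tax the dirty file puts on every touch, and it is the budget a scrub has to beat to pay for itself.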
Winning on the Margins is where I write about the parts of campaigns nobody else does — mail ballot, ballot chase, the voter file, and field. If you build or fund turnout programs, subscribe here: https://margins.catoconsultinggroup.com






