
Oct 13, 2025 7:00 AM
Programming in Assembly Is Brutal, Beautiful, and Maybe Even a Path to Better AI
RollerCoaster Tycoon wasn’t the most fashionable computer game out there in 1999. But if you took a look beneath the pixels—the rickety rides, the crowds of hungry, thirsty, barfing people (and the janitors mopping in their wake)—deep down at the level of the code, you saw craftsmanship so obsessive that it bordered on insane. Chris Sawyer, the game’s sole developer, wrote the whole thing in assembly.
Certain programming languages, like Python or Go or C++, are called “high-level” because they work sort of like human language, written in commands and idioms that might fit in at a poetry slam. Generally speaking, a piece of software like a compiler transforms this into what the machine really reads: blocks of 1s and 0s (or maybe hex) that tell actual transistors how to behave. Assembly, the lowest of the “low-level” languages, has a near one-to-one correspondence with the machine’s native tongue. It’s coding straight to the metal. To build a complex computer game from assembly is like weaving a tapestry from shed cat fur.
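To make the gap concrete, here is an illustration of my own, not Sawyer’s code: one high-level statement in C, with the rough x86 a compiler might emit for it tucked into a comment. Real output varies with the compiler and its optimization settings.

```c
/* A hedged illustration of the layers: one high-level statement, and the
 * kind of x86 a compiler might produce for it. The names are invented;
 * actual output differs by compiler and optimization level. */
int guests = 120;
int price = 5;

void adjust_price(void) {
    if (guests > 100)
        price = price + 1;
    /* A compiler might turn that into something like:
     *     cmp  dword ptr [guests], 100   ; compare guests with 100
     *     jle  skip                      ; if not greater, jump past the raise
     *     add  dword ptr [price], 1      ; bump the price
     * skip:
     */
}
```

In assembly, you write the lines in that comment yourself, one machine instruction at a time.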
Why would anyone do this? I recently asked Sawyer, who lives in his native Scotland. He told me that efficiency was one reason. In the 1990s, the tools for high-level programming weren’t all there. Compilers were terribly slow. Debuggers sucked. Sawyer could avoid them by doing his own thing in x86 assembly, the lingua franca of Intel chips.
We both knew that wasn’t the real reason, though. The real reason was love. Before turning to roller coasters, Sawyer had written another game in assembly, Transport Tycoon. It puts players in charge of a city’s roads, rail stations, runways, and ports. I imagined Sawyer as a model-train hobbyist—laying each stretch of track, hand-sewing artificial turf, each detail a choice and a chore. To move these carefully crafted pixels from bitmaps to display, Sawyer had to coax out the chip’s full potential. “RollerCoaster Tycoon only came about because I was familiar with the limits of what was possible,” he told me.
Working within the limits? A foreign idea, perhaps, in this age of digital abundance, when calling a single function in an AI training algorithm can engage a million GPUs. With assembly, you get one thing and one thing only, and it is the thing you ask for—even, as many a coder has learned the hard way, if it is wrong. Assembly is brutal and beautiful that way. It requires you to say exactly what you mean.
I’ve done assembly’s creators a disservice. They wanted things to be easier, not harder. I imagine they were tired of loading up punchcards and flipping switches on their steampunk leviathans. Perhaps they dreamed of a world like ours, where computers can do so much with such minimal guidance.
The first assembly language, created in the 1940s by Kathleen Booth (though she has not always gotten her due, surprise surprise), hardly resembled language. Codes stood in for codes. To tell the machine to perform an operation—say, “0,0111” in machine code—you’d instead employ a series of letters and symbols, which a new piece of software, called an assembler, would translate into binary. Soon, the commands got human-friendlier mnemonics like “MOV.”
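Here, purely for illustration, is what that translation looks like on a modern Intel chip rather than Booth’s machine: the mnemonics a human types, and the bytes the assembler emits in their place.

```c
#include <stdio.h>

/* A small sketch (real x86 encodings, but a toy program): the assembler's
 * whole job is turning the mnemonics in the comments into the bytes in
 * the array. */
unsigned char program[] = {
    0xB8, 0x05, 0x00, 0x00, 0x00,   /* mov eax, 5 */
    0x83, 0xC0, 0x02,               /* add eax, 2 */
    0xC3                            /* ret        */
};

int main(void) {
    printf("%zu bytes of machine code\n", sizeof program);
    return 0;
}
```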
To know assembly was to know the CPU itself—what it could do and, even more, what it couldn’t. A chip’s physical design, how the circuits connecting its AND and XOR logic gates are actually laid out, defines how it works. Its functions are pretty basic, breaking each instruction down into elementary steps: Fetch something from memory and put it in a temporary cubby, known as a register. Decode the instruction. Perform some operation, like comparing two values or adding them. Ship the result back off to memory.
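Here is a toy sketch of that cycle, with an instruction set invented for the occasion; real chips differ in nearly every detail, but the rhythm of fetch, decode, execute, and write back is the same.

```c
#include <stdio.h>

/* A toy fetch-decode-execute loop. The opcodes are made up for
 * illustration; this is not any real chip's instruction set. */
enum { LOAD = 0x01, ADD = 0x02, STORE = 0x03, HALT = 0xFF };

int main(void) {
    unsigned char memory[16] = {
        LOAD, 14,      /* fetch memory[14] into the register */
        ADD, 15,       /* add memory[15] to it               */
        STORE, 14,     /* ship the result back to memory[14] */
        HALT,
        0, 0, 0, 0, 0, 0, 0,
        5, 7           /* data: memory[14] = 5, memory[15] = 7 */
    };
    unsigned char reg = 0;   /* the temporary cubby */
    int pc = 0;              /* program counter     */

    for (;;) {
        unsigned char op = memory[pc++];        /* fetch  */
        if (op == HALT) break;
        unsigned char addr = memory[pc++];
        switch (op) {                           /* decode */
        case LOAD:  reg = memory[addr];         break;   /* execute */
        case ADD:   reg = reg + memory[addr];   break;
        case STORE: memory[addr] = reg;         break;
        }
    }
    printf("memory[14] = %d\n", memory[14]);    /* prints 12 */
    return 0;
}
```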
As chips advanced, new dialects of assembly evolved. The code that landed the first human on the moon was assembly—written for a single machine, the Apollo Guidance Computer. If you want to read the leaked source code of the Furby, you’ll need fluency in 6502. To hack your TI-83 calculator, you’ll need Z80. Learning the language of one chip—say, Intel’s x86—and then moving to Arm is like studying Arabic in Beirut and then trying to get by in Tunis or Khartoum. Good luck.
I learned x86 assembly in college as a refugee from math. Where my classmates seemed to enjoy the drab incantations of Java, I loved the logic game that was assembly. It was easy to fail, but to fail in ways that were explainable if you looked at the circuits and registers. How masterful I felt coding in the simple commands of this not-quite-language; how fragile I knew that mastery to be. To say, put these bytes there—no, there, at that register, in those capacitors. Remember this. Forget that. To grind away, painting each figurine, one by one.
It’s true that there’s no longer much point in using assembly in the day-to-day work of coding. High-level languages are so efficient that their abstraction is almost always preferable. Even assembly’s inventor moved on to other ventures; one of Booth’s final papers, in the 1990s, used neural networks to match seals with their barks. Sawyer switched over too. He’s been dabbling in home automation recently—lights, temperature sensors, sound systems, and the like, coded on Raspberry Pis using Python, which he initially found “quite off-putting,” he told me. But even on that tiny processor, it gets the job done just fine.
Then along comes something like DeepSeek to remind us that humans can still find better ways to talk to our hardware. Earlier this year, the Chinese company behind those incredibly efficient AI models upended the narrative that AI advancement can come only from more chips and more energy. Assembly was one surprising reason. DeepSeek’s engineers reached into the subfloor of Nvidia’s chips, instructing the hardware directly to compress data from 32 bits to 8 bits—sacrificing precision for efficiency—at precisely the right moments. Observers were stunned. You could do that? The DeepSeek engineers had tapped an art most others had forgotten.
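The basic trade looks something like this generic sketch, which squeezes 32-bit floats into 8-bit integers by scaling and rounding; it is only an illustration of the precision-for-efficiency bargain, not DeepSeek’s actual FP8 format or kernel code.

```c
#include <math.h>
#include <stdint.h>
#include <stdio.h>

/* Generic quantization sketch: every 32-bit float is rounded to the
 * nearest multiple of `scale` and stored in a single signed byte.
 * Precision is lost; memory and bandwidth are saved. */
int8_t quantize(float x, float scale) {
    float q = roundf(x / scale);
    if (q > 127.0f)  q = 127.0f;     /* clamp to what 8 bits can hold */
    if (q < -128.0f) q = -128.0f;
    return (int8_t)q;
}

int main(void) {
    float scale = 0.05f;             /* chosen to cover the data's range */
    float x = 1.2345678f;
    int8_t q = quantize(x, scale);
    printf("%f -> %d -> %f\n", x, q, q * scale);
    return 0;
}
```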
I was similarly taken when, in 2023, researchers at DeepMind taught a machine x86 assembly, then asked it to improve on the long-standing sorting routines in the C++ standard library. The AI made strange, unintuitive choices, shuffling values between registers in odd ways, and in the end cut precisely one instruction. A fraction of a millisecond saved, perhaps. But it happens countless times a day, now that the new routine has been officially adopted.
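For a sense of the scale involved, here is an ordinary three-element sorting network in C, the kind of short, fixed sequence of compares and swaps where shaving off a single instruction is even possible. It is a textbook version for illustration, not the model’s actual output.

```c
#include <stdio.h>

/* A plain three-element sorting network: three compare-and-swaps, no
 * loops. Routines this small are where a one-instruction saving
 * shows up. */
static void sort3(int *a, int *b, int *c) {
    int t;
    if (*a > *b) { t = *a; *a = *b; *b = t; }
    if (*b > *c) { t = *b; *b = *c; *c = t; }
    if (*a > *b) { t = *a; *a = *b; *b = t; }
}

int main(void) {
    int x = 3, y = 1, z = 2;
    sort3(&x, &y, &z);
    printf("%d %d %d\n", x, y, z);   /* prints 1 2 3 */
    return 0;
}
```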
To me, it was a reminder that we humans created these machines, and even as they appear to spiral into complexity beyond our comprehension, they remain under our command. We can always make them work better. It was like what Sawyer said when he recounted his recent Raspberry Pi–enabled home coding experiment. It was probably just his imagination, but the display had been a little laggy, he thought. He’d redo the code if he could, he said. But alas, Sawyer and the machine did not speak the same assembly language.