The assembly code of AI: How computing's evolution predicts LLM development
From punch cards to personal computers, computing evolved from expert-only systems to accessible technology over 40 years. Large Language Models appear to be following this same path at unprecedented speed, suggesting we're in the "assembly language era" of AI - with similar democratization ahead.

The history of computing reveals a striking pattern: every revolutionary technology begins as an expert-only tool requiring manual memory management and low-level programming before evolving into accessible systems with built-in state management. Large Language Models appear to be following this exact trajectory, compressed into years rather than decades.
In the 1950s through 1970s, programming was a physical act. Developers wrote code on special coding sheets, carefully distinguishing zeros from the letter O using local conventions like the "slashed zero." These sheets went to keypunch operators who transferred the code to punch cards using IBM 026 machines - table-sized devices where a single typing error meant re-punching an entire card. Programs were edited by physically reorganizing card decks and replacing individual cards.
The daily workflow was brutally inefficient. Programmers submitted their card decks to operators behind a counter, waited in line during peak times, and received results hours later in alphabetically labeled cubby holes. As one 1982 programmer described: "Write your Fortran program in a coding sheet. Purchase an adequate amount of new punch cards... At a secret moment, a secret little truck would take our work to a secret place."
Memory constraints dominated everything. The IBM System/360 Model 40 had 128 KB of core memory weighing 610 pounds. Programmers used overlay systems to run programs larger than available memory, manually dividing code into self-contained blocks that loaded and unloaded as needed. This required intimate knowledge of memory timing - on drum-based systems like the IBM 650, programmers calculated the optimal physical location for each instruction based on drum rotation speed to minimize latency.
Assembly language programming demanded deep hardware understanding. Programmers wrote in IBM Basic Assembly Language using mnemonics like ADD, SUB, and MVC (Move Characters), which could only move 256 bytes at a time. When programs crashed, they produced "core dumps" - absolute binary representations of memory that programmers had to manually reverse-engineer back to assembly language to debug.
The transition to interactive computing through time-sharing systems like MIT's Compatible Time-Sharing System (CTSS) in 1961 was revolutionary. Running on IBM 7094 computers with dual 32K-word memory banks, CTSS allowed dozens of users to feel they had dedicated access through rapid context switching. This dramatically improved productivity by enabling continuous problem-solving rather than waiting hours for batch results.
Early computing's inaccessibility created enormous pressure for change. Mainframes cost $4.6 million in 1970 and ran at 12.5 MHz - equating to $368,000 per MHz compared to $0.34 per MHz for a 2007 Dell computer. These room-sized machines were "128 times slower, more than 8,000 times as expensive, and more than 1 million times as expensive in terms of cost per MHz."
Several visionaries recognized this problem and pursued radically different solutions:
Gary Kildall created CP/M in 1974, the first commercial microprocessor operating system, selling for just $70 per copy. His revolutionary BIOS (Basic Input/Output System) concept in 1976 separated hardware from software, enabling the same OS to run on 3,000 different computer models by 1981. Kildall believed in making source code available and viewed computers as learning tools rather than profit engines - his work ethic "resembles that of the open-source community today."
Steve Wozniak designed the Apple I and II with radical simplicity. His philosophy was anti-commercial: "My idea was never to sell anything. It was really to give it out." He focused on making computers that "just worked" for regular people through elegant, cost-effective engineering that prioritized accessibility over profit.
Steve Jobs took a different approach, focusing on the complete user experience. His marketing philosophy emphasized three principles: empathy (understand customer feelings), focus (eliminate unimportant features), and impute (people judge by appearance). Jobs brought graphical interfaces from Xerox PARC to the mass market, believing "simplicity is the ultimate sophistication."
Ken Thompson and Dennis Ritchie created UNIX with elegant simplicity as the goal. As Ritchie famously put it: "UNIX is very simple, it just needs a genius to understand its simplicity." Their design made programs easy to write, test, and run interactively instead of through batch processing. The modular philosophy - tools should do one thing well and work together - influenced all subsequent system design.
Doug Engelbart's "Mother of All Demos" in 1968 showed the future 15 years before Apple commercialized it: the mouse, hypertext, windows, video conferencing, and collaborative editing all working together. His focus on "augmenting human intelligence" established the foundation for modern computing interfaces.
The personal computer revolution crystallized with the Altair 8800 in 1975, followed by the "Trinity of 1977" - the Apple II, TRS-80, and Commodore PET. The TRS-80 sold 10,000 units in its first month at $599.95, including "easy-to-understand manuals that assumed no prior knowledge." This represented computing's transition from expert-only to consumer technology.
The evolution of memory management illustrates computing's progression from manual to automated systems. Early computers had severe constraints: ENIAC had approximately 200 bytes of memory, while the Atlas Computer in 1962 had 96 KB of core memory with 576 KB of drum storage.
Programmers initially used overlay systems, manually dividing programs into self-contained blocks. IBM System/360 programmers wrote linkage-editor control statements such as OVERLAY and INSERT to define which segments would take turns occupying the same region of memory.
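As a loose modern illustration of the overlay idea (not period code - the segment names, sizes, and 64 KB region below are invented), a short Python sketch:

```python
# Toy model of manual overlay management: program segments take turns
# occupying one fixed-size region of memory, and the programmer decides
# when each segment is swapped in.

MEMORY_LIMIT = 64 * 1024   # hypothetical 64 KB overlay region

segments = {                # invented segment names and sizes (bytes)
    "INIT":    40 * 1024,
    "COMPUTE": 56 * 1024,
    "REPORT":  48 * 1024,
}

resident = None             # segment currently occupying the region

def load_overlay(name: str) -> None:
    """Evict whatever is resident and load the requested segment."""
    global resident
    if segments[name] > MEMORY_LIMIT:
        raise MemoryError(f"{name} does not fit in the overlay region")
    if resident != name:
        print(f"unloading {resident}, loading {name} ({segments[name]} bytes)")
        resident = name

# The program explicitly sequences its own memory management.
for phase in ("INIT", "COMPUTE", "REPORT"):
    load_overlay(phase)
```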
The Atlas Computer revolutionized this with the world's first virtual memory system in 1962, called "one-level storage." Using 32 Page Address Registers for hardware-based address translation and a learning algorithm that predicted page usage, Atlas eliminated manual overlay management. Tom Kilburn estimated this improved programmer productivity by a factor of 3.
John McCarthy's garbage collection for LISP (1959) created the first automatic memory management system. His elegant mark-and-sweep algorithm used just three steps: mark all accessible memory locations, scan for unmarked locations, and add them to free storage. As McCarthy noted: "Once we decided on garbage collection, its actual implementation could be postponed."
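A minimal Python sketch of the mark-and-sweep idea - the toy object graph and helper names are invented for illustration:

```python
# Toy mark-and-sweep collector over an explicit object graph.
# Each object is a dict holding references to other objects.

heap = {}          # object id -> object
roots = set()      # ids reachable directly by the program

def allocate(obj_id, refs=()):
    heap[obj_id] = {"refs": list(refs), "marked": False}

def mark(obj_id):
    """Step 1: mark every object reachable from the roots."""
    obj = heap[obj_id]
    if obj["marked"]:
        return
    obj["marked"] = True
    for ref in obj["refs"]:
        mark(ref)

def sweep():
    """Steps 2-3: scan for unmarked objects and return them to free storage."""
    for obj_id in list(heap):
        if heap[obj_id]["marked"]:
            heap[obj_id]["marked"] = False   # reset for the next cycle
        else:
            del heap[obj_id]                 # reclaim the unreachable object

def collect():
    for root in roots:
        mark(root)
    sweep()

# Example: "C" becomes garbage because nothing refers to it.
allocate("A", refs=["B"]); allocate("B"); allocate("C")
roots.add("A")
collect()
print(sorted(heap))   # ['A', 'B'] -- 'C' was swept
```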
Modern memory hierarchies evolved from simple drum-and-core systems to today's complex multi-level caches. The 1950s hierarchy of registers, magnetic drums, and tape evolved into the modern stack of L1/L2/L3 cache, DRAM, SSDs, and cloud storage - each generation adding abstraction layers that hid complexity while improving performance.
Operating systems evolved through distinct generations, each solving specific problems through increased abstraction.
First generation batch systems (1950s-1960s) like GM-NAA I/O automated job sequencing, processing around 60 test jobs per hour - far more than operators could schedule by hand. Second generation time-sharing systems (1960s-1970s) like CTSS and Multics enabled interactive computing for multiple users.
Third generation personal systems (1970s-1980s) brought computing to individuals. CP/M's BIOS abstraction allowed hardware independence, establishing the pattern:
Application Program → Command Processor → Disk Operating System → BIOS → Hardware
Fourth generation GUI systems (1980s-1990s) made computing visual. AmigaOS achieved preemptive multitasking in just 256KB of memory - 10 years before Windows 95 and 15 years before Mac OS X. Its 13KB Exec microkernel demonstrated that sophisticated systems could run on limited hardware.
Each generation added abstraction layers. Device drivers evolved from direct hardware programming to standardized interfaces. Hardware Abstraction Layers (HAL) isolated kernels from hardware specifics. System calls provided consistent programming interfaces - UNIX's open(), read(), write() paradigm remains fundamental today.
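That paradigm is still visible one layer up; for instance, Python's os module exposes the same descriptor-based pattern (the file path below is just an example):

```python
import os

# The UNIX system-call paradigm: acquire a file descriptor, read bytes,
# then release the descriptor - unchanged in spirit since the 1970s.
fd = os.open("/etc/hostname", os.O_RDONLY)   # example path
try:
    data = os.read(fd, 4096)                 # read up to 4096 bytes
finally:
    os.close(fd)

print(data.decode(errors="replace"))
```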
POSIX standardization (1988) enabled software portability across UNIX variants, while Windows evolved from Win16 to Win32 APIs, maintaining backward compatibility through complex "thunking" systems. These standards transformed programming from hardware-specific to platform-agnostic development.
Current LLM usage patterns strikingly mirror early computing evolution. LLMs are fundamentally stateless processors - they don't retain memory between interactions, requiring all context to be included with every prompt. This mirrors early CPUs with no persistent storage, where programs loaded fresh each time.
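A minimal sketch of what that statelessness means in practice - call_llm here is a hypothetical stand-in for any chat-completion API, not a real client library:

```python
# Because the model retains nothing between requests, the application
# must re-send the entire conversation history on every call.

def call_llm(messages: list[dict]) -> str:
    """Hypothetical stand-in for a chat-completion API call."""
    return f"(model reply, given {len(messages)} messages of context)"

history: list[dict] = []   # the only "memory" lives in application code

def chat(user_input: str) -> str:
    history.append({"role": "user", "content": user_input})
    reply = call_llm(history)   # the full history travels with every request
    history.append({"role": "assistant", "content": reply})
    return reply

print(chat("Summarize the 1968 'Mother of All Demos'."))
print(chat("Now compare it to the Apple Lisa."))   # context grows every turn
```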
Context windows have expanded rapidly but remain constrained:
- GPT-1 (2018): 512 tokens
- GPT-4 (2023): 32,768 tokens
- Gemini 1.5 Pro (2024): 2 million tokens
Yet self-attention scales quadratically with context length - doubling the text requires roughly four times the memory and compute, creating constraints reminiscent of early memory limitations.
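A back-of-the-envelope sketch of that quadratic relationship (the 1,024-token baseline is an arbitrary reference point, not a measured figure):

```python
# Self-attention compares every token with every other token, so memory
# and compute grow with the square of the context length.

def relative_attention_cost(tokens: int, baseline: int = 1024) -> float:
    return (tokens / baseline) ** 2

for n in (1024, 2048, 4096, 32768):
    print(f"{n:>6} tokens -> {relative_attention_cost(n):7.0f}x the 1,024-token cost")
# 2,048 tokens -> 4x, 4,096 -> 16x, 32,768 -> 1,024x
```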
Modern LLM development resembles assembly programming. Frameworks like LangChain provide low-level control through "micro-orchestration" (prompt management, I/O processing) and "macro-orchestration" (multi-step workflows, stateful applications). Developers describe debugging these frameworks as sometimes taking longer than implementing from scratch - echoing early programming's complexity.
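A stripped-down sketch of the "macro-orchestration" pattern - chaining one model call into the next - using a hypothetical call_llm helper rather than any specific framework's API:

```python
# A two-step chain: the output of one model call is formatted into the
# prompt of the next. Orchestration frameworks add retries, tracing, and
# state handling on top of exactly this kind of plumbing.

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a single model call."""
    return f"(model output for: {prompt[:40]}...)"

def summarize_then_translate(document: str, language: str) -> str:
    summary = call_llm(f"Summarize the following document:\n{document}")
    return call_llm(f"Translate this summary into {language}:\n{summary}")

print(summarize_then_translate("Q3 report: revenue grew 12 percent ...", "French"))
```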
Current memory solutions parallel early computing's evolution. Vector databases (Pinecone, Chroma, Weaviate) provide external memory through embedding and retrieval, similar to how drum storage extended early computer memory. RAG (Retrieval-Augmented Generation) architectures inject relevant context at runtime, mimicking overlay systems that loaded code segments as needed.
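A minimal sketch of the RAG pattern, with a toy letter-frequency embed function and an in-memory list standing in for a real embedding model and vector database:

```python
# Retrieval-Augmented Generation in miniature: embed the query, retrieve
# the most similar stored passages, and inject them into the prompt.
import math

def embed(text: str) -> list[float]:
    """Toy letter-frequency embedding; a real system calls an embedding model."""
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

documents = [
    "Atlas introduced virtual memory in 1962.",
    "CP/M's BIOS separated the operating system from hardware.",
]
index = [(doc, embed(doc)) for doc in documents]   # stand-in for a vector database

def retrieve(query: str, k: int = 1) -> list[str]:
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"   # sent to the model in a real system

print(build_prompt("Which machine introduced virtual memory?"))
```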
The user divide is stark. Simon Willison observes that "LLMs are power-user tools—they're chainsaws disguised as kitchen knives." Current usage requires understanding prompt engineering, context management, and model limitations - knowledge barriers similar to early computing's assembly language requirements.
The parallels are compelling:
Statelessness and orchestration: Both early computers and LLMs require external systems for memory and complex orchestration for sophisticated tasks. The evolution from manual memory management to virtual memory mirrors current development of vector databases and persistent context systems.
User segmentation: Early computing's expert operators, departmental users, and eventual consumers mirror today's AI researchers, developer-practitioners, and ChatGPT users. The skill gap between prompt engineers and general users resembles the divide between assembly programmers and end users.
Tool evolution: Just as computing evolved from assembly to high-level languages, LLM tools are progressing from raw API calls to frameworks like LangChain to visual interfaces like ChatGPT. Each layer adds abstraction while hiding complexity.
Timeline compression: Computing's 40-year evolution from mainframes to personal computers is occurring in just 4-5 years for LLMs. ChatGPT reached 100 million users in two months - at the time, the fastest adoption of any consumer application.
However, key differences challenge the analogy:
- Probabilistic vs. deterministic: LLMs' probabilistic behavior creates fundamentally different challenges than deterministic computing
- Market concentration: Unlike computing's gradual democratization, LLM capabilities remain concentrated among few providers
- Computational costs: While computing costs decreased exponentially, LLM costs remain high due to quadratic scaling
If LLMs follow computing's path, we can expect:
- Memory abstraction layers will emerge, hiding vector database complexity behind simple interfaces - the equivalent of virtual memory for AI systems
- "Operating systems" for LLMs will standardize memory management, tool integration, and multi-agent coordination, similar to how CP/M standardized microcomputing
- High-level languages will replace prompt engineering, just as FORTRAN replaced assembly - making LLM programming accessible to non-experts
- Local deployment will democratize access as models shrink and hardware improves, paralleling the shift from mainframes to personal computers
- Standardization efforts equivalent to POSIX will enable portability across LLM providers and architectures
The evidence strongly supports the thesis that LLMs are following computing's evolutionary trajectory, compressed into a much shorter timeframe. We're currently in the "assembly language era" of AI - powerful but requiring expert knowledge, with memory management challenges dominating development effort.
The visionaries who democratized computing - Kildall's open architecture, Wozniak's simplicity, Jobs' user focus, and Thompson's elegance - provide a roadmap for LLM evolution. Their core insight remains valid: technology should augment human capability, not restrict it to experts. As virtual memory freed programmers from managing overlays, emerging LLM memory systems will free developers from context window gymnastics, enabling the next phase of AI democratization.
The assembly code era of AI won't last long. If history is our guide, we're perhaps 2-3 years from the equivalent of the first personal computers - AI systems that "just work" for regular people, with memory, state, and complexity hidden behind intuitive interfaces. The only question is whether we'll compress 40 years of computing evolution into 5 years or 10.