Introduction
In a landmark keynote delivered at AI Startup School in San Francisco on June 17, 2025, Andrej Karpathy, former director of AI at Tesla and a leading figure in artificial intelligence and deep learning, laid out a compelling vision for the future of software. Drawing on his extensive experience at Stanford, OpenAI, and Tesla, Karpathy argued that software is undergoing a profound transformation, one as fundamental as any since the dawn of computing itself.
He introduced the concept of Software 3.0, an era where natural language becomes the primary programming interface, and large language models (LLMs) act as the new kind of computer. This shift is not just about new tools; it is about building a new computing paradigm, reshaping how developers write code, how users interact with software, and how entire software ecosystems evolve.
This article captures the key insights, stories, and nuances from Karpathy's keynote. We'll explore his detailed analogies, practical examples, and forward-looking advice on how to thrive in this new era of software.
The Evolution of Software: From 1.0 to 3.0
Karpathy began by framing the historical arc of software development. He pointed out that for roughly 70 years, software fundamentally remained the same: humans wrote explicit code to instruct computers. He called this era Software 1.0, the traditional paradigm of programming in which developers write lines of code in languages like C++ or Python.
Then came Software 2.0, a concept Karpathy himself popularized years ago. This era is characterized by neural networks: models whose behavior is defined not by explicit code, but by learned parameters (weights). Instead of writing step-by-step instructions, developers curate datasets and run optimization algorithms to “train” these networks. The neural net weights become the new “code,” encoding complex functions such as image recognition or speech understanding.
Karpathy illustrated this shift with the example of the AlexNet image recognizer, a neural network trained to classify images without explicit programming of features. He emphasized that Software 2.0 models were until recently fixed-function computers, specialized for tasks like classification.
What's changed dramatically is the emergence of Software 3.0, where neural networks, especially large language models, become programmable via natural language prompts. Now, instead of writing code or training weights, developers write English instructions that program the model's behavior dynamically. Karpathy described this as a fundamentally new kind of computer:
“Your prompts are now programs that program the LLM. And remarkably, these prompts are written in English. So it's kind of a very interesting programming language.”
He noted that this is a revolutionary shift because it makes programming accessible in a natural language, breaking down traditional barriers to software development.
Karpathy also highlighted how GitHub repositories are evolving to include not just code, but English interspersed with code, signaling this new hybrid programming paradigm.
Programming in English: The Rise of Software 3.0
Karpathy's excitement about programming in English is palpable. He shared a memorable moment when he tweeted:
“Remarkably, we're now programming computers in English.”
This tweet captured the imagination of many and reflects a profound change: the new programming language is the language humans already use daily.
He gave a concrete example contrasting traditional sentiment classification. Previously, a developer might write Python code or train a neural network to classify sentiment. Now, with a large language model, one can simply write a few-shot prompt in English instructing the model to perform sentiment analysis. This prompt acts as a program, dynamically guiding the model's behavior.
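To make the contrast concrete, here is a minimal sketch in Python: a hand-written Software 1.0 classifier next to a few-shot English prompt that would serve as the Software 3.0 “program.” The rule lists and prompt wording are illustrative assumptions, not examples from the talk.

```python
# Software 1.0: explicit, hand-written rules (word lists are an assumption).
def classify_sentiment_v1(text: str) -> str:
    positive = {"great", "love", "excellent"}
    negative = {"bad", "hate", "awful"}
    words = set(text.lower().split())
    score = len(words & positive) - len(words & negative)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

# Software 3.0: a few-shot English prompt *is* the program. This string
# would be sent to an LLM API; the examples are hypothetical.
FEW_SHOT_PROMPT = """Classify the sentiment of each review as positive or negative.

Review: "I loved this movie, the acting was superb."
Sentiment: positive

Review: "A complete waste of two hours."
Sentiment: negative

Review: "{review}"
Sentiment:"""

print(classify_sentiment_v1("I love this, it is great"))
print(FEW_SHOT_PROMPT.format(review="The plot dragged on."))
```

The interesting point is that the second “program” is edited in plain English: changing the model's behavior means rewriting the examples, not the code.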
Karpathy emphasized that this new programming paradigm is not just a novelty but a fundamental shift requiring developers to become fluent in multiple paradigms:
“If you're entering the industry, it's a very good idea to be fluent in all of them [Software 1.0, 2.0, and 3.0] because they all have slight pros and cons.”
He stressed the importance of fluidly transitioning between writing explicit code, training neural nets, and programming LLMs with natural language.
LLMs as Utilities, Fabs, and Operating Systems
Karpathy then moved to a fascinating analogy, comparing LLMs to utilities, semiconductor fabs, and operating systems: three pillars of modern computing infrastructure.
LLMs as Utilities
He likened LLMs to utilities like electricity, highlighting how labs such as OpenAI, Gemini, and Anthropic invest heavily in capital expenditures (capex) to build these models. The models are then served via APIs, metered by usage, much like electricity consumption.
“We demand low latency, high uptime, consistent quality… When state-of-the-art LLMs go down, it's like an intelligence brownout in the world.”
This analogy captures the critical role LLMs play as foundational services powering countless applications.
LLMs as Fabs
Karpathy also compared LLM labs to semiconductor fabs, pointing out the deep tech trees and research secrets concentrated in these organizations. The massive investment in training infrastructure and hardware is akin to building and operating cutting-edge fabs.
He noted the analogy is imperfect because software is malleable and less defensible than physical fabs, but it still conveys the scale and complexity involved.
LLMs as Operating Systems
Most compellingly, Karpathy argued that LLMs resemble operating systems:
“This is not just electricity or water. These are increasingly complex software ecosystems.”
Like Windows, MacOS, or Linux, LLMs form the platform upon which applications run. There are closed-source providers (OpenAI, Google) and open-source alternatives (LLaMA ecosystem) akin to Linux.
He sketched a vision where LLMs orchestrate memory and compute for problem-solving, with context windows acting as working memory. Because LLM compute is still expensive, it is centralized in the cloud and time-shared across users, echoing the 1960s era of computing, when time-sharing and batch processing dominated.
Karpathy also pointed out the current lack of a general graphical user interface (GUI) for LLMs, comparing direct interaction with ChatGPT to a terminal interface. He suggested that a general-purpose GUI for LLMs has yet to be invented, but many specialized apps are beginning to fill this gap.
The Psychology of LLMs: People Spirits and Cognitive Quirks
Switching gears, Karpathy offered a unique perspective on the psychology of LLMs. He described them as:
“People spirits: stochastic simulations of people, where the simulator is an autoregressive Transformer.”
Because LLMs are trained on vast corpora of human text, they develop an emergent psychology: they have encyclopedic knowledge and memory far beyond any individual human, but also significant cognitive deficits.
Superhuman Strengths
Karpathy compared LLMs to an autistic savant, referencing the movie Rain Man:
“They can remember lots of things, a lot more than any single individual human can because they read so many things.”
LLMs can recall hashes, facts, and patterns with superhuman accuracy and speed.
Cognitive Deficits
However, LLMs hallucinate, confidently making up false information. They lack a robust internal model of their own knowledge and exhibit jagged intelligence: superhuman at some tasks while making mistakes no human would make.
He gave examples like a model insisting that 9.11 is greater than 9.9, or claiming that “strawberry” contains only two Rs.
LLMs also suffer from a form of anterograde amnesia: they do not consolidate knowledge over time the way humans do by sleeping and reflecting. Their context windows act as working memory but are wiped regularly, limiting long-term learning.
Karpathy recommended the films Memento and 50 First Dates as metaphors for LLM memory limitations.
Security and Gullibility
He cautioned that LLMs are gullible and susceptible to prompt injection attacks, data leakage, and other security risks. These limitations must be carefully managed when building applications.
Designing LLM Apps with Partial Autonomy
Karpathy then explored the practical opportunities that arise from LLMs' unique capabilities and limitations, focusing on the concept of partial autonomy.
Rather than treating LLMs as fully autonomous agents, Karpathy advocates for building apps where humans and AI collaborate closely, with humans retaining control and oversight.
Example: Cursor
He highlighted Cursor, an AI-powered code editor, as an early exemplar of this approach.
Cursor integrates multiple LLMs and embedding models to assist developers, but still provides a traditional interface for manual work. It orchestrates context management, multiple LLM calls, and applies diffs to code, all while giving users a clear GUI to audit changes.
Karpathy emphasized the importance of GUIs in LLM apps:
“You don't want to talk to the operating system directly in text. Text is very hard to read, interpret, and understand… A GUI allows a human to audit the work of these fallible systems and go faster.”
The Autonomy Slider
A key design principle Karpathy introduced is the autonomy slider: a control allowing users to tune how much autonomy the AI has.
In Cursor, users can:
- Use tab completion for small changes (high human control)
- Command K to modify chunks of code
- Command L to change entire files
- Command I for full autonomy over the repo
This flexibility enables users to balance speed and control depending on the task complexity.
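The slider idea can be sketched as a simple mapping from each autonomy level to the unit of work the AI may change before a human reviews it. The level names mirror Cursor's modes as described above, but this dispatch is an illustrative assumption, not Cursor's actual implementation.

```python
from enum import Enum

# Illustrative sketch of the "autonomy slider" (not Cursor's real code).
class Autonomy(Enum):
    TAB_COMPLETE = 1    # suggest the next few tokens; human accepts each
    EDIT_SELECTION = 2  # rewrite a highlighted chunk of code
    EDIT_FILE = 3       # rewrite an entire file
    EDIT_REPO = 4       # agent works across the whole repository

def scope_for(level: Autonomy) -> str:
    """Return the unit of work the AI may change before human review."""
    return {
        Autonomy.TAB_COMPLETE: "next few tokens",
        Autonomy.EDIT_SELECTION: "selected lines",
        Autonomy.EDIT_FILE: "one file",
        Autonomy.EDIT_REPO: "whole repository",
    }[level]

print(scope_for(Autonomy.EDIT_FILE))  # prints "one file"
```

The design point is that autonomy is a continuous product decision, not a binary: the larger the scope, the more verification work lands on the human afterward.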
Perplexity and Other Apps
Karpathy also mentioned Perplexity, another LLM-powered app with similar features: orchestrating multiple models, citing sources, providing GUIs for auditing, and autonomy sliders for varying levels of AI assistance.
The Human-AI Collaboration Loop
Karpathy stressed the importance of fast, efficient human-AI collaboration loops. Humans generate and verify AI outputs rapidly to maintain control and ensure correctness.
He warned against over-reliance on fully autonomous agents producing massive diffs or outputs without human review:
“Even though 10,000 lines come out instantly, I have to make sure it's not introducing bugs or security issues.”
He encouraged developers to find best practices for keeping AI “on a leash” and iterating in small, verifiable steps.
Lessons from Tesla Autopilot: Autonomy and Human-in-the-Loop
Karpathy drew on his experience leading Tesla Autopilot to illustrate how partial autonomy works in practice.
He recounted his first ride in a self-driving car in 2013, which was flawless but still far from full autonomy. Over his tenure, Tesla progressively shifted functionality from traditional coded software (Software 1.0) to neural networks (Software 2.0), gradually deleting legacy code as neural nets improved.
He emphasized that driving is a hard problem and that full autonomy remains elusive even today:
“We still haven't really solved the problem. There's still a lot of teleoperation and human-in-the-loop driving.”
This experience informs his caution about the hype around fully autonomous AI agents in 2025. He advocates for careful, incremental progress with humans supervising and controlling AI.
The Iron Man Analogy: Augmentation vs. Agents
Karpathy invoked the Iron Man suit as a metaphor for how AI should augment human capabilities:
“The Iron Man suit is both an augmentation and an agent. Tony Stark can drive it, but it can also fly around autonomously.”
He argued that most AI products today should be more like Iron Man suits, augmenting users with partial autonomy, rather than fully autonomous robots.
This analogy captures the need for custom GUIs, fast generation-verification loops, and autonomy sliders that allow users to gradually delegate tasks.
Vibe Coding: Everyone Is Now a Programmer
One of the most optimistic parts of Karpathy's talk focused on how Software 3.0 democratizes programming.
Because LLMs are programmed in natural language, everyone becomes a programmer:
“This is extremely bullish and very interesting to me and also completely unprecedented.”
Karpathy shared the story of vibe coding, a meme and movement celebrating how natural language programming lowers barriers to software creation.
He showed a heartwarming video of kids vibe coding, highlighting the wholesome and empowering nature of this new paradigm.
Karpathy himself experimented with vibe coding, building iOS apps and web apps without deep knowledge of Swift or devops. He described how writing the code was surprisingly easy, but integrating real-world infrastructure like authentication and payments was still challenging and slow.
His reflections reveal a key insight: while LLMs simplify coding, the surrounding ecosystem of deployment, authentication, and infrastructure remains a bottleneck.
Building for Agents: Future-Ready Digital Infrastructure
Karpathy then addressed the question: if LLMs and AI agents become primary consumers and manipulators of digital information, how should we build software infrastructure?
He proposed that:
- Traditional GUIs and APIs were designed for humans and programs, respectively.
- Now, agents (LLMs) form a third category of digital information consumers.
- To serve agents effectively, software and documentation must become agent-friendly.
Example: lm.txt and Markdown Documentation
Karpathy suggested a new convention like lm.txt files: simple markdown files that explicitly describe a domain or API for LLMs, much like robots.txt guides web crawlers.
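As an illustration only (there is no published standard, and the service named here is imaginary), a hypothetical lm.txt for a small API might read:

```markdown
# Acme Weather API (guide for LLM agents)

This service returns current weather by city.

## Endpoints
- GET https://api.acme.example/v1/weather?city={name}
  Returns JSON: {"city": str, "temp_c": float, "conditions": str}

## Authentication
Send the header `Authorization: Bearer <API_KEY>`.
Keys can be requested programmatically; no GUI steps are required.
```

The point of the convention is that everything an agent needs sits in one plain-text file, with no “click here” instructions that only make sense to a human looking at a screen.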
He pointed out that most documentation is written for humans, with instructions like “click this button,” which LLMs cannot interpret directly.
Companies like Vercel and Stripe are early movers, rewriting docs in markdown and replacing UI instructions with equivalent command-line or API calls that LLM agents can execute.
Tools for Ingesting Data
Karpathy highlighted tools that convert GitHub repos into LLM-friendly formats by concatenating files and building directory structures, enabling LLMs to answer questions about codebases.
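A tool of this kind can be sketched in a few lines of Python: walk the repository, keep only text files, and concatenate them into a single blob with path headers that an LLM can ingest. The suffix filter is an assumption for illustration; real tools apply more sophisticated rules.

```python
from pathlib import Path

# Which files count as "text" is an assumption for illustration.
TEXT_SUFFIXES = {".py", ".md", ".txt", ".toml", ".json"}

def flatten_repo(root: str) -> str:
    """Concatenate a repo's text files into one LLM-friendly string,
    prefixing each file with its relative path as a header."""
    parts = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in TEXT_SUFFIXES:
            rel = path.relative_to(root)
            parts.append(f"===== {rel} =====\n{path.read_text(errors='replace')}")
    return "\n\n".join(parts)
```

The path headers matter: they let the model answer “where is this defined?” questions about the codebase from a single flat context.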
He also mentioned Deep Wiki, which analyzes repos and generates comprehensive documentation pages, making software more accessible to AI agents.
The Future of Agent Interaction
Karpathy envisions a future where LLMs can interact with software infrastructure directly: clicking buttons, making API calls, and navigating documentation seamlessly.
However, he cautioned that it is crucial to meet LLMs halfway by designing infrastructure that is easier and more reliable for them to consume.
Summary and Conclusion: We're in the 1960s of LLMs, and It's Time to Build
Karpathy concluded with a powerful call to action:
- We are living in the early days of Software 3.0, analogous to the 1960s era of operating systems.
- LLMs are complex, fallible “people spirits” that require new infrastructure, interfaces, and programming paradigms.
- There is a massive opportunity to rewrite and build new software leveraging LLMs as utilities, fabs, and operating systems.
- Developers must learn to work with these models collaboratively, designing partial autonomy products with human-in-the-loop verification.
- The democratization of programming through natural language promises a future where everyone can build software.
- We must also build digital infrastructure ready for LLM agents as a new class of software consumers.
Karpathy's vision is both a roadmap and an invitation:
“It's an amazing time to get into the industry. We need to rewrite a ton of code. These LLMs are kind of like coders, utilities, fabs, but especially like operating systems. It's so early, and I can't wait to build it with all of you.”
Final Thoughts
Andrej Karpathy's keynote is a masterclass in understanding the seismic shifts underway in software development. By preserving the details and stories from his talk, this article provides a comprehensive resource for anyone eager to grasp the full implications of Software 3.0.
From the historical context and technical analogies to practical app design and future infrastructure, Karpathy's insights illuminate a path forward for developers, entrepreneurs, and technologists in the AI era.
As we stand at this inflection point, the message is clear: software is changing again, and the future belongs to those who learn to program the new computers, in English.
For further exploration, you can access Andrej Karpathy's slides here: Slides PDF
More content from Andrej Karpathy: YouTube Channel
Apply to Y Combinator: https://ycombinator.com/apply
Work at a startup: https://workatastartup.com