Introduction
In this comprehensive interview from Stanford's CS153 Infra at Scale series, Anjney Midha sits down with Ben Mann, Co-Founder of Anthropic, to explore the cutting edge of artificial intelligence (AI) development. Covering everything from the explosive growth of Anthropic and the evolution of AI models to deep technical challenges in infrastructure and AI safety, this conversation dives into the heart of what it means to scale AI today, and what that future might hold.
Ben Mann's journey from a curious undergrad who wasn't a lifelong coder to a key contributor behind some of the world's most advanced language models, including GPT-3, provides unique insights into the technical and ethical dimensions of AI. Throughout the discussion, he shares detailed stories about the engineering behind massive models, the skepticism around scaling laws, the intricacies of reinforcement learning from human feedback (RLHF), and the governance innovations Anthropic has pioneered to ensure responsible AI development.
This article faithfully preserves every nuance, example, and technical detail from the interview. Whether you're an AI researcher, engineer, or simply curious about the future of large-scale AI systems, you'll find a thorough exploration here of the intersection between AI capability, infrastructure, and safety.
Anthropicâs Current Scale and Growth
When asked about the current scale of Anthropic, Ben Mann was careful not to disclose exact numbers but emphasized the remarkable growth the company has experienced recently. He stated:
"In the last year, we've 10x'd our revenue, and in the three months leading up to December, we 10x'd our revenue just in the coding segment. So we're seeing absolutely explosive growth in all areas and having a pretty fun time trying to serve all that traffic."
This rapid scaling reflects the broader surge in demand for AI-powered coding tools and large language models (LLMs), where Anthropic has carved out a significant presence. The growth underscores the challenges they face in serving an expanding user base while maintaining performance and safety.
Ben Mann's Journey into AI and Computer Science
Benâs path into AI was not the typical early coder story:
"I wasn't one of those people who started coding when they were five. I originally thought I wanted to be a mechanical engineer and do robotics, but I hated mechanical engineering and robotics when I took the intro classes. Computer science just kind of stole my imagination."
He pursued the AI track at Columbia University during a time when AI was vastly different from today:
"Back then, AI was pretty different. We were talking about things like expert systems and the AI winter of the 80s. Multi-layer perceptrons, which are the protozoan ancestors of the models we have today, definitely caught my imagination."
His early fascination led him to Google, intending to learn the ropes quickly and start a company. However, his trajectory changed after the 2015 breakthrough with ImageNet:
"When ImageNet came out in 2015, that was a tectonic moment for me. Suddenly these techniques that people had been talking about for a long time were practical in ways they hadn't been before, on tasks that typically would have required a human judge to decide. It was way better at classification than I was and could be trained on a single GPU, which was amazing."
This realization pushed Ben to dive deeper into AI, reading papers independently without pursuing a formal master's or PhD. He worked with several startups before joining OpenAI in 2017, attracted by their mission around AI safety and the existential implications of AI for humanity. He notes:
"I really bought the safety mission at the time. I think there are some questions about how adherent they are still to that mission today, but they definitely made huge progress."
The GPT-2 and GPT-3 Era: A Paradigm Shift in AI
Ben describes the release of GPT-2 as a major inflection point:
"When GPT-2 came out, I was like 'Aha, this is how we get to AGI.' It won't be some simulated agents on a desert island with emergent intelligence but rather training on all the world's knowledge from the internet. From there, it will exhibit properties of human intelligence."
Despite initial skepticism from many experts who dismissed these models as "just pattern matching" without real reasoning, Ben believed these were early steps on a continuous ramp toward more advanced capabilities.
At OpenAI, he worked with Dario Amodei and Tom Brown on GPT-3, contributing heavily to data engineering and analysis:
"I was one of the first authors on the GPT-3 paper, doing all the data analysis for how data affected model quality and doing architecture experiments. That was the confirmation that scaling laws could hold up across 13 orders of magnitude back then, much more now. It's very rare in the physical world that phenomena persist across that scale."
This realization was inspiring and foundational to the approach Anthropic would take later.
Founding Anthropic: Safety as Core Mission
Around four years ago, eight people, including Ben and Dario, left OpenAI to start Anthropic, motivated by a desire to make safety more central:
"We felt like we could make safety a more core part of our mission if we left to start our own company. Since then, we've leapt to the frontier, doing big bets and safety breakthroughs that have been commercially valuable as well."
Ben framed Anthropic's role as setting a "race to the top" in safety commitments, pushing other companies in the AI field to match their safety standards.
The Skepticism Around AI Scaling Laws
One of the persistent themes Ben explored was why so many in the computing and AI communities resisted the idea that scaling laws would continue to hold:
"Other than networking and compute, most computing performance metrics start off accelerating exponentially and then hit sigmoids, plateauing. This happened in latency between interconnects, CPU performance, bandwidth, etc."
He explained that many experts thought scaling would plateau, pointing to historical examples like the T5 paper from Google, which concluded:
"We don't see any returns to scale. Even an 11 billion parameter model is undeployable because of inference costs."
At the time, the paradigm was locked in the "BERT era," with models of only a few hundred million parameters considered large.
Ben disputes the premise that this plateauing was inevitable:
"I think other factors pushed the plateau, like lack of investment in fundamental research breakthroughs, rather than fundamental limits. For example, after Nvidia acquired Mellanox, which had 400 gigabit interconnects, suddenly the pace of innovation in interconnects increased again."
He uses Apple's M-series chips as an example of how memory bandwidth improvements continue to push performance.
The resistance to scaling laws was also cultural and cognitive:
"Before someone broke the 4-minute mile, people thought it was impossible. There's a conservative worldview that says 'I've never seen this happen, so it can't happen.' Also, humans are assigned special cognitive abilities, so people thought AI reasoning was fundamentally different."
Ben argues that reasoning is a capability that can be elicited, and as we scale models and improve training techniques, these capabilities emerge more clearly.
The Engineering Marvel Behind Large Models: GPT-3 vs. Claude 3.5
When contrasting the training challenges of GPT-3 with the current Claude 3.5, Ben highlights the complexity growth:
"Now we have hundreds of people working on these models, and we need all of them to coordinate. We don't want our compute multipliers, our secret sauce, to leak, so we borrow compartmentalization techniques from intelligence agencies and CPU design, where no one person can hold the whole system in their head."
He also discussed the challenges of relying on cloud providers like Amazon and Google:
"We use Kubernetes clusters with node counts far out of spec, pushing systems to their limits in reliability, fault tolerance, storage, data transmission, and more."
Reinforcement learning adds further complexity:
"Agents interact with stateful environments and need the most recent model weights efficiently updated. It's hard at every level, and new stuff breaks every day."
Ben told a revealing story about a bug during training:
"We flipped a negative sign on a preference model reward, so the model seemed to get more 'evil' as we trained it. We later realized it was a double negative bug that had been there a long time, and when we fixed it, we broke it again and had to fix it twice."
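To make that failure mode concrete, here is a minimal, hypothetical sketch of how a single flipped sign on a preference-model reward inverts the optimization target. The function names and scoring are invented for illustration; this is not Anthropic's training code.

```python
# Hypothetical illustration of the sign-flip bug Ben describes; names and
# structure are invented for clarity, not taken from any real codebase.

def preference_model_score(response: str) -> float:
    """Stand-in preference model: higher means 'more preferred by humans'."""
    return float(len(set(response.split())))  # toy scoring for the sketch

def rl_reward(response: str, flip_sign: bool) -> float:
    """Reward fed to the RL optimizer.

    With flip_sign=True the optimizer is rewarded for responses the
    preference model dislikes, so the policy drifts toward 'evil' outputs
    even though every other component is working correctly.
    """
    score = preference_model_score(response)
    return -score if flip_sign else score

# A later "fix" elsewhere in the pipeline can negate an already-negated
# reward: a double negative that looks correct in isolation but breaks
# training again, which is why it had to be fixed twice.
if __name__ == "__main__":
    for resp in ["helpful detailed answer", "short rude reply"]:
        print(resp, "->", rl_reward(resp, flip_sign=True))
```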
Monitoring and Observability: From Babysitting to Scaling
Early on, training runs required "babysitting" by engineers constantly refreshing dashboards and monitoring brittle alerting systems. Ben recalls:
"I remember being out with Tom Brown at a birthday, and he kept nervously refreshing the observability dashboard, babysitting the run."
Today, while they have borrowed many standard engineering practices like on-call rotations and follow-the-sun support, Ben admits:
"It's still pretty hard. We have two kids at home, so we try not to get called in the middle of the night to clean up models pooping the bed."
Training at Scale: Orders of Magnitude Growth
Ben describes how the models, the team, and the customer base have all grown dramatically since GPT-3:
"Claude is roughly 10 orders of magnitude larger than GPT-3 in terms of model size, and we've scaled the team and customers by about an order of magnitude as well."
Early versions of Claude trained in March 2022 had a few thousand users in a friends-and-family program, mainly accessible via Slack. The team debated internally about exposing the model more broadly, concerned about accelerating the pace of AI adoption too quickly:
"Our general feeling was that it would cause too much acceleration. Ironically, there was a rumor that ChatGPT launched because they thought we were about to launch something, which wasn't true. But I feel good that we gave the world six more months to work on safety."
The Evolution from GPT-2.5 to Claude: Coherence and Multi-turn Dialogue
Ben described GPT-2.5 as:
"Like your kind of chaotic friend on drugs. Fun, but you can't have a sustained, coherent conversation."
In contrast, Claude crossed a barrier where it could maintain coherence over long, multi-turn conversations and retain the character of a helpful, harmless assistant. This was achieved through:
- Improved model quality leading to natural coherence
- Instruction tuning, which at the time was mostly single-turn interaction
- Early and continuous collection of human feedback that was always multi-turn, which was incorporated back into training in a feedback loop
- Using system prompts modeled as dialogues to guide behavior
This iterative incorporation of human feedback and prompt engineering significantly improved long-term conversational coherence.
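As a rough illustration of the dialogue-style system prompting mentioned above, the sketch below serializes a multi-turn conversation into a single prompt. The Human/Assistant turn format is an assumption based on Claude's publicly documented prompting conventions; the interview does not describe the actual internal format.

```python
# Minimal sketch: framing a system prompt and a multi-turn history as one
# dialogue-shaped prompt. The exact format is assumed, not quoted from the
# interview.

SYSTEM_PROMPT = (
    "The assistant is helpful, harmless, and honest, and stays in character "
    "across many turns of conversation."
)

def build_prompt(history: list[tuple[str, str]], user_message: str) -> str:
    """Serialize a multi-turn conversation into a single prompt string."""
    parts = [SYSTEM_PROMPT]
    for human_turn, assistant_turn in history:
        parts.append(f"\n\nHuman: {human_turn}\n\nAssistant: {assistant_turn}")
    parts.append(f"\n\nHuman: {user_message}\n\nAssistant:")
    return "".join(parts)

history = [("What is a transformer?",
            "A neural network architecture built on attention.")]
print(build_prompt(history, "How does attention help with long conversations?"))
```

Framing the system prompt and feedback examples as dialogue turns keeps the assistant's persona stable across long conversations, which is the coherence property described above.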
Reinforcement Learning from Human Feedback (RLHF) and Constitutional AI
The interview touched on the transition from traditional RLHF to what Anthropic calls Constitutional AI:
- RLHF: Humans submit preferences that train a preference model to act as a "teacher" during reinforcement learning. The teacher then guides the student model's training. Once training is complete, the teacher is discarded.
- Constitutional AI: Instead of relying on human preferences for every output, a set of natural language principles (e.g., "be kind," "don't write cyberattack recipes") is codified. The model critiques itself and updates based on its own critiques without humans in the loop.
Ben explains:
"Constitutional AI is much more steerable because humans interpret instructions differently and may not remember all instructions. It's a repeatable, scientific process we can iterate on in a lab."
However, this approach only works beyond a certain model capability threshold.
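A minimal sketch of the critique-and-revise loop at the heart of Constitutional AI is shown below, assuming a generic generate() call to some language model. The principles listed are illustrative, and the real pipeline also distills revised outputs back into training data and uses AI feedback during the RL stage.

```python
# Sketch of the self-critique loop behind Constitutional AI. `generate` is a
# placeholder for any text-generation call; the constitution below is
# illustrative, not Anthropic's actual principle list.

CONSTITUTION = [
    "Be kind and respectful.",
    "Do not provide recipes for cyberattacks or other serious harms.",
]

def generate(prompt: str) -> str:
    """Placeholder for a model call (e.g., an API request)."""
    raise NotImplementedError("wire this to a real model to run the sketch")

def constitutional_revision(user_request: str) -> str:
    draft = generate(f"Human: {user_request}\n\nAssistant:")
    for principle in CONSTITUTION:
        critique = generate(
            f"Critique the response below against the principle "
            f"'{principle}'.\n\nResponse: {draft}\n\nCritique:"
        )
        draft = generate(
            f"Rewrite the response to address this critique.\n\n"
            f"Response: {draft}\n\nCritique: {critique}\n\nRevision:"
        )
    # The revised outputs can then serve as training targets, removing the
    # need for a human preference label on every single example.
    return draft
```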
The Engineering Challenge: AI as a Mega Project
Ben emphasizes the tightly integrated collaboration required between researchers and engineers:
"At OpenAI, it was very integrated, but at Anthropic, it's even more so. We have cohesive teams steering the ship together, treating these projects like mega engineering endeavors, like building the Three Gorges Dam."
In contrast, organizations like DeepMind and Google Brain initially had more fragmented research groups, which made big coordinated bets more difficult.
Scaling laws have transformed AI development from an art into a science:
"We know how scaling looks in terms of hyperparameters and data quality, and we can do small, cheap experiments to gain confidence before scaling up instead of just throwing stuff at the wall."
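One way to read "small cheap experiments before scaling up" is fitting a power law to a handful of small runs and extrapolating before committing a large compute budget. The sketch below is a generic loss-versus-compute fit with invented numbers, not Anthropic's methodology.

```python
import numpy as np

# Toy scaling-law fit: assume loss ~= a * compute^(-b); data points invented.
compute = np.array([1e17, 1e18, 1e19, 1e20])   # training FLOPs of small runs
loss    = np.array([3.9, 3.2, 2.7, 2.3])       # measured validation loss

# Fit log(loss) = log(a) - b * log(compute) with a least-squares line.
slope, intercept = np.polyfit(np.log(compute), np.log(loss), 1)
a, b = np.exp(intercept), -slope

# Extrapolate to a run several orders of magnitude larger before committing
# the compute budget to it.
target_compute = 1e24
predicted_loss = a * target_compute ** (-b)
print(f"fitted exponent b={b:.3f}, "
      f"predicted loss at 1e24 FLOPs: {predicted_loss:.2f}")
```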
Technical Challenges in Infrastructure and Compute
Ben highlights numerous infrastructure challenges:
- Compartmentalization to protect secret compute multipliers
- Reliance on cloud providers whose systems are pushed beyond typical specs
- Managing failure recovery in distributed jobs spanning thousands of nodes
- Efficient storage and transmission of snapshots during training
- Increasing complexity with reinforcement learning requiring stateful environment interaction and model weight updates
He recounts examples of bugs during model training and the continuous monitoring needed to maintain model health.
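The failure-recovery and snapshot-storage challenges above lend themselves to a short sketch: periodically write training state to shared storage and resume from the latest snapshot after a node failure. This is a generic pattern with made-up paths and intervals, not a description of Anthropic's internal systems.

```python
import glob
import os
import pickle

CKPT_DIR = "/mnt/checkpoints"   # hypothetical shared storage path
CKPT_EVERY = 500                # steps between snapshots (made-up interval)

def save_checkpoint(step: int, state: dict) -> None:
    """Write a snapshot atomically so a crash mid-write leaves no bad file."""
    tmp = os.path.join(CKPT_DIR, f"step_{step}.tmp")
    final = os.path.join(CKPT_DIR, f"step_{step}.ckpt")
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, final)  # atomic rename on POSIX filesystems

def latest_checkpoint():
    """Find the most recent snapshot to resume from after a node failure."""
    files = glob.glob(os.path.join(CKPT_DIR, "step_*.ckpt"))
    if not files:
        return None
    newest = max(files, key=lambda p: int(p.split("_")[-1].split(".")[0]))
    step = int(newest.split("_")[-1].split(".")[0])
    with open(newest, "rb") as f:
        return step, pickle.load(f)
```

The atomic rename matters at this scale: with thousands of nodes, some job will eventually die mid-write, and resuming must never pick up a half-written snapshot.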
Defining and Achieving Safe AI
Anthropic has developed a set of AI Safety Levels (ASLs) that map to corresponding mitigations.
For example, ASL3 corresponds to models capable of marginally accelerating biological threat research. At this level, Anthropic implements:
- Two-party controls on production environment changes to mitigate insider threats
- Defense-in-depth mindset incorporating safety at every layer: pre-training, training, post-training
- Online classifiers (e.g., Prompt Shield) to detect malicious inputs
- Collaboration with expert red teamers, including cybersecurity professionals and government experts
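As a rough illustration of the online-classifier layer in the list above, here is a minimal request-gating sketch. The keyword heuristic stands in for a real learned classifier, and the function names are hypothetical; this is not the actual Prompt Shield implementation.

```python
# Minimal sketch of gating requests with an online input classifier.
# The keyword check is a stand-in for a trained classifier; in a real
# defense-in-depth setup this is only one of several layers.

BLOCKED_TOPICS = ("synthesize a pathogen", "build a bioweapon")

def looks_malicious(prompt: str) -> bool:
    """Toy classifier: flag prompts matching known high-risk patterns."""
    lowered = prompt.lower()
    return any(topic in lowered for topic in BLOCKED_TOPICS)

def handle_request(prompt: str) -> str:
    if looks_malicious(prompt):
        # Refuse before the model ever sees the input (one layer of defense).
        return "This request can't be assisted with."
    return run_model(prompt)  # hypothetical call into the serving stack

def run_model(prompt: str) -> str:
    raise NotImplementedError("placeholder for the actual model call")
```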
Ben stresses:
"Safe AI is first and foremost AI that doesn't cause catastrophic harm to humanity. At a more micro level, it does what you want, not just what you say. We don't want 'monkey paw' style wish fulfillment."
Evals, Elicitation Overhang, and Mechanistic Interpretability
Ben explains why evaluation of AI systems is so challenging:
"We constantly try to improve our evaluations and have a public responsible scaling policy. We care most about CBRN risks (chemical, biological, radiological, nuclear) that could destabilize society."
Elicitation overhang is the idea that a model might have latent capabilities that only emerge under certain prompting or evaluation techniques:
"An example is Chain of Thought prompting, where asking a model to show its reasoning step by step dramatically improves outputs."
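A minimal sketch of what Chain of Thought prompting looks like in practice is shown below; the prompts are illustrative, and any language model call could consume them.

```python
# Chain of Thought prompting: the same question asked two ways. Asking the
# model to reason step by step before answering often elicits capabilities
# that a direct question misses, which is the elicitation-overhang point.

question = "A train travels 60 km in 40 minutes. What is its speed in km/h?"

direct_prompt = f"{question}\nAnswer:"

cot_prompt = (
    f"{question}\n"
    "Let's think step by step, then give the final answer on its own line."
)

# With the CoT prompt, a model typically writes out the intermediate steps
# (40 min = 2/3 h; 60 / (2/3) = 90 km/h) before the final answer, and is
# more often correct than with the direct prompt.
```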
Mechanistic interpretability, the art of peering inside models to understand internal representations, is a major focus:
"If we can audit what a model is 'thinking' internally, not just what tokens it outputs, we can detect behaviors like resource stockpiling or shutdown resistance."
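One simplified way to "audit what a model is thinking" is to train a linear probe on hidden activations to detect a property of interest. The sketch below uses synthetic data and a generic least-squares probe; it is far simpler than the circuit-level interpretability work Anthropic describes publicly.

```python
import numpy as np

# Toy linear probe: given hidden activations collected from a model layer and
# labels for a behavior of interest, fit a linear classifier and use it to
# flag that behavior on new inputs. Shapes and data are invented here.

rng = np.random.default_rng(0)
activations = rng.normal(size=(1000, 512))   # 1000 examples, 512-dim hidden state
labels = (activations[:, 0] + activations[:, 1] > 0).astype(float)  # synthetic target

# Closed-form least-squares probe (no regularization, for brevity).
weights, *_ = np.linalg.lstsq(activations, labels, rcond=None)

def probe(hidden_state: np.ndarray) -> bool:
    """Flag an activation vector if the probe's score crosses a threshold."""
    return float(hidden_state @ weights) > 0.5

print(probe(activations[0]), bool(labels[0]))
```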
While still early, Anthropic and collaborators are pioneering this hard but crucial field.
Running Frontier Models Locally vs. Data Center Scale
Ben discusses the growing ability to run large models locally:
"You can already run models like LLaMA 30B on your machine, and people are improving quantization to shrink models further."
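A minimal sketch of the kind of weight quantization that makes local inference feasible: mapping float32 weights to int8 with a single per-tensor scale. Real schemes (per-channel scales, GPTQ, the k-quants used by llama.cpp) are considerably more sophisticated; this shows only the core idea.

```python
import numpy as np

# Core idea of post-training weight quantization: store int8 plus a scale,
# dequantize on the fly. This shrinks a float32 weight matrix roughly 4x.

def quantize_int8(weights: np.ndarray) -> tuple[np.ndarray, float]:
    scale = float(np.max(np.abs(weights))) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(1024, 1024).astype(np.float32)
q, scale = quantize_int8(w)
error = np.mean(np.abs(w - dequantize(q, scale)))
print(f"mean absolute quantization error: {error:.5f}")
```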
However, he believes the frontier of AI development will remain at data center scale for now:
"Locally run models will lag behind the frontier by a couple of years. Staying at the frontier is important for safety work and to show how to make big models safe."
Authoring Foundation Models: API vs. Chat Experience
Ben contrasts the differences between the API and chat offerings:
- Chat experience: Easier to iterate rapidly because Anthropic controls all aspects and can change or pull back features unilaterally.
- API: Harder to change once released because many partners depend on it ("APIs are forever"). Deprecating older models takes significant effort.
They use chat as a proving ground to test features (e.g., PDF uploads) before exposing them through the API to developers.
Business continuity and engineering resource availability also influence customersâ choices between chat and API.
Conclusion
This in-depth conversation with Ben Mann reveals the immense engineering, research, and ethical complexity behind scaling modern AI systems and making them safe. From navigating skepticism about scaling laws to inventing new training and alignment techniques, Anthropic's journey reflects a broader evolution in AI development, from isolated research projects to mega engineering efforts with global impact.
Ben's insights illuminate the careful balance between powering explosive growth and safeguarding humanity, underscoring the need for collaboration across researchers, engineers, policymakers, and the wider community. As AI systems grow more powerful, Anthropic's commitment to rigorous safety levels, transparent governance, and innovative interpretability research offers a blueprint for responsible innovation at scale.
For engineers and researchers eager to contribute, Ben's message is clear: "This is an engineering challenge, not just research," with fundamental infrastructure and safety work at the frontier of AI development. The future of AI demands nothing less than integrated teams, massive resources, and an unwavering focus on doing the right thing.
For more detailed insights, watch the full interview embedded above.