Amazon Partners with Anthropic to Build World's Largest AI Supercomputer

Amazon is teaming up with Anthropic to build one of the world’s most powerful artificial intelligence supercomputers. This new supercomputer will be five times larger than the cluster currently used to create Anthropic’s strongest model.

When it’s finished, Amazon expects it to be the largest AI machine reported anywhere in the world. It will feature hundreds of thousands of Trainium 2 chips, Amazon’s latest silicon for AI training.

Matt Garman, the CEO of Amazon Web Services, announced the project, called Project Rainier, at the company’s re:Invent conference in Las Vegas. He also shared several other updates that highlight Amazon’s growing role in generative AI.

Garman also revealed that Trainium 2 will soon be available in specialized Trn2 UltraServer clusters designed for training advanced AI. Many companies already use Amazon’s cloud to build and train custom AI models, often alongside Nvidia’s GPUs. However, Garman pointed out that the new AWS clusters will be 30 to 40 percent cheaper than those using Nvidia’s GPUs.

Although Amazon is the largest cloud computing provider in the world, it has often been seen as lagging behind competitors like Microsoft and Google in generative AI. This year, Amazon has invested $8 billion in Anthropic and quietly rolled out various tools through an AWS platform called Bedrock to help companies leverage generative AI.

At re:Invent, Amazon also showcased its next-generation training chip, Trainium 3. The new chip promises four times the performance of its predecessor and will be available to customers by late 2025.

“The numbers are impressive,” said Patrick Moorhead, CEO and chief analyst at Moor Insights & Strategy. He noted that Trainium 3’s performance gains come largely from improvements in the interconnect between chips. This interconnect is crucial for developing large AI models, because it enables fast data transfer between chips.

While Nvidia will likely remain a dominant player in AI training for a while, Moorhead believes competition will increase in the coming years. Amazon’s innovations show that Nvidia isn’t the only option for training AI.

Before the event, Garman told WIRED that Amazon plans to introduce a range of tools to help customers manage generative AI models. He mentioned that these models are often too expensive, unreliable, and unpredictable.

Some of these new tools will enhance smaller models using larger ones, manage numerous AI agents, and ensure that a chatbot’s output is accurate. Amazon builds its own AI models for product recommendations on its e-commerce platform, but it primarily serves as a platform for other companies to develop their AI solutions.

Even though Amazon doesn’t have a ChatGPT-like product to showcase its AI capabilities, the breadth of its cloud services gives it an advantage in selling generative AI to others. “The range of AWS is going to be a significant factor,” noted Steven Dickens, an industry analyst.

Amazon’s custom chips will also help make the AI software it sells more affordable. “Silicon is going to be a key part of the strategy for any major cloud provider moving forward,” Dickens explained. He added that Amazon has been developing its custom silicon longer than its competitors.

Garman mentioned that more AWS customers are moving from demos to building commercially viable products and services that incorporate generative AI. “We’re excited to see customers evolve from AI experiments to real applications,” he shared with WIRED.

Many customers are less focused on pushing the limits of generative AI and more interested in making the technology cheaper and more reliable.

A new AWS service called Model Distillation can create a smaller model that is faster and less expensive to run while still maintaining similar capabilities to a larger model. “For example, if you’re an insurance company, you can input a set of questions into an advanced model and then use that to train a smaller model to specialize in those topics,” Garman explained.
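Garman’s example can be sketched in miniature. The following is a toy, hypothetical illustration of the distillation idea, not AWS’s actual Model Distillation service: a small “student” model is trained on the soft outputs of a larger “teacher” rather than on raw labels, and ends up imitating the teacher’s behavior on that domain.

```python
import math
import random

# Toy, hypothetical sketch of model distillation: a small "student"
# learns to imitate a larger "teacher" by training on the teacher's
# soft (probability) outputs. Names and setup are illustrative only.

random.seed(0)

def teacher(x):
    """Stand-in for a large model: returns P(class=1) for input x."""
    return 1.0 / (1.0 + math.exp(-(3.0 * x - 1.0)))

# Student: a single logistic unit with two trainable parameters.
w, b = 0.0, 0.0
lr = 1.0
data = [random.uniform(-2.0, 2.0) for _ in range(200)]

for _ in range(400):                       # plain SGD on cross-entropy
    for x in data:
        target = teacher(x)                # soft label from the teacher
        pred = 1.0 / (1.0 + math.exp(-(w * x + b)))
        grad = pred - target               # d(loss)/d(logit)
        w -= lr * grad * x / len(data)
        b -= lr * grad / len(data)

# How closely does the distilled student track the teacher?
err = max(abs(teacher(x) - 1.0 / (1.0 + math.exp(-(w * x + b))))
          for x in data)
print(f"max disagreement with teacher: {err:.4f}")
```

The payoff in the real setting is the same as in the sketch: the student is far cheaper to run, yet closely matches the teacher on the narrow set of questions it was distilled on.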

Another new tool, Bedrock Agents, will allow users to create and manage AI agents that automate tasks like customer support, order processing, and analytics. It includes a master agent that supervises a team of AI agents, providing performance reports and coordinating changes. “You can create an agent that acts as the boss of all the other agents,” Garman said.
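The supervisor pattern Garman describes can be sketched in a few lines. This is an illustrative toy, with invented class and agent names, not the Bedrock Agents API: a master agent routes each task to the right specialist and rolls their results up into a report.

```python
# Hypothetical sketch of the "boss agent" pattern: one supervisor
# routes tasks to specialist agents and reports on the results.
# All names here are illustrative, not part of any AWS API.

class Agent:
    def __init__(self, name, skill):
        self.name, self.skill = name, skill

    def handle(self, task):
        return f"{self.name} completed {task!r}"

class SupervisorAgent:
    """The 'boss' agent: dispatches work and summarizes outcomes."""
    def __init__(self, team):
        self.team = {agent.skill: agent for agent in team}

    def dispatch(self, task, skill):
        return self.team[skill].handle(task)

    def report(self, results):
        return {"completed": len(results), "details": results}

team = [Agent("SupportBot", "support"),
        Agent("OrderBot", "orders"),
        Agent("AnalyticsBot", "analytics")]
boss = SupervisorAgent(team)

results = [boss.dispatch("refund request #12", "support"),
           boss.dispatch("process order #98", "orders")]
print(boss.report(results))
```

In a production system the specialists would themselves be language-model-backed agents and the supervisor’s report would drive the performance monitoring and coordination Garman mentions.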

Garman expects companies will be particularly interested in Amazon’s new tool for ensuring chatbot output accuracy. Large language models can sometimes produce errors, and current methods for keeping them on track aren’t perfect. Clients like insurers, who can’t afford mistakes, are eager for this kind of safeguard. “When you ask, ‘Is this covered by my insurance?’ you want the model to answer correctly,” Garman emphasized.

Amazon’s new verification tool, called Automated Reasoning, is different from a similar product OpenAI announced earlier this year. It uses logical reasoning to analyze a model’s output. For it to work, a company must convert its data and policies into a format suitable for logical analysis. “We take natural language, translate it into logic, prove or disprove the statement, and then provide a rationale for why the statement is true or false,” explained Byron Cook, a distinguished scientist at AWS.
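The pipeline Cook describes, policy as logic, then prove or disprove with a rationale, can be illustrated with a tiny forward-chaining prover. This is a simplified toy with made-up insurance rules, not AWS’s actual service:

```python
# Toy illustration of policy checking via logical inference:
# encode policy as Horn-style rules, then try to derive a claim
# and report the rule chain (the "rationale"). Rules and facts
# here are invented examples, not real insurance policy.

# Each rule: (set of premises, conclusion)
rules = [
    ({"water_damage", "has_flood_rider"}, "covered"),
    ({"water_damage"}, "file_within_30_days_required"),
]

def prove(facts, goal):
    """Forward-chain over the rules; return (verdict, rationale)."""
    known = set(facts)
    rationale = []
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                rationale.append(f"{sorted(premises)} -> {conclusion}")
                changed = True
    if goal in known:
        return True, rationale
    return False, [f"no rule chain derives {goal!r} from {sorted(facts)}"]

ok, why = prove({"water_damage", "has_flood_rider"}, "covered")
print(ok, why)

ok2, why2 = prove({"water_damage"}, "covered")
print(ok2, why2)
```

The key property, unlike a plain language-model answer, is that a “covered” verdict comes with a checkable chain of rules, and a negative verdict states that no such chain exists.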

Cook noted that formal reasoning has been used for decades in areas like chip design and cryptography. This approach could also be applied to build chatbots that handle airline ticket refunds or provide HR information without errors.

Companies can combine multiple systems featuring Automated Reasoning to create more sophisticated applications and services, including those with autonomous agents. “Now you have communicating agents that engage in formal reasoning and share their rationale,” Cook said. “Reasoning will become increasingly important.”