Microsoft says its newest AI chip, Maia 200, delivers 3 times the performance of Amazon's Trainium processor and beats Google's TPU

Microsoft’s Maia 200 chip is being integrated into its Azure cloud infrastructure (Image credit: Microsoft)

Microsoft has revealed Maia 200, its new accelerator chip for artificial intelligence (AI), which company representatives say is up to three times more powerful than rival hardware from Google and Amazon.

The new chip will be used for AI inference rather than training, powering the systems and agents that make predictions, answer queries and generate outputs based on new data fed to them.

The new chip delivers performance of more than 10 petaflops (one petaflop is 10^15, or a quadrillion, floating-point operations per second), Scott Guthrie, cloud and AI executive vice president at Microsoft, said in a blog post. Petaflops are a standard measure of supercomputing performance; the most powerful supercomputers in the world exceed 1,000 petaflops.

The chip achieved this performance using a data format known as 4-bit floating-point precision (FP4), a highly compact number representation designed to speed up AI computation. Maia 200 also delivers 5 petaflops of performance at 8-bit precision (FP8). The trade-off is that FP4 is faster and far more energy efficient than FP8, but less accurate.
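Readers curious about what those precision levels mean can picture it as rounding: fewer bits means a coarser grid of values a number can take. The following is a minimal Python sketch that uses uniform integer-style quantization as a simplified stand-in for the FP4 and FP8 floating-point formats (the real formats, and Maia 200's hardware, are more sophisticated):

```python
# Illustrative sketch only (not Microsoft's implementation): simulating how
# 4-bit vs. 8-bit precision trades accuracy for compactness.
import numpy as np

def quantize(weights, bits):
    """Round weights onto a symmetric grid with 2**bits levels."""
    levels = 2 ** (bits - 1) - 1          # 7 levels for 4-bit, 127 for 8-bit
    scale = np.max(np.abs(weights)) / levels
    return np.round(weights / scale) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=10_000).astype(np.float32)  # stand-in for model weights

for bits in (8, 4):
    err = np.mean((w - quantize(w, bits)) ** 2)
    print(f"{bits}-bit grid: mean squared rounding error = {err:.6f}")
```

In this toy example, the 4-bit grid's rounding error comes out a few hundred times larger than the 8-bit grid's, which is why low-precision chips rely on careful format and model design to preserve accuracy.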

"In practical terms, one Maia 200 node can effortlessly run today’s largest models, with plenty of headroom for even bigger models in the future," Guthrie said in the blog post. "This means Maia 200 delivers 3 times the FP4 performance of the third generation Amazon Trainium, and FP8 performance above Google’s seventh generation TPU."

Chips ahoy

Maia 200 could potentially be used for specialist AI workloads, such as running larger large language models (LLMs) in the future. So far, Microsoft's Maia chips have been used only in the Azure cloud infrastructure to run large-scale workloads for Microsoft's own AI services, notably Copilot. However, Guthrie noted there would be "wider customer availability in the future," signaling that other organizations could tap into Maia 200 via the Azure cloud, or that the chips could one day be deployed in standalone data centers or server stacks.

Guthrie said that Maia 200 delivers 30% better performance per dollar than existing systems, thanks to a 3-nanometer manufacturing process from the Taiwan Semiconductor Manufacturing Company (TSMC), the world's largest contract chipmaker, which allows for 100 billion transistors per chip. In essence, Maia 200 could be more cost-effective and efficient for the most demanding AI workloads than existing chips.

Maia 200 has a few other features alongside better performance and efficiency. Its onboard memory system, for instance, can keep an AI model's weights and data local to the chip, meaning less hardware is needed to run a given model. It's also designed to slot quickly into existing data centers.

Maia 200 should enable AI models to run faster and more efficiently. That means Azure OpenAI users, such as scientists, developers and corporations, could see higher throughput and lower latency when building AI applications and using models like GPT-4 in their operations.
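As a rough illustration of what "lower latency" means in practice, here is a minimal sketch using the openai Python SDK's Azure client to time a single request; the endpoint, API key, deployment name and API version below are placeholders, and faster inference hardware on the service side would simply show up as a smaller number here:

```python
# Hypothetical sketch: timing one Azure OpenAI request to gauge inference speed.
# The endpoint, key, deployment name and API version are placeholders.
import time
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://example-resource.openai.azure.com",  # placeholder
    api_key="YOUR_API_KEY",                                      # placeholder
    api_version="2024-02-01",
)

start = time.perf_counter()
response = client.chat.completions.create(
    model="gpt-4",  # your Azure deployment name may differ
    messages=[{"role": "user", "content": "Summarize what an AI accelerator does."}],
)
elapsed = time.perf_counter() - start

print(response.choices[0].message.content)
print(f"Round-trip latency: {elapsed:.2f} s")  # faster chips lower this number
```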

This next-generation AI hardware is unlikely to disrupt everyday AI and chatbot use for most people in the short term, as Maia 200 is designed for data centers rather than consumer-grade hardware. However, end users could see the impact of Maia 200 in the form of faster response times and potentially more advanced features from Copilot and other AI tools built into Windows and Microsoft products.

Maia 200 could also provide a performance boost to developers and scientists who run AI inference via Microsoft’s platforms. That, in turn, could improve AI deployment in large-scale research projects, such as advanced weather modeling and simulations of biological or chemical systems.

Roland Moore-Colyer

Roland Moore-Colyer is a freelance writer for Live Science and managing editor at consumer tech publication TechRadar, running the Mobile Computing vertical. At TechRadar, one of the largest consumer technology websites in the U.K. and U.S., he focuses on smartphones and tablets. Beyond that, he draws on more than a decade of writing experience to bring people stories covering electric vehicles (EVs), the evolution and practical use of artificial intelligence (AI), mixed reality products and use cases, and the evolution of computing at both a macro level and from a consumer angle.
