The company says it spent just $534,700 renting the data-center computing resources needed to train M1. That is roughly 1/200th of the estimated training cost of OpenAI's GPT-4o, which industry experts say likely exceeded $100 million (OpenAI has not released its training cost figures).
One key difference is that independent developers have yet to confirm MiniMax's claims about M1. In the case of DeepSeek's R1, developers quickly verified that the model's performance was indeed as good as the company said. With Butterfly Effect's Manus, by contrast, the initial buzz faded fast after developers testing Manus found the model error-prone and unable to match what the company had demonstrated. The coming days will prove critical in determining whether developers embrace M1 or respond more tepidly.
Geopolitical and national security concerns have also dampened the enthusiasm of some Western businesses for deploying Chinese-developed AI models. O'Leary, for instance, claimed that DeepSeek's R1 could potentially allow Chinese officials to spy on U.S. users.
But few things win customers more than free access. Right now, those who want to try MiniMax's M1 can do so for free through an API that MiniMax hosts. Developers can also download the entire model for free and run it on their own computing resources (although in that case, they have to pay for the compute time). If M1's capabilities are what the company claims, it will no doubt gain some traction.
The other big selling point for M1 is that it has a “context window” of 1 million tokens. A token is a chunk of data, equivalent to about three-quarters of one word of text, and a context window is the limit of how much data the model can use to generate a single response. One million tokens is equivalent to about seven or eight books or one hour of video content. The 1 million–token context window for M1 means it can take in more data than some of the top-performing models: OpenAI’s o3 and Anthropic’s Claude Opus 4, for example, both have context windows of only about 200,000 tokens. Gemini 2.5 Pro, however, also has a 1 million–token context window, and some of Meta’s open-source Llama models have context windows of up to 10 million tokens.
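The book-count comparison above can be sanity-checked with a quick back-of-the-envelope calculation. The sketch below assumes the common rule of thumb of about 0.75 words per token and roughly 100,000 words per book; these are illustrative figures, not official tokenizer measurements:

```python
# Rough conversion from a context window (in tokens) to "books' worth" of text.
# Assumptions (illustrative, not from any vendor's documentation):
#   ~0.75 words per token, ~100,000 words per typical book.
WORDS_PER_TOKEN = 0.75
WORDS_PER_BOOK = 100_000

def tokens_to_books(tokens: int) -> float:
    """Approximate number of books that fit in a context window of `tokens`."""
    return tokens * WORDS_PER_TOKEN / WORDS_PER_BOOK

# Context-window sizes as reported in the article.
context_windows = {
    "MiniMax M1": 1_000_000,
    "OpenAI o3": 200_000,
    "Claude Opus 4": 200_000,
    "Gemini 2.5 Pro": 1_000_000,
}

for model, window in context_windows.items():
    print(f"{model}: ~{tokens_to_books(window):.1f} books")
```

Under these assumptions, a 1 million–token window works out to about 7.5 books, consistent with the "seven or eight books" figure cited above.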