MiniMax M1 is a frontier Mixture-of-Experts model with a 1M-token context window, 456B total parameters, and an 80K-token output limit. With strong results on AIME 2025, SWE-bench, and LiveCodeBench, it delivers scalable long-form reasoning for agentic and engineering-grade use cases.
MiniMax M1 Description
MiniMax M1 is an open-weight Mixture-of-Experts transformer with 456B total parameters and up to 1 million tokens of context. With 80K output capacity, it is purpose-built for massive input processing, logical analysis, and deep code reasoning. Ideal for RAG pipelines, legal and scientific workflows, and agentic tools.
Technical Specification
Context Window: 1,000,000 tokens
Output Capacity: Up to 80,000 tokens
Architecture: Sparse MoE Transformer with Lightning Attention
Parameters: 456B (45B active per token)
API Pricing (see the cost sketch below):
Input tokens: $0.50 or $1.40 per million tokens (tiered)
Output tokens: $2.30 per million tokens
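A minimal sketch of estimating per-request cost from these rates. The tier boundary for input pricing is not specified here, so the applicable input rate is passed in explicitly; the function name and example token counts are illustrative:

def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_rate_per_m: float) -> float:
    """Estimate request cost in USD.

    input_rate_per_m: 0.5 or 1.4 depending on the pricing tier
    (the tier boundary is not stated on this page).
    Output tokens are billed at a flat $2.30 per million.
    """
    OUTPUT_RATE_PER_M = 2.3
    return (input_tokens / 1e6) * input_rate_per_m \
        + (output_tokens / 1e6) * OUTPUT_RATE_PER_M

# Example: a 900K-token prompt and a 20K-token completion at the higher tier.
print(f"${estimate_cost_usd(900_000, 20_000, input_rate_per_m=1.4):.2f}")  # $1.31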
Performance Metrics
(Figure: M1 benchmark metrics)
Key Capabilities
Full-scale document and codebase comprehension across million-token inputs
Fast inference and optimized MoE routing
Efficient serving through sparse expert activation, with an OpenAI-compatible chat-completions interface
Supports tool use and planning in agentic workflows (see the sketch after this list)
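A hedged sketch of agentic tool use with this endpoint. This page only confirms the plain chat-completions call, so the OpenAI-style tools parameter and the get_weather schema below are assumptions for illustration:

import requests

# Hypothetical tool schema in the OpenAI function-calling format;
# whether minimax/m1 honors it through this endpoint is an assumption.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # illustrative tool, not part of the API
        "description": "Look up the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer <YOUR_AIML_API_KEY>",
    },
    json={
        "model": "minimax/m1",
        "messages": [{"role": "user", "content": "What's the weather in Oslo?"}],
        "tools": tools,
    },
)
# If the model decides to call the tool, the call appears here
# (standard OpenAI-compatible response shape, assumed):
print(response.json()["choices"][0]["message"].get("tool_calls"))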
Optimal Use Cases
Code Engineering: Process and refactor large repositories in a single pass
Document Analytics: Perform reasoning over legal, technical, or regulatory data
RAG Systems: Use as a long-context backend for question answering (see the sketch after this list)
Mathematical Reasoning: Step-by-step symbolic and logical analysis
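For the RAG use case above, a minimal sketch of the long-context pattern: with a 1M-token window, retrieved passages can be concatenated into a single prompt rather than aggressively truncated. The passages and question below are placeholders:

import requests

# Placeholder passages; in a real pipeline these come from your retriever.
passages = ["<passage 1>", "<passage 2>", "<passage 3>"]
question = "<your question>"

context = "\n\n---\n\n".join(passages)
prompt = (
    "Answer the question using only the context below.\n\n"
    f"Context:\n{context}\n\nQuestion: {question}"
)

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer <YOUR_AIML_API_KEY>",
    },
    json={"model": "minimax/m1",
          "messages": [{"role": "user", "content": prompt}]},
)
print(response.json()["choices"][0]["message"]["content"])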
Code Samples
import requests
import json  # for printing the response as structured, indented output

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        # Insert your AIML API key after "Bearer":
        "Authorization": "Bearer <YOUR_AIML_API_KEY>",
    },
    json={
        "model": "minimax/m1",
        "messages": [
            {
                "role": "user",
                # Insert your question for the model here, instead of "Hello":
                "content": "Hello",
            }
        ],
    },
)

data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))
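Assuming the standard OpenAI-compatible response shape used above, the assistant's reply text can be pulled out of the parsed JSON directly:

# The reply text sits under choices[0].message.content in the
# OpenAI-compatible response shape (assumed for this endpoint).
print(data["choices"][0]["message"]["content"])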
Comparison with Other Models
Vs. GPT-4o: M1 offers a 1M-token context vs GPT-4o's 128K; better suited for very large inputs
Vs. Claude 4 Opus: M1 provides more context (1M vs 200K); both excel at reasoning
Vs. Gemini 2.5 Pro: both offer a 1M-token context; M1 adds open weights and a larger 80K-token output capacity
Limitations
No vision or multimodal input support
No fine-tuning API exposed
Some tools/platforms may require manual integration
API Integration
Accessible via the AI/ML API. Documentation: available here.