MiniMax M1 is a frontier Mixture-of-Experts model with a 1M-token context window, 456B total parameters, and an 80K-token output limit. With strong results on AIME 2025, SWE-bench, and LiveCodeBench, it delivers scalable long-form reasoning for agentic and engineering-grade use cases.
MiniMax M1 Description
MiniMax M1 is an open-weight Mixture-of-Experts transformer with 456B total parameters and up to 1 million tokens of context. With 80K output capacity, it is purpose-built for massive input processing, logical analysis, and deep code reasoning. Ideal for RAG pipelines, legal and scientific workflows, and agentic tools.
Technical Specification
Context Window: 1,000,000 tokens
Output Capacity: Up to 80,000 tokens
Architecture: Sparse MoE Transformer with Lightning Attention
Parameters: 456B (45B active per token)
API Pricing (see the cost sketch below):
Input tokens: $0.50 or $1.40 per million tokens (tiered)
Output tokens: $2.30 per million tokens
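A minimal sketch of estimating per-request cost from these rates. The tier boundary for input pricing is not specified here, so the applicable input rate is passed in explicitly; the function name and example token counts are illustrative:

def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_rate_per_m: float) -> float:
    """Estimate request cost in USD.

    input_rate_per_m: 0.5 or 1.4 depending on the pricing tier
    (the tier boundary is not stated on this page).
    Output tokens are billed at a flat $2.30 per million.
    """
    OUTPUT_RATE_PER_M = 2.3
    return (input_tokens / 1e6) * input_rate_per_m \
        + (output_tokens / 1e6) * OUTPUT_RATE_PER_M

# Example: a 900K-token prompt and a 20K-token completion at the higher tier.
print(f"${estimate_cost_usd(900_000, 20_000, input_rate_per_m=1.4):.2f}")  # $1.31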
Performance Metrics
(Figure: M1 benchmark metrics)
Key Capabilities
Full-scale document and codebase comprehension across million-token inputs
Fast inference and optimized MoE routing
Efficient serving through sparse expert activation, with an OpenAI-compatible chat-completions interface
Supports tool use and planning in agentic workflows (see the sketch after this list)
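A hedged sketch of agentic tool use with this endpoint. This page only confirms the plain chat-completions call, so the OpenAI-style tools parameter and the get_weather schema below are assumptions for illustration:

import requests

# Hypothetical tool schema in the OpenAI function-calling format;
# whether minimax/m1 honors it through this endpoint is an assumption.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # illustrative tool, not part of the API
        "description": "Look up the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer <YOUR_AIML_API_KEY>",
    },
    json={
        "model": "minimax/m1",
        "messages": [{"role": "user", "content": "What's the weather in Oslo?"}],
        "tools": tools,
    },
)
# If the model decides to call the tool, the call appears here
# (standard OpenAI-compatible response shape, assumed):
print(response.json()["choices"][0]["message"].get("tool_calls"))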
Optimal Use Cases
Code Engineering: Process and refactor large repositories in a single pass
Document Analytics: Perform reasoning over legal, technical, or regulatory data
RAG Systems: Use as a long-context backend for question answering (see the sketch after this list)
Mathematical Reasoning: Step-by-step symbolic and logical analysis
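For the RAG use case above, a minimal sketch of the long-context pattern: with a 1M-token window, retrieved passages can be concatenated into a single prompt rather than aggressively truncated. The passages and question below are placeholders:

import requests

# Placeholder passages; in a real pipeline these come from your retriever.
passages = ["<passage 1>", "<passage 2>", "<passage 3>"]
question = "<your question>"

context = "\n\n---\n\n".join(passages)
prompt = (
    "Answer the question using only the context below.\n\n"
    f"Context:\n{context}\n\nQuestion: {question}"
)

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer <YOUR_AIML_API_KEY>",
    },
    json={"model": "minimax/m1",
          "messages": [{"role": "user", "content": prompt}]},
)
print(response.json()["choices"][0]["message"]["content"])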
Code Samples
import requests
import json  # for printing the response as structured, indented output

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        # Insert your AIML API key after "Bearer":
        "Authorization": "Bearer <YOUR_AIML_API_KEY>",
    },
    json={
        "model": "minimax/m1",
        "messages": [
            {
                "role": "user",
                # Insert your question for the model here, instead of "Hello":
                "content": "Hello",
            }
        ],
    },
)

data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))
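Assuming the standard OpenAI-compatible response shape used above, the assistant's reply text can be pulled out of the parsed JSON directly:

# The reply text sits under choices[0].message.content in the
# OpenAI-compatible response shape (assumed for this endpoint).
print(data["choices"][0]["message"]["content"])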
Comparison with Other Models
Vs. GPT-4o: M1 offers a 1M-token context vs GPT-4o's 128K; better suited for very large inputs
Vs. Claude 4 Opus: M1 provides more context (1M vs 200K); both excel at reasoning
Vs. Gemini 2.5 Pro: both offer a 1M-token context; M1 adds open weights and a larger 80K-token output capacity
Limitations
No vision or multimodal input support
No fine-tuning API exposed
Some tools/platforms may require manual integration
API Integration
Accessible via the AI/ML API. Documentation: available here.