Overview
- Moonshot has made Kimi‑K2.7‑Code available on its Kimi APIs and published the weights on Hugging Face under a Modified MIT license that permits commercial use with attribution.
- The model uses a mixture-of-experts architecture with about 1 trillion total parameters, of which roughly 32 billion (about 3 percent) are active for any given token; routing each token through only a subset of the weights reduces per-request compute.
- Moonshot published API rates of $0.95 per million input tokens and $4.00 per million output tokens, plus a discounted rate of $0.19 per million tokens for cache hits, which lowers costs for repeated or templated coding tasks.
- Moonshot reports benchmark gains over K2.6, including double-digit improvements on several coding tests, and says the model uses about 30 percent fewer reasoning tokens; these figures come from the company and have not been independently validated.
- Moonshot is pitching K2.7 as a budget-friendly alternative to high-performing closed models like Anthropic’s Claude Fable 5, a move that could change cost calculations for startups and teams running long-context, high-volume coding pipelines.
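
To make the pricing concrete, the published rates can be turned into a rough per-request cost estimate. The sketch below is illustrative only: the function name `estimate_cost` and its parameters are hypothetical, and it assumes that cache-hit tokens are billed at the $0.19 rate in place of the standard $0.95 input rate, which the overview does not spell out.

```python
from decimal import Decimal

# Published rates from the overview, in USD per million tokens.
INPUT_RATE = Decimal("0.95")
OUTPUT_RATE = Decimal("4.00")
CACHE_HIT_RATE = Decimal("0.19")
MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: `cached_tokens` is the part of `input_tokens` served from
    cache and is billed at the cache-hit rate instead of the input rate.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")

    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * INPUT_RATE
        + cached_tokens * CACHE_HIT_RATE
        + output_tokens * OUTPUT_RATE
    )
    return total / MILLION


if __name__ == "__main__":
    # A templated coding task: 1M input tokens (800k cached), 100k output tokens.
    with_cache = estimate_cost(1_000_000, 100_000, cached_tokens=800_000)
    without_cache = estimate_cost(1_000_000, 100_000)
    print(f"with cache:    ${with_cache}")
    print(f"without cache: ${without_cache}")
```

Under these assumptions, a request with 1 million input tokens (800,000 of them cache hits) and 100,000 output tokens comes to $0.742, compared with $1.35 with no caching. Actual billing may differ depending on how Moonshot counts cached and reasoning tokens.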