Documentation Index
Fetch the complete documentation index at: https://docs.tensormesh.ai/llms.txt
Use this file to discover all available pages before exploring further.
May 27, 2026
New Serverless Models Ten models are now available for serverless inference:| Model | Family | Parameters | Context |
|---|---|---|---|
| DeepSeek-V4-Flash | DeepSeek-V4 | 284B | 1M |
| gemma-4-31B-it | Google Gemma 4 | 30.7B | 256K |
| GLM-5.1-NVFP4-MTP | GLM-5.1 | 433B | 128K |
| MiniMax-M2.5 | MiniMaxAI | 228B | 196K |
| Kimi-K2.6 | Kimi | 1T | 256K |
| gpt-oss-120b | OpenAI gpt-oss | 116B | 131K |
| gpt-oss-20b | OpenAI gpt-oss | 20B | 131K |
| Qwen3.5-397B-A17B-FP8 | Qwen3.5 | 397B | 262K |
| Qwen3.6-27B-FP8 | Qwen3.6 | 27B | 262K |
| Qwen3-Coder-30B-A3B-Instruct | Qwen3 Coder | 30.5B | 262K |
Improvements
Improvements
Overview Page — The Overview page has been updated with quick actions linking to Serverless, Demos, Claude Code, and Codex CLI. The stats shows Cache Hit Rate, Models Called, Monthly Spending, and Account Balance. A cached tokens chart visualizes your cache performance over time, and a serverless model catalog is shown at the bottom for quick access.Cache Savings — Management → Cache Savings now shows estimated savings, a stacked area chart of input spend vs. cache savings over time, and a top-models-by-savings table.SDK & CLI — The SDK and CLI have been updated to focus on serverless inference.New Docs Pages — Dedicated documentation for External Storage, Serverless Usage, Cache Savings, and a Glossary of key terms.
April 29, 2026
Improvements
Improvements
Account Deletion — You can now delete your own account from Management → Account. Account deletion requires email confirmation and is permanent.Billing Transaction Details — Transaction details on the Billing page now show a deeper breakdown.
April 15, 2026
Serverless Inference Run models via API with no infrastructure to manage. Pay-per-token pricing with $0 for cached tokens. Track per-model token usage and costs under Operations → Serverless Usage. The API is OpenAI-compatible — point any existing SDK tohttps://serverless.tensormesh.ai. (Serverless Inference)
Tensormesh Demos
A new Demos section with interactive benchmarks. Navigate to Operations → Demos to run live inference demos and observe KV cache acceleration across TTFT, E2E latency, and inter-token latency.
CLI Documentation
A new CLI tab in the docs with guides and a full command reference for the tm CLI tool — covering installation, authentication, inference, model management, billing, and admin workflows. (CLI)
Python SDK
A new Python SDK tab with guides for the tensormesh package — covering sync and async clients, inference, control plane resources, and migration from OpenAI/Fireworks. (Python SDK)
API & SDK Reference Documentation
Full interactive API & SDK docs for all Tensormesh endpoints with an in-browser playground and code examples in cURL, Python, and JavaScript. (API & SDK Reference)
Improvements
Improvements
Email Notification Preferences — Toggle email notifications for deployment updates from Management → Account.Support Ticket Attachments — You can now attach files when creating support tickets.Quick Actions on Dashboard — Quick action cards on the Overview page for faster navigation to common operations.
March 17, 2026
MiniMax-M2.5 on ServerlessMiniMaxAI/MiniMax-M2.5 is now available for serverless inference. Built on a 228B-parameter Mixture-of-Experts architecture with a 196K context window, it excels at advanced reasoning, coding, and building autonomous systems that combine tool orchestration with large-scale information processing.
Improvements
Improvements
Cost Saving Breakdown — Cache Savings page now displays a step-by-step savings breakdown with calculation formulas.Browser Notifications — Notification preferences can now be configured from Management → Account.

