This page covers the fastest raw-API paths in Tensormesh. For direct HTTP callers, treatDocumentation Index
Fetch the complete documentation index at: https://docs.tensormesh.ai/llms.txt
Use this file to discover all available pages before exploring further.
429 as rate limiting, honor Retry-After when present, and be conservative about retrying non-idempotent POST requests automatically.
Use it when you want to:
- make a first successful Inference API request with
curlfrom explicit environment variables - make a first successful Control Plane request with
curl
1. Choose The Surface
- Control Plane: management APIs such as users, models, billing, tickets, logs, and metrics
- Inference API: serverless OpenAI-compatible
POST /v1/chat/completionsplusmodels,completions,responses,tokenize,detokenize,health, andversion
- Control Plane uses
Authorization: Bearer <access_token> - Inference API uses
Authorization: Bearer <API_KEY>for POST routes; the public host also servesGET /v1/models,GET /health, andGET /versionwithout auth
2. Fastest Standalone Inference Request
If you already have explicit inference credentials, you do not need the CLI for a first raw inference request.YOUR_SERVERLESS_MODEL_NAME with a serverless model name that is available on your target host.
Other verified serverless routes on this host are /v1/models, /v1/completions, /v1/responses, /tokenize, /detokenize, /health, and /version. Use the dedicated pages under Serverless API Reference when you need those request and response shapes.
If you have Control Plane access for the same Tensormesh environment, discover
published serverless models with tm billing pricing serverless list and use
the returned pricing[].model value in the request body. If you only have
inference credentials, or you are targeting a different serverless host
override, ask your operator or admin for the exact serverless model string
for that host before sending
the request. Read Choose A Serverless Model Name
if you need the full decision flow.
Streaming Example
Serverless SSE example:POST /v1/completions and POST /v1/responses when the request body includes "stream": true.
The stream is emitted as data-only SSE and terminates with data: [DONE].
3. Get A Control Plane Bearer Token
If you already have a Control Plane bearer token, export it directly:tm auth whoami and the request below both use GET /auth/profile, which is the stable bearer-token validation endpoint for the Control Plane.
4. First Control Plane Request
Use the current default Control Plane base URL, or replace it with an explicit override for your environment. If you are already using the CLI flow, the current default Control Plane host ishttps://api.tensormesh.ai, and you
can confirm whether you are still on that host or on an environment-specific
override by inspecting the resolved controlplane_base first:
curl:
5. What Is Public Versus CLI-Flow Internal
GET /auth/profileis a stable bearer-token endpoint and is published in the Control Plane API reference./auth/cli/start,/auth/cli/exchange, and/auth/cli/refreshare used by the CLI browser-login flow. They are documented in the CLI auth guide, but they are not the stable raw-API integration surface for external clients.
6. If Something Fails
401on Control Plane:- run
tm auth whoamiagain - refresh with
tm auth refresh
- run
401on inference:- check the explicit API key you passed, or
[overrides].gateway_api_keyinconfig.tomlif you are using the CLI-assisted flow
- check the explicit API key you passed, or
- not sure which credentials are loaded:
- run
tm auth status --exit-status - run
tm infer doctor --exit-status
- run

