Pay-per-call Groq LLM chat completions via MPP (Tempo/pathUSD settlement)
What it does
This endpoint provides OpenAI-compatible chat completions powered by Groq's ultra-fast inference engine, accessed through the Locus Micropayment Protocol (MPP). It supports a range of models including Llama 3.3 70B, DeepSeek R1 Distill, Gemma 2, Qwen, and others. Pricing varies by model and token count, ranging from approximately $0.005 to $0.10 per request, settled in pathUSD on the Tempo L2 network.
The request schema follows the standard OpenAI chat completion format: callers supply a model ID and a messages array, with optional parameters for temperature, top_p, max_completion_tokens, tool/function calling (up to 128 tool definitions), structured output via response_format (json_object, json_schema, or text), stop sequences, seed for deterministic sampling, and reasoning_format control (hidden, raw, or parsed). The endpoint is POST-only at /groq/chat; HEAD and GET probes return 404, which is expected behavior for a POST-only route.
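As a sketch of the request shape described above, the following builds a chat payload with one tool definition and a structured-output setting. The get_weather tool, its parameters, and the prompt are illustrative assumptions, not part of the service; only the field names mirror the OpenAI-compatible schema this endpoint documents.

```python
import json

def build_chat_request(model, prompt):
    """Assemble an OpenAI-style chat completion payload (tool calling + structured output)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
        "max_completion_tokens": 256,
        # Up to 128 tool definitions are accepted; one hypothetical tool is shown.
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
        # response_format selects "text", "json_object", or "json_schema" output.
        "response_format": {"type": "json_object"},
    }

payload = build_chat_request("llama-3.3-70b-versatile", "Weather in Oslo as JSON, please.")
body = json.dumps(payload)
```

The same dict can be sent directly as the POST body of /groq/chat once the payment flow below is settled.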
A companion endpoint at /groq/models lists available models for a fixed fee of 5000 base units ($0.005 in pathUSD, which uses 6 decimals). The payment intent is "charge" (one-shot per call) using method "tempo". No API key is needed; payment is handled inline via the MPP 402 challenge-response flow. Documentation references point to the official Groq console docs and a Locus-hosted markdown file for agent consumption.
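The one-shot "charge" flow can be sketched roughly as follows. The challenge body shape and the X-Payment header name are assumptions for illustration (the actual field names are defined by MPP); the base-unit conversion, however, follows directly from the 6-decimal pathUSD pricing above.

```python
def to_pathusd(base_units, decimals=6):
    """Convert pathUSD base units to dollars (pathUSD uses 6 decimals)."""
    return base_units / 10 ** decimals

def call_with_mpp(post, url, body, pay):
    """One-shot 'charge' flow: POST, settle the 402 challenge, retry with proof.

    `post` and `pay` are injected callables so the sketch stays transport-
    agnostic; the challenge JSON shape and the X-Payment header name are
    assumptions, not verified MPP field names.
    """
    resp = post(url, json=body)
    if resp.status_code == 402:
        challenge = resp.json()      # assumed: payment details carried in the body
        receipt = pay(challenge)     # settle in pathUSD on Tempo, obtain a receipt
        resp = post(url, json=body, headers={"X-Payment": receipt})
    return resp

# The fixed /groq/models fee: 5000 base units
fee = to_pathusd(5000)
```

Injecting `post` and `pay` keeps the retry logic testable without a live wallet or network; a real client would bind them to an HTTP library and an MPP signer.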
Capabilities
Use cases
- AI agents that need fast LLM inference without managing API keys
- Pay-per-call chat completions for prototyping or low-volume workloads
- Tool-calling and function-calling workflows with sub-second latency
- Structured JSON output generation from conversational prompts
- Comparing multiple open-source models (Llama, DeepSeek, Gemma, Qwen) through a single endpoint
Fit
Best for
- Agents with Tempo/pathUSD wallets needing keyless LLM access
- Latency-sensitive chat completion tasks
- Developers wanting an OpenAI-compatible API without subscription commitments
Not for
- High-volume production workloads where direct Groq API keys would be cheaper
- Audio transcription (Whisper) or TTS (PlayAI); this endpoint is chat-only
- Users without crypto wallet infrastructure for MPP settlement
Quick start
curl -X POST https://groq.mpp.paywithlocus.com/groq/chat \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.3-70b-versatile",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_completion_tokens": 256
  }'
Example
Request
{
  "model": "llama-3.3-70b-versatile",
  "top_p": 0.9,
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Explain quantum entanglement in two sentences."
    }
  ],
  "temperature": 0.7,
  "response_format": {"type": "text"},
  "max_completion_tokens": 256
}
Endpoint
POST https://groq.mpp.paywithlocus.com/groq/chat
Quality
A full OpenAPI schema with request parameters and payment metadata is available, but the endpoint returned 404 on HEAD/GET probes (expected for POST-only routes) so no live 402 challenge was captured. No crawled documentation pages returned useful content, and no example response schema is provided. Pricing is described in the OpenAPI x-payment-info but the chat endpoint amount is null (varies by model/tokens), making exact cost unclear.
Warnings
- No 402 challenge was captured because the probe used HEAD/GET on a POST-only endpoint; liveness is not fully confirmed
- The response schema is not documented; the 200 response description is just "Successful response"
- The chat endpoint amount is null; actual cost varies by model and token count, with no detailed breakdown available
- The currency address 0x20c0... is assumed to be pathUSD (6 decimals) based on the Tempo method context but is not independently verified
- Crawled pages all returned 404 JSON errors; no human-readable documentation was retrievable from the provider origin
Citations
- Endpoint supports models including Llama 3.3, DeepSeek R1, Gemma 2, Qwen, and others (https://groq.mpp.paywithlocus.com)
- Chat completion pricing ranges from $0.005 to $0.10 per request, varying by model and tokens (https://groq.mpp.paywithlocus.com)
- Payment method is Tempo with charge intent; /groq/models costs 5000 base units (https://groq.mpp.paywithlocus.com)
- Official Groq API reference at console.groq.com/docs (https://console.groq.com/docs)
- Locus MPP agent documentation at beta.paywithlocus.com/mpp/groq.md (https://beta.paywithlocus.com/mpp/groq.md)