NVIDIA NIM
AI & Machine Learning · GPU-Accelerated Inference
NVIDIA NIM (NVIDIA Inference Microservices) provides free API access to state-of-the-art AI models, including Llama, Mistral, and NVIDIA's own models, all served with GPU-accelerated inference.
Rate Limits
Free Tier Details
1,000 free credits (a one-time allotment that does not renew)
✅ Included
- Llama 3.1 405B, 70B, 8B
- Mistral Large, Mixtral 8x22B
- NVIDIA Nemotron models
- Code Llama 70B
- Stable Diffusion XL
- Embedding models
- Reranking models
❌ Not Included
- Dedicated endpoints
- Custom model deployment
- SLA guarantee
How to Get Your Free API Key
Go to build.nvidia.com and sign up with your email or an existing NVIDIA account.
https://build.nvidia.com
Explore the model catalog. Each model has a 'Try' button with a playground.
Click 'Get API Key' on any model page. You'll receive 1,000 free credits.
https://build.nvidia.com/explore/discover
NIM uses an OpenAI-compatible API format, making it easy to switch from OpenAI.
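Because the API is OpenAI-compatible, switching an existing OpenAI client should mostly come down to changing the base URL and the key. The sketch below assumes the official `openai` Python package (v1 or later) is installed and that the key is stored in an environment variable named `NVIDIA_API_KEY`; both the variable name and the helper `make_nim_client` are illustrative choices, not part of NIM itself.

```python
import os

# Base URL taken from the chat completions endpoint used on this page.
NIM_BASE_URL = "https://integrate.api.nvidia.com/v1"


def make_nim_client(api_key=None):
    """Return an OpenAI SDK client pointed at NIM instead of api.openai.com.

    Hypothetical helper: the env var name NVIDIA_API_KEY is an assumption.
    """
    # Imported lazily so the module loads even if the SDK is not installed.
    from openai import OpenAI  # pip install openai

    return OpenAI(
        base_url=NIM_BASE_URL,
        api_key=api_key or os.environ["NVIDIA_API_KEY"],
    )
```

With a client created this way, existing calls such as `client.chat.completions.create(model="meta/llama-3.1-70b-instruct", messages=[...])` should work unchanged apart from the model name.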
How to Test Your Key
Send a chat completion request using the OpenAI-compatible endpoint.
curl https://integrate.api.nvidia.com/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"meta/llama-3.1-70b-instruct","messages":[{"role":"user","content":"Hello!"}],"max_tokens":512}'
Expected: JSON response with model completion, same format as OpenAI.
Visit your NVIDIA dashboard to see remaining credits.
Hidden Limitations
- 1,000 credits are consumed at different rates per model (larger models cost more)
- Credits don't renew; once they are used up, you need to pay
- Some models may be removed from free tier without notice
- Rate limits vary by model size
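Since rate limits differ by model, client code may want to retry when a request is throttled. The sketch below assumes the service signals throttling with the standard HTTP 429 status, which this page does not confirm; `call_with_backoff` is a hypothetical helper, and the delay values are arbitrary starting points.

```python
import random
import time
import urllib.error


def call_with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() and retry with exponential backoff on HTTP 429.

    `send` is any zero-argument callable that performs the request.
    Assumption: rate limiting surfaces as urllib.error.HTTPError with code 429.
    """
    for attempt in range(max_attempts):
        try:
            return send()
        except urllib.error.HTTPError as err:
            # Re-raise anything that is not throttling, and give up on the
            # final attempt instead of sleeping again.
            if err.code != 429 or attempt == max_attempts - 1:
                raise
            # Wait 1s, 2s, 4s, ... plus up to 0.5s of jitter.
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.5))
```

Retries that do succeed still consume credits, so a small `max_attempts` keeps a misbehaving loop from draining the free allotment.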
Last verified: 2026-02-15 · Last updated: 2026-02-15