Groq
AI & Machine Learning · Fast AI Inference
Ultra-fast AI inference powered by Groq's custom LPU (Language Processing Unit) chips. The free tier offers blazing-fast access to LLaMA, Mixtral, and Gemma models.
No Credit Card · Forever Free · Ultra Fast · Best for Students
Rate Limits
Requests per Minute: 30 RPM
Requests per Day: 14,400 RPD
Tokens per Minute: 6,000 TPM
Note: rate limits vary by model; the figures above apply to LLaMA 3.1 70B (30 RPM, 14,400 RPD).
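The per-minute cap can be respected client-side with a simple throttle. A minimal sketch, assuming you space requests evenly rather than bursting (the 30 RPM figure comes from the limits above; the class name is illustrative):

```python
import time

class RateLimiter:
    """Naive client-side throttle: spaces calls so at most `rpm`
    requests are started per rolling 60-second window."""

    def __init__(self, rpm: int = 30):
        self.min_interval = 60.0 / rpm  # seconds between request starts
        self.last_start = None

    def wait(self) -> float:
        """Sleep if the previous request was too recent; return seconds waited."""
        now = time.monotonic()
        waited = 0.0
        if self.last_start is not None:
            elapsed = now - self.last_start
            if elapsed < self.min_interval:
                waited = self.min_interval - elapsed
                time.sleep(waited)
        self.last_start = time.monotonic()
        return waited

limiter = RateLimiter(rpm=30)  # 30 RPM per the table above
```

Call `limiter.wait()` before each request; it returns immediately the first time and delays subsequent calls as needed.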
Free Tier Details
✅ Included
- LLaMA 3.1 70B & 8B
- Mixtral 8x7B
- Gemma 2 9B
- Whisper Large v3 (speech-to-text)
- LLaVA (vision)
- Tool use / function calling
❌ Not Included
- Dedicated capacity
- SLA guarantee
- Priority queue
How to Get Your Free API Key
1. Sign up at console.groq.com with email or Google. (https://console.groq.com/signup)
2. Go to API Keys and create a new key. (https://console.groq.com/keys)

Groq uses an OpenAI-compatible API format for easy integration.
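Because the format is OpenAI-compatible, a request is just an OpenAI-style POST to Groq's endpoint. A minimal stdlib-only sketch that builds (but does not send) such a request — the helper name is an illustration, while the URL, headers, and payload shape match the curl example below:

```python
import json
import os
import urllib.request

GROQ_BASE = "https://api.groq.com/openai/v1"  # OpenAI-compatible endpoint

def build_chat_request(model: str, user_message: str,
                       api_key: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for Groq."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{GROQ_BASE}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("llama-3.1-70b-versatile", "Hello!",
                         os.environ.get("GROQ_API_KEY", "YOUR_API_KEY"))
# urllib.request.urlopen(req) would send it; omitted to keep this offline.
```

The same compatibility means the official `openai` Python client also works if you point its `base_url` at `https://api.groq.com/openai/v1`.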
How to Test Your Key
Send a chat completion request. Notice the blazing-fast response time!
curl https://api.groq.com/openai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"llama-3.1-70b-versatile","messages":[{"role":"user","content":"Hello!"}]}'

Expected: a JSON response in ~200 ms, much faster than typical cloud inference.
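The JSON that comes back follows the OpenAI chat-completion shape, with the assistant text under `choices[0].message.content`. A sketch that parses a trimmed sample response (the field layout is the OpenAI-compatible format; the sample values are illustrative, not a real Groq reply):

```python
import json

# Trimmed, illustrative sample of an OpenAI-style chat completion response.
sample = json.loads("""
{
  "model": "llama-3.1-70b-versatile",
  "choices": [
    {"index": 0,
     "message": {"role": "assistant", "content": "Hello! How can I help?"},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 9, "completion_tokens": 8, "total_tokens": 17}
}
""")

def extract_reply(resp: dict) -> tuple[str, int]:
    """Return the assistant text and total tokens consumed
    (handy for tracking the 6,000 TPM budget)."""
    text = resp["choices"][0]["message"]["content"]
    tokens = resp.get("usage", {}).get("total_tokens", 0)
    return text, tokens

text, tokens = extract_reply(sample)
# text   -> "Hello! How can I help?"
# tokens -> 17
```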
Hidden Limitations
- Token-per-minute limits are relatively low (6,000 TPM for some models)
- Model selection is more limited than OpenAI/Anthropic
- No fine-tuning support
- Occasional capacity issues during peak demand
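Capacity issues during peak demand typically surface as retryable HTTP errors, so a small backoff wrapper smooths them over. A minimal sketch; the specific status codes (429, 503) and delays are assumptions, not documented Groq behavior:

```python
import time

def with_retries(call, max_attempts: int = 4, base_delay: float = 0.5,
                 retryable=(429, 503)):
    """Invoke `call()` (which returns a (status, body) pair) and retry
    on retryable HTTP status codes with exponential backoff."""
    for attempt in range(max_attempts):
        status, body = call()
        if status not in retryable:
            return status, body
        if attempt < max_attempts - 1:
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
    return status, body
```

Wrap your actual request function in `call`; after `max_attempts` failures the last response is returned so the caller can surface the error.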
Official Links
Last verified: 2026-02-15 · Last updated: 2026-02-15