Groq

Ultra-fast inference with Llama and Mixtral models

Blazing-fast inference: Groq's LPU hardware delivers responses roughly 5-10x faster than traditional GPU-based providers.

Groq is ideal for real-time applications where latency matters.


Setup

1. Get API Key

Get your API key from console.groq.com

2. Add Environment Variable

# .env.local
GROQ_API_KEY=gsk_...

3. Configure Provider

<YourGPTProvider
  runtimeUrl="/api/chat"
  llm={{
    provider: 'groq',
    model: 'llama-3.1-70b-versatile',
  }}
>
  <CopilotChat />
</YourGPTProvider>
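runtimeUrl points at your server-side chat endpoint. If you are wiring that route yourself rather than using the library's built-in runtime handler, the Groq side is straightforward because Groq exposes an OpenAI-compatible API. A minimal sketch, assuming a Next.js App Router route and the official groq-sdk package (the request/response shape your runtime expects is defined by the library, so treat this as an illustration of the Groq call only):

// app/api/chat/route.ts (illustrative sketch; adapt to your runtime's contract)
import Groq from 'groq-sdk';

const groq = new Groq({ apiKey: process.env.GROQ_API_KEY });

export async function POST(req: Request) {
  // Assumes the client sends { messages: [{ role, content }, ...] }
  const { messages } = await req.json();

  const completion = await groq.chat.completions.create({
    model: 'llama-3.1-70b-versatile',
    messages,
  });

  return Response.json(completion.choices[0].message);
}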

Available Models

Model | Context | Speed | Best For
llama-3.1-70b-versatile | 128K | Very Fast | General use
llama-3.1-8b-instant | 128K | Ultra Fast | Quick responses
llama-3.2-90b-vision-preview | 128K | Fast | Multimodal
mixtral-8x7b-32768 | 32K | Very Fast | Balanced
gemma2-9b-it | 8K | Ultra Fast | Lightweight

Recommended: llama-3.1-70b-versatile for quality, llama-3.1-8b-instant for speed.
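Switching models is just a matter of changing the model field in the llm config shown above. For example, a latency-sensitive deployment might use the 8B model:

llm={{
  provider: 'groq',
  model: 'llama-3.1-8b-instant', // trades some quality for lower latency
}}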


Speed Comparison

Provider | Time to First Token | Total Response Time
Groq | ~100ms | ~500ms
OpenAI | ~500ms | ~3s
Anthropic | ~700ms | ~4s

These figures are approximate and vary with model and prompt size, but Groq is typically 5-10x faster for most queries.
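If you want to verify these numbers for your own prompts, you can measure time to first token with a streaming request. A minimal sketch, assuming the official groq-sdk package:

import Groq from 'groq-sdk';

const groq = new Groq({ apiKey: process.env.GROQ_API_KEY });

// Time how long the first streamed token takes to arrive.
async function measureTimeToFirstToken(prompt: string) {
  const start = Date.now();
  const stream = await groq.chat.completions.create({
    model: 'llama-3.1-8b-instant',
    messages: [{ role: 'user', content: prompt }],
    stream: true,
  });

  for await (const chunk of stream) {
    if (chunk.choices[0]?.delta?.content) {
      console.log(`Time to first token: ${Date.now() - start}ms`);
      break;
    }
  }
}

measureTimeToFirstToken('Say hello');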


Configuration Options

llm={{
  provider: 'groq',
  model: 'llama-3.1-70b-versatile',
  temperature: 0.7,  // 0 = deterministic, higher = more varied output
  maxTokens: 4096,   // upper bound on tokens generated per response
  topP: 1,           // nucleus sampling; leave at 1 unless you are tuning
}}

Use Cases

Real-Time Chat

Perfect for applications needing instant responses:

// Users see responses appear instantly
<CopilotChat placeholder="Ask anything (instant response)..." />

Autocomplete / Suggestions

// Fast enough for keystroke-level suggestions
const getSuggestions = async (input: string) => {
  // Groq responds fast enough for autocomplete
};
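A fuller sketch of what getSuggestions could look like on the server, assuming the official groq-sdk package (call it from a debounced handler so you are not hitting the API on every keystroke):

import Groq from 'groq-sdk';

const groq = new Groq({ apiKey: process.env.GROQ_API_KEY });

// Return a short continuation of the text typed so far.
export async function getSuggestions(input: string): Promise<string> {
  const completion = await groq.chat.completions.create({
    model: 'llama-3.1-8b-instant', // fastest model, a good fit for autocomplete
    messages: [
      { role: 'system', content: 'Complete the user text with a short suggestion. Reply with the continuation only.' },
      { role: 'user', content: input },
    ],
    max_tokens: 20,
    temperature: 0.3,
  });
  return completion.choices[0]?.message?.content ?? '';
}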

High-Volume Applications

Lower latency = better user experience at scale.


Tool Calling

Llama models support function calling:

useToolWithSchema({
  name: 'quick_search',
  description: 'Search for information quickly',
  schema: z.object({
    query: z.string(),
  }),
  handler: async ({ query }) => {
    // `search` is your own data-fetching function
    const results = await search(query);
    return { success: true, data: results };
  },
});

Pricing

Model | Price
llama-3.1-70b | $0.59/1M tokens
llama-3.1-8b | $0.05/1M tokens
mixtral-8x7b | $0.24/1M tokens

Very affordable. Check Groq's pricing page for current rates.
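As a rough worked example using the rates above: a workload of 20M tokens per month costs about 20 × $0.59 ≈ $11.80 on llama-3.1-70b, versus about $1.00 on llama-3.1-8b.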

