xAI

Ultra-fast inference with Grok models

Cutting-edge AI from xAI. Grok models offer real-time information access and ultra-fast inference.


Setup

1. Get API Key

Get your API key from console.x.ai

2. Add Environment Variable

.env.local
XAI_API_KEY=xai-...
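
Assuming the provider follows the usual convention, xai() picks up XAI_API_KEY from the environment automatically; see Configuration Options below for passing a key explicitly.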

3. Usage

import { generateText } from '@yourgpt/llm-sdk';
import { xai } from '@yourgpt/llm-sdk/xai';

const result = await generateText({
  model: xai('grok-3-fast-beta'),
  prompt: 'What are the latest trends in AI?',
});

console.log(result.text);

4. Streaming (API Route)

app/api/chat/route.ts
import { streamText } from '@yourgpt/llm-sdk';
import { xai } from '@yourgpt/llm-sdk/xai';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = await streamText({
    model: xai('grok-3-fast-beta'),
    system: 'You are a helpful assistant.',
    messages,
  });

  return result.toTextStreamResponse();
}
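
On the client, the plain text stream returned by toTextStreamResponse can be consumed with standard fetch and ReadableStream APIs. A minimal sketch using no SDK helpers (the streamChat function name is illustrative):

// Minimal client-side consumer for the /api/chat route above.
// Uses only standard Web APIs; wire onChunk into your UI as needed.
async function streamChat(
  messages: { role: string; content: string }[],
  onChunk: (text: string) => void,
) {
  const res = await fetch('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ messages }),
  });
  if (!res.body) throw new Error('No response body');

  const reader = res.body.getReader();
  const decoder = new TextDecoder();

  // Read chunks as they arrive and surface them incrementally
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    onChunk(decoder.decode(value, { stream: true }));
  }
}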

Available Models

// Grok 4.1 Fast (Latest)
xai('grok-4-1-fast-reasoning')      // Best reasoning, 2M context
xai('grok-4-1-fast-non-reasoning')  // Fastest, 2M context

// Grok 4 Fast
xai('grok-4-fast-reasoning')        // Fast reasoning, 2M context
xai('grok-4-fast-non-reasoning')    // Fastest, 2M context

// Grok 4
xai('grok-4')                       // Flagship model, 256K context

// Grok 3 (Stable)
xai('grok-3-beta')                  // Enterprise flagship, 131K context
xai('grok-3-fast-beta')             // Fast and stable, 131K context
xai('grok-3-mini-beta')             // Smaller model, 32K context

// Grok Code
xai('grok-code-fast-1')             // Optimized for coding, 256K context

Configuration Options

import { generateText } from '@yourgpt/llm-sdk';
import { xai } from '@yourgpt/llm-sdk/xai';

// Custom API key
const model = xai('grok-3-fast-beta', {
  apiKey: 'xai-custom-key',
});

// With generation options
const result = await generateText({
  model: xai('grok-3-fast-beta'),
  prompt: 'Hello',
  temperature: 0.7,
  maxTokens: 4096,
});

Tool Calling

xAI supports tool calling with all Grok models:

import { generateText, tool } from '@yourgpt/llm-sdk';
import { xai } from '@yourgpt/llm-sdk/xai';
import { z } from 'zod';

const result = await generateText({
  model: xai('grok-3-fast-beta'),
  prompt: 'What is the weather in San Francisco?',
  tools: {
    getWeather: tool({
      description: 'Get weather for a city',
      parameters: z.object({
        city: z.string(),
      }),
      execute: async ({ city }) => {
        // Stub implementation: replace with a real weather API lookup for `city`
        return { temperature: 65, condition: 'foggy' };
      },
    }),
  },
  maxSteps: 5,
});
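
Assuming the SDK follows the usual multi-step pattern, maxSteps lets the model receive the tool result and generate a follow-up completion, so result.text contains the final answer that incorporates the weather data.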

Speed Comparison

xAI Grok offers ultra-fast inference:

Provider     Tokens/second
xAI Grok     ~400+
OpenAI       ~50-80
Anthropic    ~50-80

Grok's speed makes it ideal for real-time applications, chatbots, and agentic use cases.
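
To verify throughput in your own environment, you can time a streamed response. A rough sketch, assuming streamText exposes an async-iterable textStream, and using ~4 characters per token as an approximation:

import { streamText } from '@yourgpt/llm-sdk';
import { xai } from '@yourgpt/llm-sdk/xai';

// Rough throughput check: divide streamed output length by elapsed time.
// Assumes `result.textStream` is an async iterable of text chunks.
const start = Date.now();
let chars = 0;

const result = await streamText({
  model: xai('grok-3-fast-beta'),
  prompt: 'Write a 500-word overview of the history of computing.',
});

for await (const chunk of result.textStream) {
  chars += chunk.length;
}

const seconds = (Date.now() - start) / 1000;
// ~4 characters per token is a common rule of thumb for English text
console.log(`~${Math.round(chars / 4 / seconds)} tokens/second (approximate)`);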


With Copilot UI

Use with the Copilot React components:

app/providers.tsx
'use client';

import { CopilotProvider } from '@yourgpt/copilot-sdk/react';

export function Providers({ children }: { children: React.ReactNode }) {
  return (
    <CopilotProvider runtimeUrl="/api/chat">
      {children}
    </CopilotProvider>
  );
}
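
The runtimeUrl should point at the streaming route from step 4 above, so the Copilot components send chat messages to Grok through the same /api/chat endpoint.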

Pricing

Model             Input            Output
grok-4-1-fast     $0.20/1M tokens  $0.50/1M tokens
grok-3-fast-beta  $0.10/1M tokens  $0.30/1M tokens
grok-3-mini-beta  $0.05/1M tokens  $0.15/1M tokens

Check x.ai/api for current rates.
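
Per-request cost is tokens divided by one million, times the rate. A hypothetical helper based on the table above (estimateCostUSD and RATES are illustrative names; confirm rates before budgeting):

// Hypothetical cost estimator using the rates listed above (USD per 1M tokens).
// Rates change; always confirm against x.ai/api.
const RATES: Record<string, { input: number; output: number }> = {
  'grok-4-1-fast': { input: 0.20, output: 0.50 },
  'grok-3-fast-beta': { input: 0.10, output: 0.30 },
  'grok-3-mini-beta': { input: 0.05, output: 0.15 },
};

function estimateCostUSD(model: string, inputTokens: number, outputTokens: number): number {
  const rate = RATES[model];
  if (!rate) throw new Error(`No rate listed for ${model}`);
  return (inputTokens / 1_000_000) * rate.input + (outputTokens / 1_000_000) * rate.output;
}

// Example: 10K input + 2K output tokens on grok-3-fast-beta
// => 0.01 * $0.10 + 0.002 * $0.30 = $0.0016
console.log(estimateCostUSD('grok-3-fast-beta', 10_000, 2_000));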

