Together AI

Cost-effective open-source model inference — Llama, DeepSeek, Qwen, Gemma and more

Together AI is a high-performance inference platform for open-source models. It offers fast, scalable serving for Llama, DeepSeek, Qwen, Gemma, Mistral and many others through an OpenAI-compatible API.


Setup

1. Install packages

npm install @yourgpt/copilot-sdk @yourgpt/llm-sdk openai

Together AI uses an OpenAI-compatible API, so the openai package is the only peer dependency needed.

2. Get API key

Sign up and get your API key at api.together.xyz/settings/api-keys.

3. Add environment variable

.env.local
TOGETHER_API_KEY=your-key-here

4. Streaming API route

app/api/chat/route.ts
import { streamText } from '@yourgpt/llm-sdk';
import { togetherai } from '@yourgpt/llm-sdk/togetherai';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = await streamText({
    model: togetherai('meta-llama/Llama-3.3-70B-Instruct-Turbo'),
    system: 'You are a helpful assistant.',
    messages,
  });

  return result.toTextStreamResponse();
}
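The route above returns a plain text stream, so any client can consume it with fetch and a stream reader. A framework-agnostic sketch (`streamChat` is an illustrative name; the URL and body shape match the handler above):

```typescript
// Reads the streamed response from the /api/chat route chunk by chunk,
// yielding decoded text as it arrives.
export async function* streamChat(
  messages: { role: string; content: string }[],
): AsyncGenerator<string> {
  const res = await fetch('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ messages }),
  });
  if (!res.ok || !res.body) throw new Error(`Request failed: ${res.status}`);

  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    yield decoder.decode(value, { stream: true });
  }
}
```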

5. Generate text

import { generateText } from '@yourgpt/llm-sdk';
import { togetherai } from '@yourgpt/llm-sdk/togetherai';

const result = await generateText({
  model: togetherai('deepseek-ai/DeepSeek-V3'),
  prompt: 'Explain quantum entanglement simply.',
});

console.log(result.text);

Available Models

// DeepSeek
togetherai('deepseek-ai/DeepSeek-V3')      // 128K ctx, tools
togetherai('deepseek-ai/DeepSeek-V3.1')     // 128K ctx, tools
togetherai('deepseek-ai/DeepSeek-R1')       // reasoning model

// Llama
togetherai('meta-llama/Llama-3.3-70B-Instruct-Turbo')  // 131K ctx, fast
togetherai('meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo')  // 130K ctx
togetherai('meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo')
togetherai('meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo')

// Qwen
togetherai('Qwen/Qwen3.5-397B-A17B')       // 262K ctx
togetherai('Qwen/Qwen3.5-9B')

// Gemma
togetherai('google/gemma-4-31B-it')

// Kimi
togetherai('moonshotai/Kimi-K2.5')          // 262K ctx

// GLM
togetherai('zai-org/GLM-5.1')              // 202K ctx

Any model ID listed on together.ai/models works.
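Model IDs are plain strings, so it can help to centralize the ones your app uses in one place. An illustrative helper that only reuses IDs from the list above (the names `TaskProfile` and `modelForTask` are not part of the SDK):

```typescript
// Illustrative mapping from a task profile to a model ID from the list above.
type TaskProfile = 'fast' | 'tools' | 'reasoning' | 'long-context';

const MODEL_FOR_TASK: Record<TaskProfile, string> = {
  fast: 'meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo',
  tools: 'deepseek-ai/DeepSeek-V3',
  reasoning: 'deepseek-ai/DeepSeek-R1',
  'long-context': 'moonshotai/Kimi-K2.5',
};

export function modelForTask(task: TaskProfile): string {
  return MODEL_FOR_TASK[task];
}
```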


Configuration

import { togetherai } from '@yourgpt/llm-sdk/togetherai';

// Explicit API key
const model = togetherai('meta-llama/Llama-3.3-70B-Instruct-Turbo', {
  apiKey: 'your-key',
});

// Custom base URL (e.g. self-hosted gateway or proxy)
const proxiedModel = togetherai('meta-llama/Llama-3.3-70B-Instruct-Turbo', {
  baseURL: 'https://my-proxy.example.com/v1',
});
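Both options can also be resolved from the environment in one place. An illustrative sketch (`TOGETHER_BASE_URL` is a hypothetical variable name, not one the SDK reads; the fallback is Together's public endpoint):

```typescript
// Illustrative: resolve provider options from the environment, falling
// back to Together's public endpoint when no proxy is configured.
export function togetherOptions(env: Record<string, string | undefined>) {
  return {
    apiKey: env.TOGETHER_API_KEY,
    baseURL: env.TOGETHER_BASE_URL ?? 'https://api.together.xyz/v1',
  };
}
```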

Tool Calling

Many Together AI models support tool calling:

import { generateText, tool } from '@yourgpt/llm-sdk';
import { togetherai } from '@yourgpt/llm-sdk/togetherai';
import { z } from 'zod';

const result = await generateText({
  model: togetherai('meta-llama/Llama-3.3-70B-Instruct-Turbo'),
  prompt: 'What is the weather in Miami?',
  tools: {
    getWeather: tool({
      description: 'Get weather for a city',
      parameters: z.object({ city: z.string() }),
      execute: async ({ city }) => ({ city, temperature: 82, condition: 'sunny' }), // stubbed response
    }),
  },
  maxSteps: 5,
});

deepseek-ai/DeepSeek-R1 is a reasoning model and does not support tool calling. Use DeepSeek-V3 or a Llama model for tool use.
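Since R1 rejects tool calls, a small guard can swap in a tool-capable fallback before the request is made. An illustrative sketch (`resolveModelId` is not part of the SDK):

```typescript
// Models from this guide known not to support tool calling.
const NO_TOOL_MODELS = new Set(['deepseek-ai/DeepSeek-R1']);

// Illustrative guard: fall back to DeepSeek-V3 when tools are requested
// with a model that cannot call them.
export function resolveModelId(modelId: string, wantsTools: boolean): string {
  if (wantsTools && NO_TOOL_MODELS.has(modelId)) {
    return 'deepseek-ai/DeepSeek-V3';
  }
  return modelId;
}
```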


With Copilot UI

app/providers.tsx
'use client';

import { CopilotProvider } from '@yourgpt/copilot-sdk/react';

export function Providers({ children }: { children: React.ReactNode }) {
  return (
    <CopilotProvider runtimeUrl="/api/chat">
      {children}
    </CopilotProvider>
  );
}
