
Models

Spike uses its own optimized models by default, designed specifically for real-time conversational AI with personality and memory. You can also configure your personalities to use models from other providers.

Default models

By default, Spike selects the optimal model for each task:
Task              | Default Model  | Description
------------------|----------------|---------------------------------------------------------
Chat              | spike-chat-1   | Optimized for conversational responses with personality
Voice synthesis   | spike-voice-1  | Low-latency text-to-speech with emotion
Voice recognition | spike-stt-1    | Real-time speech-to-text
Video generation  | spike-avatar-1 | Lip-synced avatar rendering
Embeddings        | spike-embed-1  | Memory and knowledge retrieval
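
A personality created without a model_config relies on these defaults for every task. The minimal sketch below (in Python) assumes that simply omitting model_config is how you opt into the defaults; this page implies that but does not state it outright.

personality = {
    "name": "My Assistant",
    "system_prompt": "You are a helpful assistant.",
    # No "model_config": the assumption here is that Spike's task-specific
    # defaults (spike-chat-1, spike-voice-1, spike-stt-1, spike-avatar-1,
    # spike-embed-1) are used for chat, voice, video, and embeddings.
}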

Using other providers

You can configure personalities to use models from other providers. Spike acts as a unified interface, handling authentication, rate limiting, and failover.

Supported providers

  • Anthropic: Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku
  • OpenAI: GPT-4o, GPT-4 Turbo, GPT-3.5 Turbo
  • Google: Gemini 1.5 Pro, Gemini 1.5 Flash

Configuration

Set the model when creating or updating a personality:
{
  "name": "My Assistant",
  "system_prompt": "You are a helpful assistant.",
  "model_config": {
    "provider": "anthropic",
    "model": "claude-sonnet-4-20250514",
    "temperature": 0.7,
    "max_tokens": 1024
  }
}
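
As a rough illustration, the same configuration could be sent when creating a personality over HTTP. This is a sketch in Python: the base URL, the /v1/personalities path, and the bearer-token header are assumptions rather than documented values, so substitute whatever your Spike client or API reference specifies.

import requests

# Assumed endpoint and auth scheme; replace with the actual Spike API details.
SPIKE_API_URL = "https://api.spike.example/v1/personalities"  # hypothetical URL
API_KEY = "your-spike-api-key"

payload = {
    "name": "My Assistant",
    "system_prompt": "You are a helpful assistant.",
    "model_config": {
        "provider": "anthropic",
        "model": "claude-sonnet-4-20250514",
        "temperature": 0.7,
        "max_tokens": 1024,
    },
}

response = requests.post(
    SPIKE_API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=30,
)
response.raise_for_status()
print(response.json())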

Fallback behavior

If your configured provider is unavailable, Spike can automatically fall back to an equivalent model:
{
  "model_config": {
    "provider": "openai",
    "model": "gpt-4o",
    "fallback": {
      "provider": "anthropic",
      "model": "claude-sonnet-4-20250514"
    }
  }
}
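
Spike applies this failover itself, but the semantics are roughly what the following Python sketch illustrates: try the primary provider, and only if that call fails, retry once with the fallback. The call_provider function is a placeholder, not part of any real SDK.

# Illustrative only: Spike performs this failover server-side.
def call_provider(provider: str, model: str, prompt: str) -> str:
    # Placeholder standing in for an actual provider request.
    raise NotImplementedError

def generate_with_fallback(model_config: dict, prompt: str) -> str:
    fallback = model_config.get("fallback")
    try:
        return call_provider(model_config["provider"], model_config["model"], prompt)
    except Exception:
        if fallback is None:
            raise
        # Primary provider unavailable: retry once with the equivalent fallback model.
        return call_provider(fallback["provider"], fallback["model"], prompt)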

Real-time considerations

For voice and video conversations, latency is critical. Spike’s default models are optimized for real-time use:
  • First token latency: < 200ms
  • Voice synthesis: < 150ms to first audio
  • Avatar rendering: < 300ms to first frame
When using external providers, expect slightly higher latency due to additional network hops. For latency-sensitive applications, we recommend using Spike’s default models or providers with streaming support.
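
To see why streaming matters for perceived latency, the Python sketch below measures time to first token rather than time to the full response; stream_tokens is a placeholder for whatever streaming interface your provider or the Spike SDK exposes, since speech synthesis can begin as soon as the first tokens arrive.

import time
from typing import Iterator

def stream_tokens(prompt: str) -> Iterator[str]:
    # Placeholder for a provider's streaming API; yields tokens as they arrive.
    yield from ["Hello", ",", " world", "!"]

def time_to_first_token(prompt: str) -> float:
    # For real-time voice and video, this matters more than total generation time.
    start = time.monotonic()
    for _token in stream_tokens(prompt):
        return time.monotonic() - start
    return float("inf")

print(f"First token after {time_to_first_token('Hi there') * 1000:.1f} ms")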

Model selection guidelines

Use case              | Recommended
----------------------|--------------------------------------
Real-time voice/video | Spike default models
Complex reasoning     | Claude 3 Opus or GPT-4o
High throughput chat  | Claude 3 Haiku or GPT-3.5 Turbo
Cost optimization     | Spike default or Haiku-class models

Model availability and pricing vary by plan. Contact us for enterprise model options.