Vercel AI Gateway Integration Guide For Base Chat

by Dimemap Team

Hey guys! Let's dive into integrating Vercel AI Gateway into Base Chat. This guide will walk you through the process step-by-step, ensuring we don't break any existing functionality. We're going for a smooth, incremental approach, so you can test and migrate gradually while keeping all your custom provider settings intact. Let’s get started!

Table of Contents

  1. What is Vercel AI Gateway?
  2. Why Use AI Gateway?
  3. Migration Strategy
  4. Prerequisites
  5. Step-by-Step Integration
  6. Preserving Custom Model Settings
  7. Testing Strategy
  8. Rollback Plan
  9. Cost Considerations
  10. Troubleshooting

What is Vercel AI Gateway?

The Vercel AI Gateway is essentially your one-stop shop for interacting with AI models: a unified API with a single endpoint that gives you access to hundreds of models from providers like OpenAI, Anthropic, Google, Groq, and xAI, so you don't have to juggle multiple APIs and keys. Built-in reliability comes from automatic retries and failover to backup providers, which keeps your application robust when a provider has a bad day. Centralized management lets you monitor spending, usage, and performance across all providers from one place, which is super helpful for keeping an eye on resources and optimizing costs. Authentication gets simpler too: one API key instead of a separate key per provider. And because the gateway works with existing AI SDK code with minimal changes, you can switch between models from different providers without rewriting your call sites. Add in features like request caching and rate limiting, and the gateway streamlines the whole process of working with AI models so you can focus on building the application instead of managing a pile of provider integrations.

How It Works

Instead of calling provider APIs directly, you route through the AI Gateway. Check out these code snippets to see the difference:

// Before: Direct provider calls
const model = anthropic('claude-sonnet-4-5-20250929');
const model = openai('gpt-4o');
const model = google('gemini-2.5-flash');
// After: Through AI Gateway
const model = gateway('anthropic/claude-sonnet-4-5-20250929');
const model = gateway('openai/gpt-4o');
const model = gateway('google/gemini-2.5-flash');

// Special case: Groq models (already in creator/model format)
const model = gateway('openai/gpt-oss-20b');  // NO groq/ prefix!
const model = gateway('meta-llama/llama-4-scout-17b-16e-instruct');

The AI Gateway forwards each request to the appropriate provider while adding reliability, monitoring, and caching layers. That abstraction simplifies your code and makes it easier to switch providers down the road. Response caching means repeated requests can be served faster and at lower cost, and the gateway's monitoring and logging give you real insight into usage patterns and model performance, which helps you make informed decisions about your AI strategy. It also opens the door to more advanced patterns like A/B testing different models or routing requests dynamically based on cost or performance, while letting developers concentrate on the core functionality of the application.

Important: Groq models (like GPT-OSS, Llama, Kimi K2) are open-weight models already named in creator/model format, so they don't need a groq/ prefix. See Step 3 for details.


Why Use AI Gateway?

Using the AI Gateway brings a lot to Base Chat, summarized in the list below. The headline win is simplified key management: a single AI_GATEWAY_API_KEY replaces the 4+ provider keys (OpenAI, Anthropic, Google, Groq) we juggle today, which makes rotation, security, and centralized access control much easier. On top of that we get high availability (automatic retries and fallback to alternative providers), centralized spend monitoring and usage analytics for cost optimization, a single dashboard with unified logging for debugging, and an easier path to adding providers or models in the future without new SDK dependencies. The centralized monitoring also tracks API usage, error rates, and latency in real time, and it simplifies compliance by providing a clear audit trail of AI usage and costs.

Benefits for Base Chat

  1. Simplified Key Management

    • Replace 4+ API keys (OpenAI, Anthropic, Google, Groq) with a single AI_GATEWAY_API_KEY
    • Easier rotation and security management
    • Centralized access control
  2. High Availability

    • Automatic retries if a provider is down
    • Fallback to alternative providers (if configured)
    • Better resilience for production workloads
  3. Cost Optimization

    • Centralized spend monitoring across all providers
    • Usage analytics to identify optimization opportunities
    • Rate limit management to prevent unexpected costs
  4. Developer Experience

    • Single dashboard for all LLM usage
    • Unified logging and debugging
    • Easier to add new providers (no new SDK dependencies)
  5. Future-Proofing

    • Easy to switch providers or models
    • Access to new models as they're added to the gateway
    • Vercel's commitment to maintaining compatibility

Potential Drawbacks

  1. Additional Dependency: Adds Vercel as a middleware layer
  2. Latency: Minimal additional latency (typically <10ms)
  3. Cost: AI Gateway has its own pricing (see Cost Considerations)
  4. Provider Lock-in: Some reliance on Vercel infrastructure

Migration Strategy

Our migration strategy is all about minimizing risk and keeping the transition smooth. We're using an incremental, feature-flagged approach that lets us run the AI Gateway alongside existing direct provider calls, test with a single model before migrating the rest, and roll back instantly if any issues pop up. That means a zero-downtime migration that preserves all custom model configurations. Because the flag can be flipped per provider, the rollout happens gradually, can be monitored closely, and can be reverted without touching code, which keeps the system stable throughout and lets the team test and validate in parallel.
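
To make the dual path concrete, here's a conceptual sketch (the real shouldUseGateway helper and gateway setup are defined in Step 2; the model used here is just an example):

// Conceptual sketch only: the feature flag decides, per provider, whether a
// request goes through the AI Gateway or straight to the provider SDK.
const model = shouldUseGateway("openai")
  ? gateway("openai/gpt-4o")   // routed through the AI Gateway
  : openai("gpt-4o");          // existing direct provider call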

Migration Phases

Phase 1: Setup (15 minutes)
β”œβ”€β”€ Add AI Gateway API key
β”œβ”€β”€ Update environment variables
└── Add feature flag

Phase 2: Implementation (30 minutes)
β”œβ”€β”€ Create Gateway generator classes
β”œβ”€β”€ Update generator factory with feature flag
└── Preserve custom settings (temperature, system prompts)

Phase 3: Testing (1-2 hours)
β”œβ”€β”€ Enable gateway for one model (e.g., GPT-4o)
β”œβ”€β”€ Test message generation, streaming, citations
β”œβ”€β”€ Monitor logs and errors
└── Compare responses with direct provider

Phase 4: Gradual Rollout (1 week)
β”œβ”€β”€ Enable for one provider at a time
β”œβ”€β”€ Monitor performance and errors
β”œβ”€β”€ Collect feedback
└── Adjust as needed

Phase 5: Full Migration (after validation)
β”œβ”€β”€ Enable for all models
β”œβ”€β”€ Remove individual provider API keys (optional)
└── Clean up old code (optional)

Prerequisites

Before we jump into the integration, there are three things to take care of: obtain an AI Gateway API key, verify the AI SDK version, and back up the current configuration. The API key is your access pass to the Vercel AI Gateway, so keep it safe and secure. Checking the AI SDK version confirms we're on a release that supports the gateway features. And backing up the current configuration gives us a safety net so we can revert to the original setup if anything goes wrong. Completing these prerequisites surfaces compatibility issues early and keeps the rest of the integration low-risk.

1. Obtain AI Gateway API Key

  1. Navigate to Vercel Dashboard
  2. Go to AI Gateway tab
  3. Select API Keys from the sidebar
  4. Click Create Key
  5. Copy and securely store the key

2. Verify AI SDK Version

AI Gateway requires AI SDK v5.0.36 or later. Check your version:

npm list ai

If needed, update:

npm install ai@latest

3. Backup Current Configuration

Before making changes:

# Create a branch for this work
git checkout -b feature/ai-gateway-integration

# Backup generator.ts
cp lib/server/conversation-context/generator.ts lib/server/conversation-context/generator.ts.backup

Step-by-Step Integration

Alright, let's get our hands dirty! This section walks through the integration step by step: adding environment variables, updating the generator implementation, and handling the special case of Groq models. Each step is designed so we can introduce the AI Gateway gradually without disrupting the existing setup, keeping custom settings intact while we pick up the gateway's simplified key management and improved reliability.

Step 1: Add Environment Variables

Add to your .env.local file:

# Vercel AI Gateway
AI_GATEWAY_API_KEY=your_ai_gateway_api_key_here

# Feature flag to enable AI Gateway (optional, for gradual rollout)
USE_AI_GATEWAY=false

# Optional: Specify which providers to route through gateway
# Comma-separated list: "openai,anthropic,google,groq" or "all"
AI_GATEWAY_PROVIDERS=openai

Update env.example to document the new variables:

# Add at the end of env.example

# Vercel AI Gateway (optional)
# Get your API key from https://vercel.com/dashboard/ai-gateway
AI_GATEWAY_API_KEY=

# Enable AI Gateway for all providers (true) or use direct provider calls (false)
USE_AI_GATEWAY=false

# Specify which providers should use the gateway: "openai,anthropic,google,groq" or "all"
# Only used when USE_AI_GATEWAY=true
AI_GATEWAY_PROVIDERS=all

Step 2: Update Generator Implementation

Modify lib/server/conversation-context/generator.ts:

// Add at the top of the file, after existing imports
import { createGateway } from "ai"; // if your AI SDK version doesn't re-export this, import it from "@ai-sdk/gateway" instead

// Add feature flag helpers after imports
const USE_AI_GATEWAY = process.env.USE_AI_GATEWAY === "true";
const AI_GATEWAY_PROVIDERS = process.env.AI_GATEWAY_PROVIDERS || "all";

function shouldUseGateway(provider: string): boolean {
  if (!USE_AI_GATEWAY) return false;
  if (AI_GATEWAY_PROVIDERS === "all") return true;
  return AI_GATEWAY_PROVIDERS.split(",").map(p => p.trim()).includes(provider);
}

// Create gateway instance (singleton)
let gatewayInstance: ReturnType<typeof createGateway> | null = null;

function getGatewayInstance() {
  if (!gatewayInstance && process.env.AI_GATEWAY_API_KEY) {
    gatewayInstance = createGateway({
      apiKey: process.env.AI_GATEWAY_API_KEY,
    });
  }
  return gatewayInstance;
}

// Update each generator class to support gateway routing
export class AnthropicGenerator extends SortedMessageGenerator {
  protected _languageModelFactory = (model: string) => {
    if (shouldUseGateway("anthropic")) {
      const gateway = getGatewayInstance();
      if (gateway) {
        console.log(`[AI Gateway] Using gateway for Anthropic model: ${model}`);
        return gateway(`anthropic/${model}`);
      }
    }
    console.log(`[Direct] Using direct Anthropic provider for model: ${model}`);
    return anthropic(model);
  };
}

export class GoogleGenerator extends SortedMessageGenerator {
  protected _languageModelFactory = (model: string) => {
    if (shouldUseGateway("google")) {
      const gateway = getGatewayInstance();
      if (gateway) {
        console.log(`[AI Gateway] Using gateway for Google model: ${model}`);
        return gateway(`google/${model}`);
      }
    }
    console.log(`[Direct] Using direct Google provider for model: ${model}`);
    return google(model);
  };
}

export class GroqGenerator extends AbstractGenerator {
  protected _languageModelFactory = (model: string) => {
    if (shouldUseGateway("groq")) {
      const gateway = getGatewayInstance();
      if (gateway) {
        // Groq models are already in creator/model format (e.g., openai/gpt-oss-20b)
        // Use them directly without groq/ prefix
        console.log(`[AI Gateway] Using gateway for Groq model: ${model}`);
        return gateway(model);  // NO groq/ prefix!
      }
    }
    console.log(`[Direct] Using direct Groq provider for model: ${model}`);
    return groq(model);
  };

  protected _getProviderOptions() {
    // Ensure AI Gateway routes to Groq infrastructure
    return {
      gateway: {
        only: ['groq']
      }
    };
  }

  async generateObject(context: GenerateContext) {
    const model = this._languageModelFactory(this.model);
    const messages = filterMessages(context.messages);

    const { object } = await generateObject({
      messages,
      model,
      temperature: this._getTemperature(),
      system: this._getSystem(),
      output: "object",
      schema: createConversationMessageResponseSchema,
      // Include provider options when using gateway
      ...(shouldUseGateway("groq") ? {
        providerOptions: this._getProviderOptions()
      } : {})
    });

    return object;
  }

  generateStream(context: GenerateContext, options: GenerateStreamOptions) {
    return streamObject({
      messages: filterMessages(context.messages),
      model: this._languageModelFactory(this.model),
      temperature: this._getTemperature(),
      system: this._getSystem(),
      schema: createConversationMessageResponseSchema,
      onFinish: options.onFinish,
      // Include provider options when using gateway
      ...(shouldUseGateway("groq") ? {
        providerOptions: this._getProviderOptions()
      } : {})
    });
  }
}

export class OpenAIGenerator extends AbstractGenerator {
  protected _languageModelFactory = (model: string) => {
    if (shouldUseGateway("openai")) {
      const gateway = getGatewayInstance();
      if (gateway) {
        console.log(`[AI Gateway] Using gateway for OpenAI model: ${model}`);
        return gateway(`openai/${model}`);
      }
    }
    console.log(`[Direct] Using direct OpenAI provider for model: ${model}`);
    return openai(model);
  };
}

Important Notes:

  1. Preserves All Existing Functionality: The _getTemperature() and _getSystem() methods remain unchanged, ensuring custom settings are preserved
  2. Feature Flag Controlled: Gateway is only used when USE_AI_GATEWAY=true
  3. Granular Control: Can enable the gateway for specific providers via AI_GATEWAY_PROVIDERS (see the example after this list)
  4. Logging: Console logs help track which path is being used
  5. Fallback: If gateway fails to initialize, falls back to direct provider
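
For example, with USE_AI_GATEWAY=true and AI_GATEWAY_PROVIDERS=openai,google, the helper from Step 2 resolves like this:

// USE_AI_GATEWAY=true, AI_GATEWAY_PROVIDERS="openai,google"
shouldUseGateway("openai");    // true  - listed, routed through the gateway
shouldUseGateway("google");    // true  - listed, routed through the gateway
shouldUseGateway("anthropic"); // false - not listed, keeps using direct provider calls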

Step 3: Special Handling for Groq Models

Important: Groq models require different handling than other providers because they're open-weight models that AI Gateway routes by creator/model format.

Why Groq is Different

Your Groq models are named like:

  • openai/gpt-oss-20b
  • openai/gpt-oss-120b
  • meta-llama/llama-4-scout-17b-16e-instruct
  • moonshotai/kimi-k2-instruct-0905

These names are already in creator/model format, which is exactly what AI Gateway expects. Unlike other providers (where you add a prefix like anthropic/ or google/), Groq models should be passed directly to the gateway without a groq/ prefix.
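
To make the contrast concrete, here's a minimal sketch using model IDs from the lists above and the earlier examples:

// Other providers: prepend the provider prefix
const claude = gateway("anthropic/claude-sonnet-4-5-20250929");
const gemini = gateway("google/gemini-2.5-flash");

// Groq-hosted open-weight models: already in creator/model format, pass through unchanged
const gptOss = gateway("openai/gpt-oss-20b");
const llama = gateway("meta-llama/llama-4-scout-17b-16e-instruct");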

Provider Options for Groq

To ensure AI Gateway routes requests to Groq's infrastructure (rather than alternative providers), the GroqGenerator includes provider options:

providerOptions: {
  gateway: {
    only: ['groq']  // Ensures Groq backend is used
  }
}

This way users get Groq's speed and performance while still benefiting from AI Gateway's monitoring and reliability.

What Happens Behind the Scenes

When you use gateway with Groq:

  1. Without only option: Gateway might route openai/gpt-oss-20b to any available provider (Groq, Baseten, Cerebras, etc.)
  2. With only: ['groq']: Gateway specifically uses Groq's infrastructure
  3. Result: You get Groq's performance + AI Gateway's benefits (monitoring, reliability, caching)

Step 4: Verify Custom Settings Are Preserved

Your existing custom model configurations in lib/llm/types.ts will continue to work because:

  1. Temperature: The _getTemperature() method in AbstractGenerator reads from PROVIDER_CONFIG unchanged
  2. System Prompts: The _getSystem() method reads from modelConfigs.systemPrompt unchanged
  3. Model Names: The actual model identifiers (like claude-sonnet-4-5-20250929) remain the same

The AI Gateway simply acts as a transport layer and forwards these settings to the actual provider.

Example for GPT-5 (which has a custom system prompt):

// lib/llm/types.ts (no changes needed)
modelConfigs: {
  "gpt-5": {
    temperature: 1,
    systemPrompt: GPT_5_PROMPT  // βœ… Still applied
  },
}

// When calling through gateway:
// 1. Generator creates gateway model: gateway('openai/gpt-5')
// 2. _getTemperature() returns 1 (from config)
// 3. _getSystem() returns GPT_5_PROMPT (from config)
// 4. AI SDK passes these to gateway, which forwards to OpenAI

Preserving Custom Model Settings

One of our top priorities is making sure all custom model settings survive the integration: temperature configurations, system prompts, and anything else you've fine-tuned for your models. The gateway should behave as a transparent transport layer so the transition is seamless and your models keep performing exactly as they do today. This section shows how that works, with concrete examples for temperature, system prompts, and provider-specific options.

Temperature Configuration

All temperature settings from PROVIDER_CONFIG are automatically preserved:

// Example from lib/llm/types.ts
modelConfigs: {
  "gpt-4o": { temperature: 0.3 },
  "gpt-5": { temperature: 1 },
}

// How it works with gateway:
generateStream(context, options) {
  return streamObject({
    messages: filterMessages(context.messages),
    model: this._languageModelFactory(this.model),  // Uses gateway
    temperature: this._getTemperature(),  // βœ… Still reads from PROVIDER_CONFIG
    system: this._getSystem(),
    schema: createConversationMessageResponseSchema,
    onFinish: options.onFinish,
  });
}

System Prompt Configuration

Special system prompts for Llama, Kimi K2, and GPT-5 continue to work:

// lib/llm/types.ts (unchanged)
export const SPECIAL_LLAMA_PROMPT = `It is extremely important that you only respond...`;
export const KIMI_K2_PROMPT = `You must respond with a valid JSON object...`;
export const GPT_5_PROMPT = `It is extremely important that you only respond...`;

// Applied in generator.ts via _getSystem()
protected _getSystem(): string | undefined {
  const config = getModelConfig(this.model);
  return config?.systemPrompt;  // βœ… Returns custom prompt if configured
}

The AI Gateway transparently forwards these prompts to the underlying provider.

Provider-Specific Options

If you need to pass provider-specific options (advanced use case), you can use the providerOptions parameter:

// Example: Passing Anthropic-specific options through gateway
generateStream(context, options) {
  return streamObject({
    messages: filterMessages(context.messages),
    model: this._languageModelFactory(this.model),
    temperature: this._getTemperature(),
    system: this._getSystem(),
    schema: createConversationMessageResponseSchema,
    onFinish: options.onFinish,
    // Optional: Provider-specific options
    providerOptions: {
      anthropic: {
        // Anthropic-specific settings
      },
    },
  });
}

Note: This is typically not needed as the gateway handles provider routing automatically.


Testing Strategy

Testing, testing, 1, 2, 3! We need a solid testing strategy to make sure the AI Gateway integration is rock solid. We'll test in phases: local testing with a single model, then provider-by-provider testing, and finally production testing. Each phase covers a different aspect of the integration, from basic functionality to performance and cost, so we catch issues early, gather feedback, and adjust before widening the rollout. The goal is no surprises and a seamless experience for users of Base Chat's AI features.

Phase 1: Local Testing with One Model

  1. Enable Gateway for OpenAI Only

    # .env.local
    AI_GATEWAY_API_KEY=your_key_here
    USE_AI_GATEWAY=true
    AI_GATEWAY_PROVIDERS=openai
    
  2. Start the Development Server

    npm run dev
    
  3. Test GPT-4o Messages

    • Navigate to a conversation
    • Select GPT-4o as the model
    • Send a test message
    • Verify in console logs: [AI Gateway] Using gateway for OpenAI model: gpt-4o
    • Confirm response quality and citations
  4. Check for Errors

    # Monitor server logs for errors
    # Look for:
    # - "AI Gateway" log messages
    # - Any authentication errors
    # - Response streaming errors
    
  5. Compare with Direct Provider

    # Disable gateway temporarily
    USE_AI_GATEWAY=false
    
    # Send same message with GPT-4o
    # Compare:
    # - Response quality
    # - Response time
    # - Citations accuracy
    

Phase 2: Provider-by-Provider Testing

Test each provider independently before enabling all:

# Week 1: OpenAI
AI_GATEWAY_PROVIDERS=openai

# Week 2: Anthropic
AI_GATEWAY_PROVIDERS=anthropic

# Week 3: Google
AI_GATEWAY_PROVIDERS=google

# Week 4: Groq
AI_GATEWAY_PROVIDERS=groq

# After all validated: Enable all
AI_GATEWAY_PROVIDERS=all

Phase 3: Production Testing

  1. Deploy to Staging First, Then Production

    # Set environment variables in the Vercel dashboard or via the CLI
    # (use "preview" instead of "production" to validate on a staging/preview deployment first)
    vercel env add AI_GATEWAY_API_KEY production
    vercel env add USE_AI_GATEWAY production
    vercel env add AI_GATEWAY_PROVIDERS production
    
    # Deploy
    vercel --prod
    
  2. Monitor Metrics

    • Response times
    • Error rates
    • User feedback
    • Cost per request
  3. A/B Testing (Optional)

    You could implement tenant-level feature flags to test gateway with a subset of users:

    // Example: Enable gateway for specific tenants
    function shouldUseGatewayForTenant(tenantId: string): boolean {
      const gatewayEnabledTenants = process.env.AI_GATEWAY_TENANT_IDS?.split(',') || [];
      return gatewayEnabledTenants.includes(tenantId);
    }
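
    To keep the two flags composable, a hypothetical helper (not in the current codebase) could combine this tenant check with the provider-level shouldUseGateway from Step 2:

    // Hypothetical: only use the gateway when both the provider flag and the tenant flag allow it
    function shouldUseGatewayFor(provider: string, tenantId: string): boolean {
      return shouldUseGateway(provider) && shouldUseGatewayForTenant(tenantId);
    }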
    

Test Checklist

  • [ ] Message streaming works correctly
  • [ ] Citations/sources are properly retrieved and displayed
  • [ ] All models respond as expected
  • [ ] Custom temperatures are applied (test with GPT-5 temp=1 vs GPT-4o temp=0.3)
  • [ ] Special system prompts work (Llama, Kimi K2, GPT-5)
  • [ ] Error handling works (test with invalid model names)
  • [ ] Response times are acceptable (<2s for first token)
  • [ ] Costs are within expected range
  • [ ] No console errors or warnings
  • [ ] Agentic retrieval still works (if using)
  • [ ] Multi-tenant isolation works correctly

Rollback Plan

Stuff happens, right? So we need a solid rollback plan in case we hit any snags during the integration. This section covers three levels of rollback: an immediate flag flip with no code changes, a partial rollback for a specific provider, and a full rollback that removes the gateway code entirely. It also lists the criteria we'll use to decide when to revert, so the call can be made quickly and decisively if something goes wrong.

If you encounter issues, rollback is simple:

Immediate Rollback (No Code Changes)

# .env.local
USE_AI_GATEWAY=false

Restart the server. All requests will use direct provider calls.

Partial Rollback (Specific Provider)

# Disable gateway for problematic provider
AI_GATEWAY_PROVIDERS=openai,google  # Excludes anthropic and groq

Full Rollback (Remove Gateway Code)

# Restore original generator.ts
git checkout main -- lib/server/conversation-context/generator.ts

# Remove the variables from .env.local (note: `unset` only clears them from the current shell session)
unset AI_GATEWAY_API_KEY
unset USE_AI_GATEWAY
unset AI_GATEWAY_PROVIDERS

# Restart server
npm run dev

Rollback Decision Criteria

Consider rolling back if:

  1. Error Rate Spike: >5% increase in errors
  2. Performance Degradation: >500ms increase in p95 latency
  3. Cost Increase: >20% increase in per-request costs
  4. Functionality Break: Any critical feature stops working
  5. User Complaints: Multiple reports of quality degradation

Cost Considerations

Let's talk money! Cost is a crucial part of any integration, and the AI Gateway is no exception. We need to understand its pricing structure, estimate the impact on our budget, and look at optimization options so the integration is financially sustainable as well as technically sound. This section breaks down the AI Gateway pricing, sketches a cost analysis, and offers practical recommendations for monitoring and reducing spend.

AI Gateway Pricing

As of 2025, Vercel AI Gateway pricing is typically:

  • Free Tier: Limited requests per month (check current limits)
  • Pro Tier: Pay-per-request pricing (typically fractions of a cent per request)
  • Enterprise: Custom pricing with volume discounts

Important: Check Vercel's pricing page for current AI Gateway costs.

Cost Analysis

Scenario: 1 million messages/month

Without AI Gateway:
- Provider costs: $X (your current cost)
- Management overhead: Developer time

With AI Gateway:
- Provider costs: $X (same, gateway passes through)
- Gateway fees: ~$Y per million requests
- Total: $X + $Y

Additional value:
- Reduced developer time (single key management)
- Improved reliability (automatic retries)
- Better monitoring (centralized dashboard)

Recommendations

  1. Start Small: Test with low-volume tenants first
  2. Monitor Closely: Track costs in Vercel dashboard
  3. Set Alerts: Configure spending alerts to prevent surprises
  4. Compare Monthly: Review provider costs vs. gateway fees
  5. Optimize: Use gateway caching features to reduce costs

Cost Optimization Tips

  1. Enable Caching: AI Gateway can cache identical requests

    // Future enhancement (illustrative only): check Vercel's docs for the actual
    // caching configuration; this option shape is an example, not a documented API
    const gateway = createGateway({
      apiKey: process.env.AI_GATEWAY_API_KEY,
      cache: { enabled: true, ttl: 3600 },  // Example
    });
    
  2. Rate Limiting: Prevent runaway costs

  3. Model Selection: Use cheaper models for simple queries (see the sketch after this list)

  4. Batch Requests: Group requests when possible
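
For the model-selection tip, here's a minimal sketch; the pickModelForQuery helper, the length-based heuristic, and the specific model IDs are illustrative assumptions, and the gateway instance comes from Step 2:

// Hypothetical helper (not part of Base Chat): route short, simple queries to a
// cheaper model and keep longer queries on a larger one.
function pickModelForQuery(query: string): string {
  const isSimple = query.length < 200;
  return isSimple
    ? "openai/gpt-4o-mini" // cheaper model for short, simple prompts
    : "openai/gpt-4o";     // larger model for everything else
}

// Usage with the gateway instance from Step 2, falling back to the direct provider:
const userQuery = "Summarize yesterday's release notes in two sentences.";
const gatewayProvider = getGatewayInstance();
const model = gatewayProvider
  ? gatewayProvider(pickModelForQuery(userQuery))
  : openai("gpt-4o");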


Troubleshooting

Okay, let's face it: things can go wrong. This section covers common issues you might hit during the AI Gateway integration and how to solve them, from API key errors to slow responses and custom prompt problems. Treat it as your go-to reference when something isn't working as expected.

Issue: