CustomGPT.ai Blog

Building RAG Applications with OpenAI API: Step-by-Step Implementation Guide

TL;DR

This guide demonstrates how to build production-ready RAG applications using proven implementation patterns from real-world deployments. Instead of theoretical examples, we’ll use actual code from the CustomGPT Starter Kit and working examples from the CustomGPT Cookbook.

You’ll learn to implement OpenAI SDK-compatible RAG APIs, build responsive chat interfaces, and deploy scalable applications using battle-tested architectural patterns. All examples are sourced from production implementations currently serving thousands of users.

Building RAG applications with traditional approaches often means starting from scratch, reinventing common patterns, and dealing with complex infrastructure setup. This guide takes a different approach: we’ll use proven, production-ready implementations that you can deploy today.

The CustomGPT platform provides OpenAI SDK compatibility, making it an ideal bridge between traditional LLM APIs and RAG capabilities. This means you can leverage existing OpenAI integration patterns while gaining RAG functionality.
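In practice, "OpenAI SDK compatibility" means only the client configuration changes while the call pattern stays the same. The sketch below is illustrative, not the definitive setup: the base URL is a placeholder, so use the endpoint from CustomGPT's API documentation.

```javascript
// Illustrative sketch: with an OpenAI-compatible API, only the client
// options change. The base URL below is a placeholder, not the real endpoint.
function openAICompatibleOptions(customGptApiKey, baseURL) {
  return {
    apiKey: customGptApiKey, // a CustomGPT key, not an OpenAI key
    baseURL,                 // the OpenAI-compatible endpoint from the docs
  };
}

// With the official `openai` npm package this would be used as:
//   const client = new OpenAI(openAICompatibleOptions(key, url));
//   const reply = await client.chat.completions.create({ model, messages });
```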

Real-World Implementation Foundation

Rather than building from theoretical examples, we’ll use the CustomGPT Starter Kit—a production-ready implementation that includes:

  • Complete chat interfaces with conversation management
  • API proxy architecture with security best practices
  • Widget and iframe deployment options
  • Voice chat integration
  • Progressive Web App (PWA) support

You can see this implementation running live at https://starterkit.customgpt.ai/.

Setting Up Your Development Environment

Prerequisites and Quick Start

Clone the actual starter kit repository to begin with a working implementation:

# Clone the production-ready starter kit
git clone https://github.com/Poll-The-People/customgpt-starter-kit.git
cd customgpt-starter-kit

# Install dependencies
npm install

# Copy environment configuration
cp .env.example .env.local

Environment Configuration

The starter kit uses a proven environment setup pattern. Edit .env.local:

# Required: Your CustomGPT API key
CUSTOMGPT_API_KEY=your_api_key_here

# Optional: For voice features
OPENAI_API_KEY=your_openai_key_here

# Optional: Custom base URL if using self-hosted
CUSTOMGPT_API_BASE_URL=https://app.customgpt.ai/api/v1

Get your API key from the CustomGPT Dashboard. The platform offers a 7-day free trial, which is perfect for development.

Development Server

# Start development server with hot reload
npm run dev

# The app will be available at http://localhost:3000

Core API Integration Patterns

The starter kit demonstrates proven API integration patterns. Let’s examine the actual implementation:

OpenAI SDK Compatibility

CustomGPT’s OpenAI compatibility means you can use existing OpenAI client libraries. Here’s the actual pattern from the starter kit:

// From src/lib/api/proxy-client.ts (actual file)

class CustomGPTClient {
  constructor(config) {
    this.baseURL = config.baseURL || '/api/proxy';
    this.apiKey = config.apiKey; // Handled server-side in proxy
  }
  async createChatCompletion(params) {
    const response = await fetch(`${this.baseURL}/conversations`, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        agentId: params.agentId,
        prompt: params.messages[params.messages.length - 1].content, // latest user turn; server keeps history
        stream: params.stream || false,
        conversation_id: params.conversationId
      })
    });
    if (!response.ok) {
      throw new Error(`API request failed: ${response.status}`);
    }
    return response.json();
  }
}

This pattern is based on the actual implementation used in production applications. The key insight is using a proxy layer to handle authentication securely.
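A detail worth calling out in that snippet: an OpenAI-style `messages` array has to be adapted to the proxy's body shape, because only the latest user turn is sent and the server tracks history via `conversation_id`. A small adapter, with field names mirroring the code above, might look like this:

```javascript
// Adapter sketch: OpenAI-style chat params -> starter kit proxy body.
// Field names mirror the createChatCompletion snippet above.
function toProxyBody(params) {
  const lastMessage = params.messages[params.messages.length - 1];
  return {
    agentId: params.agentId,
    prompt: lastMessage.content,            // only the latest turn is sent
    stream: params.stream || false,
    conversation_id: params.conversationId, // server-side history lookup
  };
}
```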

Server-Side API Proxy

The starter kit implements a comprehensive API proxy pattern. Here’s the actual Next.js API route structure:

// From src/app/api/proxy/[...path]/route.ts (actual file structure)
export async function POST(request, { params }) {
  const path = params.path.join('/');
  const body = await request.json();
  
  // Server-side authentication - API key never exposed to client
  const response = await fetch(`https://app.customgpt.ai/api/v1/${path}`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.CUSTOMGPT_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(body)
  });
  
  if (!response.ok) {
    // Surface upstream failures instead of returning a misleading 200
    return Response.json(
      { error: `Upstream request failed: ${response.status}` },
      { status: response.status }
    );
  }

  return Response.json(await response.json());
}

This proxy pattern ensures API keys remain secure while providing a clean client-side interface.
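When `stream: true` is set, the upstream response typically arrives as server-sent events rather than one JSON body. The starter kit has its own streaming handler; as a rough sketch (the JSON payload shape here is an assumption, only the `data:` framing is standard SSE), each chunk can be parsed like this:

```javascript
// Sketch: extracting JSON payloads from a server-sent-events chunk.
// The `data:` line framing is standard SSE; the payload shape is assumed.
function parseSSEChunk(chunk) {
  return chunk
    .split('\n')
    .filter((line) => line.startsWith('data: '))
    .map((line) => line.slice('data: '.length).trim())
    .filter((payload) => payload && payload !== '[DONE]') // skip end sentinel
    .map((payload) => JSON.parse(payload));
}
```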

Working with Real Examples

The CustomGPT Cookbook contains production-tested Jupyter notebooks demonstrating real API usage patterns.

Building Chat Interfaces

The starter kit includes multiple proven chat interface implementations:

Embedded Widget Implementation

From the actual examples/widget-example.html:

<!DOCTYPE html>
<html>
<head>
    <title>CustomGPT Widget Integration</title>
</head>
<body>
    <!-- Widget container -->
    <div id="chat-container" style="height: 600px;"></div>
    
    <!-- Load the actual widget script -->
    <script src="/dist/widget/customgpt-widget.js"></script>
    
    <script>
        // Initialize widget with actual configuration
        const widget = CustomGPTWidget.init({
            agentId: 'YOUR_AGENT_ID',
            containerId: 'chat-container',
            mode: 'embedded',
            theme: 'light',
            enableCitations: true,
            enableFeedback: true
        });
    </script>
</body>
</html>

React Integration Pattern

From examples/react-integration.jsx in the starter kit:

import React, { useState } from 'react';

// The API key stays server-side in the proxy route, so it is not a prop here
const CustomGPTChat = ({ agentId }) => {
  const [messages, setMessages] = useState([]);
  const [input, setInput] = useState('');
  const [loading, setLoading] = useState(false);

  const sendMessage = async () => {
    if (!input.trim()) return;
    
    setLoading(true);
    
    try {
      const response = await fetch('/api/proxy/conversations', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          agentId: agentId,
          prompt: input
        })
      });
      
      const data = await response.json();
      
      setMessages(prev => [...prev, 
        { role: 'user', content: input },
        { role: 'assistant', content: data.response, citations: data.citations }
      ]);
      
    } catch (error) {
      console.error('Error:', error);
    } finally {
      setLoading(false);
      setInput('');
    }
  };

  return (
    <div className="chat-container">
      <div className="messages">
        {messages.map((msg, idx) => (
          <div key={idx} className={`message ${msg.role}`}>
            <div className="content">{msg.content}</div>
            {msg.citations && (
              <div className="citations">
                {msg.citations.map((cite, i) => (
                  <a key={i} href={cite.url} target="_blank" rel="noopener noreferrer">
                    [{i + 1}] {cite.title}
                  </a>
                ))}
              </div>
            )}
          </div>
        ))}
      </div>
      
      <div className="input-area">
        <input 
          value={input}
          onChange={(e) => setInput(e.target.value)}
          onKeyDown={(e) => e.key === 'Enter' && sendMessage()}
          disabled={loading}
        />
        <button onClick={sendMessage} disabled={loading}>
          {loading ? 'Sending...' : 'Send'}
        </button>
      </div>
    </div>
  );
};

export default CustomGPTChat;

Advanced Features Implementation

Voice Chat Integration

The starter kit includes working voice chat implementation. From the actual codebase:

// Voice features require OpenAI API key for speech-to-text
// Implementation in src/components/chat/VoiceInput.tsx
const VoiceInput = ({ onTranscript }) => {
  const [recording, setRecording] = useState(false);
  
  const startRecording = () => {
    setRecording(true);
    // Actual implementation captures audio via the Web Speech API / MediaRecorder
  };

  const stopRecording = async () => {
    setRecording(false);
    // `audioData` is the Blob captured by the recorder (capture code elided
    // in this excerpt); it is transcribed server-side via OpenAI Whisper
    const response = await fetch('/api/proxy/voice/transcribe', {
      method: 'POST',
      body: audioData
    });

    const { transcript } = await response.json();
    onTranscript(transcript);
  };

  return (
    <button 
      onClick={recording ? stopRecording : startRecording}
      className={`voice-button ${recording ? 'recording' : ''}`}
    >
      {recording ? '⏸️' : '🎤'}
    </button>
  );
};

Progressive Web App Support

The starter kit includes full PWA implementation with actual manifest and service worker files:

// From public/manifest.json (actual file)
{
  "name": "CustomGPT Developer Starter Kit",
  "short_name": "CustomGPT",
  "description": "A modern chat interface for CustomGPT.ai's RAG API",
  "start_url": "/",
  "display": "standalone",
  "background_color": "#ffffff",
  "theme_color": "#6366f1",
  "icons": [
    {
      "src": "/icons/icon-192x192.png",
      "sizes": "192x192",
      "type": "image/png"
    }
  ]
}
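A manifest like this is usually paired with a service worker registration in the app shell. The sketch below assumes the conventional `/sw.js` path, which may differ in the starter kit; check its source for the actual file name.

```javascript
// Registration sketch for the service worker that typically accompanies
// a PWA manifest. The '/sw.js' path is an assumption, not the starter
// kit's confirmed file name.
function registerServiceWorker(path = '/sw.js') {
  if (typeof navigator === 'undefined' || !('serviceWorker' in navigator)) {
    return null; // not running in a browser with service worker support
  }
  return navigator.serviceWorker.register(path);
}
```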

Production Deployment

Docker Deployment

The starter kit includes production-ready Docker configuration:

# From Dockerfile (actual file)
# Build stage
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Production stage
FROM node:18-alpine AS runner
WORKDIR /app

# Copy built application
COPY --from=builder /app/.next/standalone ./
COPY --from=builder /app/.next/static ./.next/static
COPY --from=builder /app/public ./public

EXPOSE 3000
CMD ["node", "server.js"]

Vercel Deployment

The starter kit includes one-click Vercel deployment:

https://vercel.com/new/clone?repository-url=https://github.com/Poll-The-People/customgpt-starter-kit

Simply click deploy and add your CUSTOMGPT_API_KEY environment variable.

Analytics and Monitoring

Project Statistics

Use the analytics examples from the cookbook, such as the Get Project Stats and SDK version notebooks.

Real-time Monitoring

The starter kit includes performance monitoring patterns:

// From src/lib/monitoring/performance.ts
export const trackAPICall = async (operation, duration, success) => {
  // Send metrics to your monitoring service
  await fetch('/api/metrics', {
    method: 'POST',
    body: JSON.stringify({
      operation,
      duration,
      success,
      timestamp: Date.now()
    })
  });
};
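A convenient way to use a hook like this is a wrapper that times any async operation. In this sketch the `track` callback is injected, so it could be the `trackAPICall` above or a stub in tests; the wrapper itself is illustrative rather than starter kit code.

```javascript
// Usage sketch: time an async operation and report it via a
// trackAPICall-style callback (operation name, duration ms, success flag).
async function withTiming(operation, fn, track) {
  const start = Date.now();
  let success = false;
  try {
    const result = await fn();
    success = true;
    return result;
  } finally {
    // Report even when fn() throws; the error still propagates to the caller
    await track(operation, Date.now() - start, success);
  }
}
```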

Integration Examples

Iframe Embedding

From examples/iframe-embed-example.html:

<iframe 
  src="https://your-domain.com/widget/" 
  width="100%" 
  height="600px"
  frameborder="0">
</iframe>

<script>
  // PostMessage API for iframe communication
  window.addEventListener('message', (event) => {
    if (event.data.type === 'CUSTOMGPT_MESSAGE') {
      console.log('Received message:', event.data.content);
    }
  });
</script>
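The listener above expects messages shaped as `{ type: 'CUSTOMGPT_MESSAGE', content }`. The widget (child-frame) side of that handshake might look like the sketch below; `targetWindow` is injected (it would be `window.parent` inside the iframe), and in production you should pass the embedding page's origin rather than `'*'`.

```javascript
// Child-frame side of the PostMessage handshake: send a message the
// parent listener above will accept. Names here are illustrative.
function notifyParent(targetWindow, content, targetOrigin) {
  const message = { type: 'CUSTOMGPT_MESSAGE', content };
  targetWindow.postMessage(message, targetOrigin); // scoped to targetOrigin
  return message;
}
```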

Widget Floating Button

From examples/react-floating-button.jsx:

const FloatingChatButton = () => {
  const [isOpen, setIsOpen] = useState(false);
  
  return (
    <>
      <button 
        className="floating-chat-button"
        onClick={() => setIsOpen(true)}
        style={{
          position: 'fixed',
          bottom: '20px',
          right: '20px',
          zIndex: 1000
        }}
      >
        💬
      </button>
      
      {isOpen && (
        <div className="floating-chat-widget">
          <CustomGPTChat onClose={() => setIsOpen(false)} />
        </div>
      )}
    </>
  );
};

Performance Optimization

Caching Implementation

The starter kit demonstrates proper caching patterns:

// From src/lib/cache/conversation-cache.ts
export const getCachedResponse = async (query, agentId) => {
  const cacheKey = `${agentId}:${btoa(query)}`;
  const cached = localStorage.getItem(cacheKey);
  
  if (cached) {
    const { response, timestamp } = JSON.parse(cached);
    // Cache for 1 hour
    if (Date.now() - timestamp < 3600000) {
      return response;
    }
  }
  
  return null;
};
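`getCachedResponse` only reads; for the 1-hour TTL check to work, the write path has to stamp each entry with a timestamp. A companion sketch (not from the starter kit) is below; `storage` is injected so it also runs outside the browser, where the starter kit would pass `localStorage`.

```javascript
// Companion sketch to getCachedResponse: store a response with the
// timestamp the TTL check relies on. Note that btoa throws on characters
// outside Latin-1, so a safer key encoding may be needed long-term.
function setCachedResponse(storage, query, agentId, response) {
  const cacheKey = `${agentId}:${btoa(query)}`; // same key scheme as the read path
  storage.setItem(cacheKey, JSON.stringify({ response, timestamp: Date.now() }));
  return cacheKey;
}
```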

Testing and Quality Assurance

Integration Tests

The starter kit includes comprehensive test patterns:

// From tests/integration/api.test.js
describe('CustomGPT API Integration', () => {
  test('should create conversation and get response', async () => {
    const response = await fetch('/api/proxy/conversations', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        agentId: process.env.TEST_AGENT_ID,
        prompt: 'Hello, this is a test message'
      })
    });
    
    expect(response.ok).toBe(true);
    const data = await response.json();
    expect(data.response).toBeDefined();
    expect(data.citations).toBeInstanceOf(Array);
  });
});

Learning Resources

Cookbook Examples

Explore all the working examples in the CustomGPT Cookbook repository.

Starter Kit Documentation

The starter kit repository includes comprehensive documentation for:

  • Multiple deployment modes (standalone, widget, iframe)
  • API proxy architecture
  • Security best practices
  • Performance optimization
  • Production deployment guides

Live Demo

See the complete implementation running at https://starterkit.customgpt.ai/

For more RAG API related information:

  1. CustomGPT.ai’s open-source UI starter kit (custom chat screens, an embeddable chat window, and a floating website chatbot) with 9 social AI integration bots and related setup tutorials
  2. Find our API sample usage code snippets here
  3. Our RAG API’s Postman-hosted collection – test the APIs in Postman with just one click
  4. Our Developer API documentation
  5. API explainer videos on YouTube, including a dev-focused playlist
  6. Join our bi-weekly developer office hours, or catch up on past recordings of the Dev Office Hours

P.S. Our API endpoints are OpenAI-compatible: just replace the API key and endpoint, and any OpenAI-compatible project works with your RAG data. Find more here

Want to try our Hosted MCPs? Check out the docs.

Frequently Asked Questions

Can you reuse existing OpenAI integration patterns when building a RAG app?

Yes. You can use an OpenAI SDK-compatible RAG API approach, which lets you keep familiar OpenAI-style integration patterns while adding retrieval-augmented capabilities.

What’s a practical way to start building a production-ready RAG application?

A practical starting point is to use production-ready implementation assets instead of building everything from scratch. The guide is based on working code examples designed for deployable RAG applications.

Do you need to build a RAG system from scratch to get started?

No. A faster path is to begin with a production-ready starter implementation and adapt it to your use case, rather than reinventing common RAG patterns and infrastructure.

Should you use theoretical examples or production-tested examples when learning RAG implementation?

Production-tested examples are generally more useful for deployment. The implementation guidance is based on real-world patterns and working examples rather than purely theoretical demos.

Can this approach support applications that need to scale beyond a small prototype?

Yes. The implementation focus is on scalable application patterns and battle-tested architecture, which is intended for production usage rather than one-off demos.

How can you reduce risk before launching a RAG app to real users?

Use implementation patterns that have already been proven in real deployments and validate your setup with working code examples before broad rollout. This reduces avoidable implementation risk compared with purely theoretical builds.

When deciding between a DIY RAG build and an OpenAI-compatible RAG platform, what’s the key tradeoff?

The core tradeoff is speed versus full infrastructure ownership. A DIY approach can offer maximum control, while an OpenAI-compatible RAG platform can accelerate delivery by reusing existing integration patterns and reducing setup complexity.
