
TL;DR
This guide demonstrates how to build production-ready RAG applications using proven implementation patterns from real-world deployments. Instead of theoretical examples, we’ll use actual code from the CustomGPT Starter Kit and working examples from the CustomGPT Cookbook.
You’ll learn to implement OpenAI SDK-compatible RAG APIs, build responsive chat interfaces, and deploy scalable applications using battle-tested architectural patterns. All examples are sourced from production implementations currently serving thousands of users.
Building RAG applications with traditional approaches often means starting from scratch, reinventing common patterns, and dealing with complex infrastructure setup. This guide takes a different approach: we’ll use proven, production-ready implementations that you can deploy today.
The CustomGPT platform provides OpenAI SDK compatibility, making it an ideal bridge between traditional LLM APIs and RAG capabilities. This means you can leverage existing OpenAI integration patterns while gaining RAG functionality.
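To make that bridge concrete, here is a minimal sketch of how an OpenAI-style message list can be translated into the single-prompt payload a RAG endpoint of this kind typically expects. The field names (`agentId`, `prompt`, `stream`) and the mapping itself are illustrative assumptions, not the official schema:

```javascript
// Sketch: map an OpenAI-style chat request onto a single-prompt RAG payload.
// Field names here are illustrative assumptions, not the official schema.
function toCustomGPTPayload(openAIRequest, agentId) {
  const messages = openAIRequest.messages || [];
  const last = messages[messages.length - 1];
  // RAG chat endpoints of this style often take only the latest user turn,
  // with history tracked server-side via a conversation id.
  if (!last || last.role !== 'user') {
    throw new Error('Expected the final message to come from the user');
  }
  return {
    agentId,
    prompt: last.content,
    stream: Boolean(openAIRequest.stream),
  };
}

// A familiar OpenAI-shaped request...
const payload = toCustomGPTPayload(
  {
    messages: [
      { role: 'system', content: 'You are helpful.' },
      { role: 'user', content: 'What is RAG?' },
    ],
    stream: true,
  },
  'AGENT_123'
);
// ...maps to { agentId: 'AGENT_123', prompt: 'What is RAG?', stream: true }
```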
Real-World Implementation Foundation
Rather than building from theoretical examples, we’ll use the CustomGPT Starter Kit—a production-ready implementation that includes:
- Complete chat interfaces with conversation management
- API proxy architecture with security best practices
- Widget and iframe deployment options
- Voice chat integration
- Progressive Web App (PWA) support
You can see this implementation running live at https://starterkit.customgpt.ai/.
Setting Up Your Development Environment
Prerequisites and Quick Start
Clone the actual starter kit repository to begin with a working implementation:
# Clone the production-ready starter kit
git clone https://github.com/Poll-The-People/customgpt-starter-kit.git
cd customgpt-starter-kit
# Install dependencies
npm install
# Copy environment configuration
cp .env.example .env.local
Environment Configuration
The starter kit uses a proven environment setup pattern. Edit .env.local:
# Required: Your CustomGPT API key
CUSTOMGPT_API_KEY=your_api_key_here
# Optional: For voice features
OPENAI_API_KEY=your_openai_key_here
# Optional: Custom base URL if using self-hosted
CUSTOMGPT_API_BASE_URL=https://app.customgpt.ai/api/v1
Get your API key from the CustomGPT Dashboard. The platform offers a 7-day free trial that's well suited to development.
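Because a missing key tends to surface later as a confusing runtime error, it can help to validate configuration once at startup. A minimal sketch, using the `.env.local` keys above (the helper itself is illustrative, not part of the starter kit):

```javascript
// Sketch: fail fast at startup when required configuration is missing.
// The key names match .env.local above; the helper is illustrative.
function validateEnv(env) {
  const required = ['CUSTOMGPT_API_KEY'];
  const optional = ['OPENAI_API_KEY', 'CUSTOMGPT_API_BASE_URL'];

  const missing = required.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
  // Warn (rather than fail) for optional features such as voice chat.
  for (const key of optional) {
    if (!env[key]) console.warn(`${key} not set; related features will be disabled`);
  }
  return true;
}

// Example: call once during server startup, e.g. validateEnv(process.env)
validateEnv({ CUSTOMGPT_API_KEY: 'sk-example' }); // returns true
```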
Development Server
# Start development server with hot reload
npm run dev
# The app will be available at http://localhost:3000
Core API Integration Patterns
The starter kit demonstrates proven API integration patterns. Let’s examine the actual implementation:
OpenAI SDK Compatibility
CustomGPT’s OpenAI compatibility means you can use existing OpenAI client libraries. Here’s the actual pattern from the starter kit:
// From src/lib/api/proxy-client.ts (actual file)
class CustomGPTClient {
constructor(config) {
this.baseURL = config.baseURL || '/api/proxy';
this.apiKey = config.apiKey; // Handled server-side in proxy
}
async createChatCompletion(params) {
const response = await fetch(`${this.baseURL}/conversations`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({
agentId: params.agentId,
prompt: params.messages[params.messages.length - 1].content,
stream: params.stream || false,
conversation_id: params.conversationId
})
});
if (!response.ok) {
throw new Error(`API request failed: ${response.status}`);
}
return response.json();
}
}
This pattern is based on the actual implementation used in production applications. The key insight is using a proxy layer to handle authentication securely.
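Since the client throws on any non-OK response, production callers often wrap calls like `createChatCompletion` in a retry with exponential backoff for transient failures. A minimal sketch with illustrative defaults (not part of the starter kit):

```javascript
// Sketch: retry a flaky async call with exponential backoff.
// Retry count and delays are illustrative defaults.
async function withRetry(fn, { retries = 3, baseDelayMs = 200 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt === retries) break;
      // Exponential backoff: 200ms, 400ms, 800ms, ...
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}

// Usage: withRetry(() => client.createChatCompletion(params))
```

In practice you may only want to retry network errors and 5xx/429 responses, not 4xx client errors, so the caught error is worth inspecting before retrying.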
Server-Side API Proxy
The starter kit implements a comprehensive API proxy pattern. Here’s the actual Next.js API route structure:
// From src/app/api/proxy/[...path]/route.ts (actual file structure)
export async function POST(request, { params }) {
const path = params.path.join('/');
const body = await request.json();
// Server-side authentication - API key never exposed to client
const response = await fetch(`https://app.customgpt.ai/api/v1/${path}`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.CUSTOMGPT_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify(body)
});
return Response.json(await response.json());
}
This proxy pattern ensures API keys remain secure while providing a clean client-side interface.
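One hardening step worth adding to a catch-all route like this is an allowlist of forwardable paths, so the proxy cannot be used to reach arbitrary upstream endpoints with your server-side key. A sketch (the prefix list is an assumption; adjust it to the endpoints your app actually uses):

```javascript
// Sketch: restrict which upstream paths the proxy will forward.
// The prefix list is illustrative, not from the starter kit.
const ALLOWED_PREFIXES = ['projects', 'conversations'];

function isAllowedProxyPath(segments) {
  // `segments` is the params.path array from the catch-all route,
  // e.g. ['conversations'] or ['projects', '123', 'conversations'].
  if (!Array.isArray(segments) || segments.length === 0) return false;
  // Reject empty segments and path traversal outright.
  if (segments.some((s) => s === '' || s === '.' || s === '..' || s.includes('/'))) {
    return false;
  }
  return ALLOWED_PREFIXES.includes(segments[0]);
}
```

In the route handler, a failed check would return a 403 before any upstream request is made.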
Working with Real Examples
The CustomGPT Cookbook contains production-tested Jupyter notebooks demonstrating real API usage patterns.
- Creating Conversations: Instead of theoretical examples, let’s look at the actual implementation from the cookbook: Create a new conversation and send a message – This notebook demonstrates the complete flow of creating conversations and handling responses.
- SDK Implementation: The cookbook also includes SDK-based examples: SDK Create conversation example
- Citation Handling: Real citation implementation from the cookbook: Get citation details and SDK version
Building Chat Interfaces
The starter kit includes multiple proven chat interface implementations:
Embedded Widget Implementation
From the actual examples/widget-example.html:
<!DOCTYPE html>
<html>
<head>
<title>CustomGPT Widget Integration</title>
</head>
<body>
<!-- Widget container -->
<div id="chat-container" style="height: 600px;"></div>
<!-- Load the actual widget script -->
<script src="/dist/widget/customgpt-widget.js"></script>
<script>
// Initialize widget with actual configuration
const widget = CustomGPTWidget.init({
agentId: 'YOUR_AGENT_ID',
containerId: 'chat-container',
mode: 'embedded',
theme: 'light',
enableCitations: true,
enableFeedback: true
});
</script>
</body>
</html>
React Integration Pattern
From examples/react-integration.jsx in the starter kit:
import React, { useState, useEffect } from 'react';
const CustomGPTChat = ({ agentId, apiKey }) => {
const [messages, setMessages] = useState([]);
const [input, setInput] = useState('');
const [loading, setLoading] = useState(false);
const sendMessage = async () => {
if (!input.trim()) return;
setLoading(true);
try {
const response = await fetch('/api/proxy/conversations', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: agentId,
prompt: input
})
});
const data = await response.json();
setMessages(prev => [...prev,
{ role: 'user', content: input },
{ role: 'assistant', content: data.response, citations: data.citations }
]);
} catch (error) {
console.error('Error:', error);
} finally {
setLoading(false);
setInput('');
}
};
return (
<div className="chat-container">
<div className="messages">
{messages.map((msg, idx) => (
<div key={idx} className={`message ${msg.role}`}>
<div className="content">{msg.content}</div>
{msg.citations && (
<div className="citations">
{msg.citations.map((cite, i) => (
<a key={i} href={cite.url} target="_blank" rel="noopener noreferrer">
[{i + 1}] {cite.title}
</a>
))}
</div>
)}
</div>
))}
</div>
<div className="input-area">
<input
value={input}
onChange={(e) => setInput(e.target.value)}
onKeyDown={(e) => e.key === 'Enter' && sendMessage()}
disabled={loading}
/>
<button onClick={sendMessage} disabled={loading}>
{loading ? 'Sending...' : 'Send'}
</button>
</div>
</div>
);
};
export default CustomGPTChat;
Advanced Features Implementation
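The chat component above reads the full response in one go. When `stream: true` is set, responses typically arrive as server-sent events that must be accumulated incrementally. A minimal parsing sketch, assuming an OpenAI-style stream of `data:` lines with delta chunks and a `[DONE]` sentinel (check the API's streaming docs for the real wire format):

```javascript
// Sketch: accumulate streamed response text from server-sent-event lines.
// The exact wire format is an assumption borrowed from OpenAI-style streams.
function parseSSELine(line) {
  if (!line.startsWith('data: ')) return null;   // skip comments and blanks
  const data = line.slice('data: '.length).trim();
  if (data === '[DONE]') return null;            // end-of-stream sentinel
  try {
    const event = JSON.parse(data);
    return event.choices?.[0]?.delta?.content ?? null;
  } catch {
    return null;                                 // skip malformed chunks
  }
}

function accumulateStream(lines) {
  let text = '';
  for (const line of lines) {
    const chunk = parseSSELine(line);
    if (chunk) text += chunk;
  }
  return text;
}
```

In a React component you would feed each decoded line into `parseSSELine` as it arrives and append the chunks to state, so the message renders progressively.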
Voice Chat Integration
The starter kit includes working voice chat implementation. From the actual codebase:
// Voice features require OpenAI API key for speech-to-text
// Implementation in src/components/chat/VoiceInput.tsx
const VoiceInput = ({ onTranscript }) => {
const [recording, setRecording] = useState(false);
// MediaRecorder capture and the matching stopRecording handler are
// elided in this excerpt; see the full component in the starter kit.
const startRecording = async () => {
setRecording(true);
// Actual implementation uses Web Speech API + OpenAI Whisper
const response = await fetch('/api/proxy/voice/transcribe', {
method: 'POST',
body: audioData // audio blob captured by the recorder (elided above)
});
const { transcript } = await response.json();
onTranscript(transcript);
};
return (
<button
onClick={recording ? stopRecording : startRecording}
className={`voice-button ${recording ? 'recording' : ''}`}
>
{recording ? '⏸️' : '🎤'}
</button>
);
};
Progressive Web App Support
The starter kit includes full PWA implementation with actual manifest and service worker files:
// From public/manifest.json (actual file)
{
"name": "CustomGPT Developer Starter Kit",
"short_name": "CustomGPT",
"description": "A modern chat interface for CustomGPT.ai's RAG API",
"start_url": "/",
"display": "standalone",
"background_color": "#ffffff",
"theme_color": "#6366f1",
"icons": [
{
"src": "/icons/icon-192x192.png",
"sizes": "192x192",
"type": "image/png"
}
]
}
Production Deployment
Docker Deployment
The starter kit includes production-ready Docker configuration:
# From Dockerfile (actual file)
FROM node:18-alpine AS base
WORKDIR /app
# Install dependencies
COPY package*.json ./
RUN npm ci --only=production
# Build stage
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Production stage
FROM node:18-alpine AS runner
WORKDIR /app
# Copy built application
COPY --from=builder /app/.next/standalone ./
COPY --from=builder /app/.next/static ./.next/static
COPY --from=builder /app/public ./public
EXPOSE 3000
CMD ["node", "server.js"]
Vercel Deployment
The starter kit includes one-click Vercel deployment:
https://vercel.com/new/clone?repository-url=https://github.com/Poll-The-People/customgpt-starter-kit
Simply click deploy and add your CUSTOMGPT_API_KEY environment variable.
Analytics and Monitoring
Project Statistics
Use the actual analytics implementation from the cookbook: Get Project Stats and SDK version
Real-time Monitoring
The starter kit includes performance monitoring patterns:
// From src/lib/monitoring/performance.ts
export const trackAPICall = async (operation, duration, success) => {
// Send metrics to your monitoring service
await fetch('/api/metrics', {
method: 'POST',
body: JSON.stringify({
operation,
duration,
success,
timestamp: Date.now()
})
});
};
Integration Examples
Iframe Embedding
From examples/iframe-embed-example.html:
<iframe
src="https://your-domain.com/widget/"
width="100%"
height="600px"
frameborder="0">
</iframe>
<script>
// PostMessage API for iframe communication
window.addEventListener('message', (event) => {
// Verify the sender's origin before trusting the payload
if (event.origin !== 'https://your-domain.com') return;
if (event.data.type === 'CUSTOMGPT_MESSAGE') {
console.log('Received message:', event.data.content);
}
});
</script>
Widget Floating Button
From examples/react-floating-button.jsx:
const FloatingChatButton = () => {
const [isOpen, setIsOpen] = useState(false);
return (
<>
<button
className="floating-chat-button"
onClick={() => setIsOpen(true)}
style={{
position: 'fixed',
bottom: '20px',
right: '20px',
zIndex: 1000
}}
>
💬
</button>
{isOpen && (
<div className="floating-chat-widget">
<CustomGPTChat onClose={() => setIsOpen(false)} />
</div>
)}
</>
);
};
Performance Optimization
Caching Implementation
The starter kit demonstrates proper caching patterns:
// From src/lib/cache/conversation-cache.ts
export const getCachedResponse = async (query, agentId) => {
// btoa alone throws on characters outside Latin-1, so encode first
const cacheKey = `${agentId}:${btoa(encodeURIComponent(query))}`;
const cached = localStorage.getItem(cacheKey);
if (cached) {
const { response, timestamp } = JSON.parse(cached);
// Cache for 1 hour
if (Date.now() - timestamp < 3600000) {
return response;
}
}
return null;
};
Testing and Quality Assurance
Integration Tests
The starter kit includes comprehensive test patterns:
// From tests/integration/api.test.js
describe('CustomGPT API Integration', () => {
test('should create conversation and get response', async () => {
const response = await fetch('/api/proxy/conversations', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: process.env.TEST_AGENT_ID,
prompt: 'Hello, this is a test message'
})
});
expect(response.ok).toBe(true);
const data = await response.json();
expect(data.response).toBeDefined();
expect(data.citations).toBeInstanceOf(Array);
});
});
Learning Resources
Cookbook Examples
Explore all working examples in the CustomGPT Cookbook.
Starter Kit Documentation
The starter kit repository includes comprehensive documentation for:
- Multiple deployment modes (standalone, widget, iframe)
- API proxy architecture
- Security best practices
- Performance optimization
- Production deployment guides
Live Demo
See the complete implementation running at https://starterkit.customgpt.ai/.
For more RAG API related information:
- CustomGPT.ai’s open-source UI starter kit (custom chat screens, an embeddable chat window, and a floating website chatbot) with 9 social AI integration bots and related setup tutorials.
- Find our API sample usage code snippets here.
- Our RAG API’s hosted Postman collection – test the APIs in Postman with just one click.
- Our Developer API documentation.
- API explainer videos on YouTube and a dev focused playlist.
- Join our bi-weekly developer office hours and our past recordings of the Dev Office Hours.
P.S. Our API endpoints are OpenAI-compatible: just replace the API key and endpoint, and any OpenAI-compatible project works with your RAG data. Find more here.
Want to build something with our hosted MCPs? Check out the docs.
Frequently Asked Questions
Can you reuse existing OpenAI integration patterns when building a RAG app?
Yes. You can use an OpenAI SDK-compatible RAG API approach, which lets you keep familiar OpenAI-style integration patterns while adding retrieval-augmented capabilities.
What’s a practical way to start building a production-ready RAG application?
A practical starting point is to use production-ready implementation assets instead of building everything from scratch. The guide is based on working code examples designed for deployable RAG applications.
Do you need to build a RAG system from scratch to get started?
No. A faster path is to begin with a production-ready starter implementation and adapt it to your use case, rather than reinventing common RAG patterns and infrastructure.
Should you use theoretical examples or production-tested examples when learning RAG implementation?
Production-tested examples are generally more useful for deployment. The implementation guidance is based on real-world patterns and working examples rather than purely theoretical demos.
Can this approach support applications that need to scale beyond a small prototype?
Yes. The implementation focus is on scalable application patterns and battle-tested architecture, which is intended for production usage rather than one-off demos.
How can you reduce risk before launching a RAG app to real users?
Use implementation patterns that have already been proven in real deployments and validate your setup with working code examples before broad rollout. This reduces avoidable implementation risk compared with purely theoretical builds.
When deciding between a DIY RAG build and an OpenAI-compatible RAG platform, what’s the key tradeoff?
The core tradeoff is speed versus full infrastructure ownership. A DIY approach can offer maximum control, while an OpenAI-compatible RAG platform can accelerate delivery by reusing existing integration patterns and reducing setup complexity.
Priyansh is a Developer Relations Advocate who loves technology, writes about it, and creates deeply researched content.