CustomGPT.ai Blog

Getting Started with RAG APIs: A Practical Developer Tutorial

Written by: Priyansh Khodiyar

TL;DR

This hands-on tutorial walks you through getting started with RAG APIs and building your first RAG-powered application from scratch. You’ll learn to set up a RAG API, upload your data sources, create a functional chat interface, and deploy a working application—all in under 30 minutes.

We’ll use CustomGPT as our RAG provider because it offers the developer tooling this tutorial relies on: a REST API, a Postman collection, and an open-source starter kit.

By the end, you’ll have a fully functional AI assistant that can answer questions using your specific knowledge base, complete with citations and streaming responses. Perfect for developers who learn best by building real applications.

The best way to understand RAG APIs is to build something real. This tutorial will take you from zero to a working RAG application in about 30 minutes, giving you hands-on experience with the concepts, tools, and patterns you’ll use in production applications.

We’ll build a customer support assistant that can answer questions about your product documentation. The same principles apply whether you’re creating internal knowledge tools, educational assistants, or any application that needs to provide accurate, source-backed responses.

Prerequisites and Setup

Before we start coding, let’s get your development environment ready. You’ll need:

Required Tools

  • Node.js 18+ or Python 3.8+ (choose your preferred language)
  • A text editor or IDE
  • Terminal/command line access
  • A CustomGPT account (free trial available)

Optional but Recommended

  • Git for version control
  • Postman or similar API testing tool
  • Basic knowledge of REST APIs and JSON

Get Your API Credentials

First, register for a CustomGPT account. The platform offers a 7-day free trial, which is enough for this tutorial. Once registered, navigate to your dashboard and note your API key; you’ll need it throughout this tutorial.

For testing API calls directly, you can also grab the Postman collection which includes all the endpoints we’ll be using.
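
To keep the key out of source files, one option is to read it from an environment variable and build the request headers in a single place. The sketch below assumes the key is stored in `CUSTOMGPT_API_KEY`, the variable name used in the examples that follow; `buildAuthHeaders` is a hypothetical helper name, not part of any SDK.

```javascript
// Hypothetical helper: builds the headers used by the JSON requests in this tutorial.
// Assumes the API key lives in the CUSTOMGPT_API_KEY environment variable.
function buildAuthHeaders(apiKey = process.env.CUSTOMGPT_API_KEY) {
  if (!apiKey || apiKey.trim() === '') {
    throw new Error('CUSTOMGPT_API_KEY is not set');
  }
  return {
    Authorization: `Bearer ${apiKey.trim()}`,
    'Content-Type': 'application/json',
  };
}

// Usage: pass the result as the `headers` option of fetch().
// const headers = buildAuthHeaders();
```

Failing early with a clear message is usually easier to debug than a 401 response caused by a header that reads `Bearer undefined`.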

Step 1: Creating Your First RAG Agent

A RAG agent (sometimes called a project) is the core container for your knowledge base and AI assistant. Let’s create one programmatically to understand the underlying API structure.

Using cURL (works in any terminal):

curl -X POST "https://app.customgpt.ai/api/v1/projects" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "project_name": "Customer Support Assistant",
    "prompt": "You are a helpful customer support assistant. Always provide accurate information based on the knowledge base and include source citations.",
    "base_prompt": "Answer questions using the provided knowledge base. Be helpful, accurate, and professional."
  }'

Using JavaScript/Node.js:

// Node.js 18+ provides fetch globally, so no import is needed.

async function createAgent() {
  const response = await fetch('https://app.customgpt.ai/api/v1/projects', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.CUSTOMGPT_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      project_name: 'Customer Support Assistant',
      prompt: 'You are a helpful customer support assistant. Always provide accurate information based on the knowledge base and include source citations.',
      base_prompt: 'Answer questions using the provided knowledge base. Be helpful, accurate, and professional.'
    })
  });

  const data = await response.json();
  console.log('Agent created:', data);
  return data.data.id; // Save this agent ID
}

createAgent();

Using Python:

import requests
import os

def create_agent():
    url = "https://app.customgpt.ai/api/v1/projects"
    headers = {
        "Authorization": f"Bearer {os.getenv('CUSTOMGPT_API_KEY')}",
        "Content-Type": "application/json"
    }
    data = {
        "project_name": "Customer Support Assistant",
        "prompt": "You are a helpful customer support assistant. Always provide accurate information based on the knowledge base and include source citations.",
        "base_prompt": "Answer questions using the provided knowledge base. Be helpful, accurate, and professional."
    }
    
    response = requests.post(url, headers=headers, json=data)
    result = response.json()
    print(f"Agent created: {result}")
    return result['data']['id']  # Save this agent ID

agent_id = create_agent()

Save the agent ID returned from this call—you’ll use it in all subsequent requests.
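
The examples above read `data.id` without checking that the request succeeded. A small guard can make failures easier to spot. This is a minimal sketch that assumes the `{ data: { id } }` response shape used above; `extractAgentId` is a hypothetical helper name.

```javascript
// Hypothetical helper: pulls the agent ID out of a parsed create-project response.
// Assumes the { data: { id } } shape used in the examples above.
function extractAgentId(result) {
  const id = result && result.data ? result.data.id : undefined;
  if (id === undefined || id === null) {
    throw new Error(`No agent ID in response: ${JSON.stringify(result)}`);
  }
  return String(id);
}

// Usage (inside createAgent, after response.json()):
// const agentId = extractAgentId(data);
```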

Step 2: Adding Knowledge Sources

Now let’s add some knowledge to your agent. RAG APIs can ingest various data sources. We’ll start with a simple approach and then show more advanced options.

Upload a Single Document:

const fs = require('fs');
const path = require('path');

async function uploadDocument(agentId, filePath) {
  const formData = new FormData();
  const fileContent = fs.readFileSync(filePath);
  const fileName = path.basename(filePath); // e.g. 'support-guide.pdf'
  const blob = new Blob([fileContent], { type: 'application/pdf' });
  
  formData.append('file', blob, fileName);
  formData.append('page_url', fileName);
  
  const response = await fetch(`https://app.customgpt.ai/api/v1/projects/${agentId}/pages`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.CUSTOMGPT_API_KEY}`,
    },
    body: formData
  });
  
  const result = await response.json();
  console.log('Document uploaded:', result);
  return result;
}
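
The upload example hard-codes `application/pdf` as the Blob type. If you upload other formats, you may want to derive the type from the file extension. The mapping below is an illustrative assumption, not a list of what the API accepts, and `mimeTypeFor` is a hypothetical helper name.

```javascript
const path = require('path');

// Assumed mapping for illustration; extend it for the formats you actually upload.
const MIME_TYPES = {
  '.pdf': 'application/pdf',
  '.txt': 'text/plain',
  '.md': 'text/markdown',
  '.html': 'text/html',
  '.docx': 'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
};

// Returns a MIME type for the file, or a generic binary type when the extension is unknown.
function mimeTypeFor(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  return MIME_TYPES[ext] || 'application/octet-stream';
}

// Usage: new Blob([fileContent], { type: mimeTypeFor(filePath) })
```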

Connect a Website for Automatic Crawling:

async function addWebsite(agentId, websiteUrl) {
  const response = await fetch(`https://app.customgpt.ai/api/v1/projects/${agentId}/pages`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.CUSTOMGPT_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      page_url: websiteUrl,
      crawl_subpages: true,
      max_depth: 3,
      include_patterns: ['*/docs/*', '*/help/*', '*/support/*'],
      exclude_patterns: ['*/admin/*', '*/login/*']
    })
  });
  
  const result = await response.json();
  console.log('Website added:', result);
  return result;
}

// Example: Add your documentation site
addWebsite(agentId, 'https://docs.yourcompany.com');
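
To preview which URLs a set of include and exclude patterns would select, you can evaluate them locally before starting a crawl. The sketch below assumes `*` matches any run of characters and that exclusions take precedence; how the service evaluates these patterns on its side is not confirmed here, so treat this as an approximation. Both function names are hypothetical.

```javascript
// Converts a glob such as '*/docs/*' into a RegExp where '*' matches any run of characters.
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp(`^${escaped.replace(/\*/g, '.*')}$`);
}

// Assumption: a URL is crawled when no exclude pattern matches
// and at least one include pattern matches (or there are no include patterns).
function shouldCrawl(url, includePatterns, excludePatterns) {
  const matches = (patterns) => patterns.some((p) => globToRegExp(p).test(url));
  if (matches(excludePatterns)) return false;
  return includePatterns.length === 0 || matches(includePatterns);
}

// Usage:
// shouldCrawl('https://docs.yourcompany.com/docs/intro', ['*/docs/*'], ['*/admin/*']);
```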

Monitor Indexing Progress:

async function checkIndexingStatus(agentId) {
  const response = await fetch(`https://app.customgpt.ai/api/v1/projects/${agentId}/pages`, {
    headers: {
      'Authorization': `Bearer ${process.env.CUSTOMGPT_API_KEY}`,
    }
  });
  
  const result = await response.json();
  const pages = result.data.pages.data;
  
  const statusSummary = pages.reduce((acc, page) => {
    acc[page.index_status] = (acc[page.index_status] || 0) + 1;
    return acc;
  }, {});
  
  console.log('Indexing status:', statusSummary);
  return statusSummary;
}

// Check every 30 seconds until indexing is complete
const checkInterval = setInterval(async () => {
  const status = await checkIndexingStatus(agentId);
  if (status.indexed && !status.indexing && !status.queued) {
    console.log('✅ All documents indexed successfully!');
    clearInterval(checkInterval);
  }
}, 30000);
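
The polling loop above does not distinguish failed pages: it can report success while some pages have failed, and it never stops if every page fails. A small helper can make the stop condition explicit. This sketch assumes the status names `indexed`, `indexing`, `queued`, and `failed`, taken from the checks used in this tutorial; verify them against real responses. `summarizeIndexing` is a hypothetical name.

```javascript
// Takes the statusSummary object produced by checkIndexingStatus().
// Status names are assumptions based on the values checked in this tutorial.
function summarizeIndexing(statusSummary) {
  const pending = (statusSummary.indexing || 0) + (statusSummary.queued || 0);
  const failed = statusSummary.failed || 0;
  const indexed = statusSummary.indexed || 0;
  return {
    // Done when nothing is pending and at least one page reached a final state
    done: pending === 0 && indexed + failed > 0,
    indexed,
    failed,
    pending,
  };
}

// Usage inside the polling callback:
// const summary = summarizeIndexing(status);
// if (summary.done) { clearInterval(checkInterval); }
```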

Step 3: Building Your Chat Interface

With your knowledge base indexed, let’s create a functional chat interface. We’ll start with a simple version and then add advanced features.

Basic Chat Implementation:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>RAG Assistant Demo</title>
    <style>
        .chat-container {
            max-width: 600px;
            margin: 50px auto;
            border: 1px solid #ddd;
            border-radius: 8px;
            overflow: hidden;
        }
        
        .messages {
            height: 400px;
            overflow-y: auto;
            padding: 20px;
            background: #f9f9f9;
        }
        
        .message {
            margin: 10px 0;
            padding: 10px;
            border-radius: 6px;
        }
        
        .user-message {
            background: #007bff;
            color: white;
            margin-left: 20%;
        }
        
        .bot-message {
            background: white;
            margin-right: 20%;
            border: 1px solid #eee;
        }
        
        .input-area {
            display: flex;
            padding: 20px;
            background: white;
        }
        
        .input-area input {
            flex: 1;
            padding: 10px;
            border: 1px solid #ddd;
            border-radius: 4px;
        }
        
        .input-area button {
            margin-left: 10px;
            padding: 10px 20px;
            background: #007bff;
            color: white;
            border: none;
            border-radius: 4px;
            cursor: pointer;
        }
        
        .citations {
            font-size: 0.8em;
            color: #666;
            margin-top: 8px;
            padding-top: 8px;
            border-top: 1px solid #eee;
        }
        
        .citation-link {
            color: #007bff;
            text-decoration: none;
        }
    </style>
</head>
<body>
    <div class="chat-container">
        <div id="messages" class="messages"></div>
        <div class="input-area">
            <input type="text" id="messageInput" placeholder="Ask me anything about our support documentation...">
            <button onclick="sendMessage()">Send</button>
        </div>
    </div>

    <script>
        const AGENT_ID = 'YOUR_AGENT_ID_HERE';
        const API_KEY = 'YOUR_API_KEY_HERE'; // In production, use a proxy server
        
        let conversationId = null;

        async function sendMessage() {
            const input = document.getElementById('messageInput');
            const message = input.value.trim();
            
            if (!message) return;
            
            input.value = '';
            addMessageToChat(message, 'user');
            
            try {
                const response = await fetch(`https://app.customgpt.ai/api/v1/projects/${AGENT_ID}/conversations`, {
                    method: 'POST',
                    headers: {
                        'Authorization': `Bearer ${API_KEY}`,
                        'Content-Type': 'application/json'
                    },
                    body: JSON.stringify({
                        prompt: message,
                        conversation_id: conversationId,
                        stream: false,
                        citations: true
                    })
                });
                
                const result = await response.json();
                
                if (!conversationId) {
                    conversationId = result.data.session_id;
                }
                
                addMessageToChat(result.data.response, 'bot', result.data.citations);
                
            } catch (error) {
                console.error('Error:', error);
                addMessageToChat('Sorry, I encountered an error. Please try again.', 'bot');
            }
        }

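        function addMessageToChat(message, sender, citations = []) {
            const messagesContainer = document.getElementById('messages');
            const messageDiv = document.createElement('div');
            messageDiv.className = `message ${sender}-message`;

            // Use textContent rather than innerHTML so user input and API output
            // are never interpreted as HTML (avoids script injection).
            const textDiv = document.createElement('div');
            textDiv.textContent = message;
            messageDiv.appendChild(textDiv);

            if (citations && citations.length > 0) {
                const citationsDiv = document.createElement('div');
                citationsDiv.className = 'citations';
                citationsDiv.append('Sources: ');
                citations.forEach((citation, index) => {
                    const link = document.createElement('a');
                    link.href = citation.url;
                    link.target = '_blank';
                    link.rel = 'noopener';
                    link.className = 'citation-link';
                    link.textContent = `[${index + 1}] ${citation.title}`;
                    citationsDiv.append(link, ' ');
                });
                messageDiv.appendChild(citationsDiv);
            }

            messagesContainer.appendChild(messageDiv);
            messagesContainer.scrollTop = messagesContainer.scrollHeight;
        }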

        // Allow Enter key to send messages
        document.getElementById('messageInput').addEventListener('keypress', function(e) {
            if (e.key === 'Enter') {
                sendMessage();
            }
        });
    </script>
</body>
</html>

Step 4: Adding Streaming Responses

Real-time streaming makes your application feel much more responsive. Here’s how to implement it:

async function sendStreamingMessage(message) {
    addMessageToChat(message, 'user');
    
    const botMessageDiv = createBotMessageDiv();
    let accumulatedResponse = '';
    
    try {
        const response = await fetch(`https://app.customgpt.ai/api/v1/projects/${AGENT_ID}/conversations`, {
            method: 'POST',
            headers: {
                'Authorization': `Bearer ${API_KEY}`,
                'Content-Type': 'application/json'
            },
            body: JSON.stringify({
                prompt: message,
                conversation_id: conversationId,
                stream: true,
                citations: true
            })
        });

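        const reader = response.body.getReader();
        const decoder = new TextDecoder();
        let buffer = ''; // holds a partial line carried over between chunks

        while (true) {
            const { done, value } = await reader.read();
            if (done) break;

            // stream: true keeps multi-byte characters intact across chunk boundaries
            buffer += decoder.decode(value, { stream: true });
            const lines = buffer.split('\n');
            buffer = lines.pop(); // the last element may be an incomplete line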

            for (const line of lines) {
                if (line.startsWith('data: ')) {
                    try {
                        const data = JSON.parse(line.slice(6));
                        
                        if (data.choices && data.choices[0].delta.content) {
                            accumulatedResponse += data.choices[0].delta.content;
                            updateBotMessage(botMessageDiv, accumulatedResponse);
                        }
                        
                        if (data.citations) {
                            addCitationsToMessage(botMessageDiv, data.citations);
                        }
                        
                    } catch (e) {
                        // Skip invalid JSON lines
                    }
                }
            }
        }
        
    } catch (error) {
        console.error('Streaming error:', error);
        updateBotMessage(botMessageDiv, 'Sorry, I encountered an error. Please try again.');
    }
}

function createBotMessageDiv() {
    const messagesContainer = document.getElementById('messages');
    const messageDiv = document.createElement('div');
    messageDiv.className = 'message bot-message';
    messageDiv.innerHTML = '<div class="response-content">Thinking...</div>';
    messagesContainer.appendChild(messageDiv);
    messagesContainer.scrollTop = messagesContainer.scrollHeight;
    return messageDiv;
}

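function updateBotMessage(messageDiv, content) {
    const contentDiv = messageDiv.querySelector('.response-content');
    contentDiv.textContent = content;
    messageDiv.scrollIntoView({ behavior: 'smooth' });
}

// Called by sendStreamingMessage when a chunk carries citations.
// Assumes each citation has `url` and `title`, as in the basic chat example.
function addCitationsToMessage(messageDiv, citations) {
    let citationsDiv = messageDiv.querySelector('.citations');
    if (!citationsDiv) {
        citationsDiv = document.createElement('div');
        citationsDiv.className = 'citations';
        messageDiv.appendChild(citationsDiv);
    }
    citationsDiv.textContent = 'Sources: ';
    citations.forEach((citation, index) => {
        const link = document.createElement('a');
        link.href = citation.url;
        link.target = '_blank';
        link.className = 'citation-link';
        link.textContent = `[${index + 1}] ${citation.title}`;
        citationsDiv.append(link, ' ');
    });
}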
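
Network chunks can end in the middle of a line, so a parser that treats each chunk in isolation may drop or corrupt events. One way to handle this is to keep the unfinished tail and prepend it to the next chunk. The sketch below isolates that logic in a pure function; it assumes `data:`-prefixed JSON lines as in the example above, and the `[DONE]` sentinel is an assumption borrowed from OpenAI-style streams. `extractSseEvents` is a hypothetical name.

```javascript
// Splits buffered stream text into complete `data:` payloads and returns the unfinished tail.
function extractSseEvents(buffer) {
  const lines = buffer.split('\n');
  const rest = lines.pop(); // last element is an incomplete line (or '')
  const events = [];
  for (const rawLine of lines) {
    const line = rawLine.trim();
    if (!line.startsWith('data:')) continue;
    const payload = line.slice(5).trim();
    // '[DONE]' is an assumed end-of-stream marker, skipped if present
    if (payload === '' || payload === '[DONE]') continue;
    try {
      events.push(JSON.parse(payload));
    } catch (e) {
      // Ignore lines that are not valid JSON
    }
  }
  return { events, rest };
}

// Usage inside the read loop:
// const { events, rest } = extractSseEvents(buffer + decodedChunk);
// buffer = rest;
```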

Step 5: Production-Ready Implementation

The previous examples are great for learning, but production applications need better architecture. Here’s a more robust approach:

Backend API Proxy (Node.js/Express):

// server.js
const express = require('express');
const fetch = require('node-fetch'); // npm install node-fetch@2 (v3 is ESM-only and cannot be loaded with require)
require('dotenv').config();

const app = express();
app.use(express.json());

// Proxy endpoint for chat
app.post('/api/chat', async (req, res) => {
    try {
        const { message, conversationId } = req.body;
        
        const response = await fetch(`https://app.customgpt.ai/api/v1/projects/${process.env.AGENT_ID}/conversations`, {
            method: 'POST',
            headers: {
                'Authorization': `Bearer ${process.env.CUSTOMGPT_API_KEY}`,
                'Content-Type': 'application/json'
            },
            body: JSON.stringify({
                prompt: message,
                conversation_id: conversationId,
                stream: false,
                citations: true
            })
        });

        const data = await response.json();
        res.json(data);
        
    } catch (error) {
        console.error('API Error:', error);
        res.status(500).json({ error: 'Internal server error' });
    }
});

// Streaming endpoint
app.post('/api/chat/stream', async (req, res) => {
    const { message, conversationId } = req.body;
    
    res.setHeader('Content-Type', 'text/event-stream'); // the upstream body is forwarded as "data: ..." lines
    res.setHeader('Cache-Control', 'no-cache');
    res.setHeader('Connection', 'keep-alive');
    
    try {
        const response = await fetch(`https://app.customgpt.ai/api/v1/projects/${process.env.AGENT_ID}/conversations`, {
            method: 'POST',
            headers: {
                'Authorization': `Bearer ${process.env.CUSTOMGPT_API_KEY}`,
                'Content-Type': 'application/json'
            },
            body: JSON.stringify({
                prompt: message,
                conversation_id: conversationId,
                stream: true,
                citations: true
            })
        });

        response.body.pipe(res);
        
    } catch (error) {
        console.error('Streaming error:', error);
        res.status(500).json({ error: 'Streaming failed' });
    }
});

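// Use PORT from .env when set, otherwise default to 3000
const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
    console.log(`Server running on http://localhost:${PORT}`);
});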

Environment Configuration (.env):

CUSTOMGPT_API_KEY=your_api_key_here
AGENT_ID=your_agent_id_here
PORT=3000
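
With the proxy in place, the browser no longer needs the API key: it can post to your own `/api/chat` route instead. The sketch below builds that request, assuming the `{ message, conversationId }` body that the proxy route above reads; `buildProxyChatRequest` is a hypothetical helper name.

```javascript
// Hypothetical helper: builds fetch() arguments for the /api/chat proxy route in server.js.
// No Authorization header is needed because the server adds the API key.
function buildProxyChatRequest(message, conversationId = null) {
  const text = String(message).trim();
  if (!text) throw new Error('Message must not be empty');
  return {
    url: '/api/chat',
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ message: text, conversationId }),
    },
  };
}

// Browser usage:
// const { url, options } = buildProxyChatRequest('How do I reset my password?');
// const result = await (await fetch(url, options)).json();
```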

Step 6: Using the Developer Starter Kit

For faster development, consider using the CustomGPT Developer Starter Kit. It provides a complete reference implementation with advanced features:

# Clone the starter kit
git clone https://github.com/Poll-The-People/customgpt-starter-kit.git
cd customgpt-starter-kit

# Install dependencies
npm install

# Configure environment
cp .env.example .env.local
# Add your CUSTOMGPT_API_KEY to .env.local

# Start development server
npm run dev

The starter kit includes:

  • Complete chat interface with conversation management
  • Voice chat capabilities (with OpenAI integration)
  • Progressive Web App (PWA) support
  • Widget embedding options for websites
  • Production deployment configurations
  • Comprehensive documentation and examples

Advanced Features and Customization

Once you have the basics working, here are some advanced features to explore:

Custom Prompting and Persona:

const customPrompt = `
You are an expert technical support specialist for our software platform.

Guidelines:
- Always be professional and helpful
- Provide step-by-step instructions when appropriate
- If you cannot find the answer in the knowledge base, say so clearly
- Always include relevant source citations
- For technical issues, ask for system information if needed

Context: The user is asking about: ${userMessage}
`;

// Use custom prompts in your API calls
const response = await fetch(`https://app.customgpt.ai/api/v1/projects/${AGENT_ID}/conversations`, {
    method: 'POST',
    headers: {
        'Authorization': `Bearer ${API_KEY}`,
        'Content-Type': 'application/json'
    },
    body: JSON.stringify({
        prompt: customPrompt,
        conversation_id: conversationId
    })
});

Analytics and Monitoring:

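// Track user interactions.
// `response` is assumed to be the parsed `data` object returned by the chat
// endpoint (with `response` text and `citations`, as used in Step 3).
function trackChatAnalytics(message, response, responseTime) {
    const analytics = {
        timestamp: new Date().toISOString(),
        userMessage: message,
        responseLength: response.response?.length || 0,
        responseTime: responseTime,
        citationsCount: response.citations?.length || 0,
        conversationId: conversationId
    };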
    
    // Send to your analytics service
    fetch('/api/analytics', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(analytics)
    });
}

Multi-language Support: CustomGPT supports 90+ languages natively. You can create language-specific agents or use a single multilingual agent:

// Language detection and routing
function detectLanguage(message) {
    // Simple detection - in production, use a proper language detection service
    const patterns = {
        'es': /¿|ñ|á|é|í|ó|ú/,
        'fr': /ç|é|è|à|ù/,
        'de': /ü|ö|ä|ß/
    };
    
    for (const [lang, pattern] of Object.entries(patterns)) {
        if (pattern.test(message)) return lang;
    }
    return 'en';
}

// Use language-specific prompts
const languagePrompts = {
    'en': 'You are a helpful customer support assistant.',
    'es': 'Eres un asistente de soporte al cliente útil.',
    'fr': 'Vous êtes un assistant de support client utile.',
    'de': 'Sie sind ein hilfreicher Kundensupport-Assistent.'
};
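
A detected language code still has to be mapped to a prompt, including codes you have no prompt for. The sketch below shows one way to do that lookup with a fallback. The prompts are copied from the example above; falling back to English and accepting regional codes such as `es-MX` are assumptions, and `promptForLanguage` is a hypothetical name.

```javascript
// Prompts copied from the example above.
const PROMPTS = {
  en: 'You are a helpful customer support assistant.',
  es: 'Eres un asistente de soporte al cliente útil.',
  fr: 'Vous êtes un assistant de support client utile.',
  de: 'Sie sind ein hilfreicher Kundensupport-Assistent.',
};

// Accepts codes such as 'es', 'ES', or 'es-MX'; falls back to English (an assumption).
function promptForLanguage(code, fallback = 'en') {
  const base = String(code || '').toLowerCase().split('-')[0];
  const known = Object.prototype.hasOwnProperty.call(PROMPTS, base);
  return known ? PROMPTS[base] : PROMPTS[fallback];
}

// Usage: promptForLanguage(detectLanguage(message))
```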

Troubleshooting Common Issues

API Key Authentication Errors:

// Test your API key
async function testApiKey() {
    try {
        const response = await fetch('https://app.customgpt.ai/api/v1/projects', {
            headers: {
                'Authorization': `Bearer ${API_KEY}`,
            }
        });
        
        if (response.status === 401) {
            console.error('❌ Invalid API key');
            return false;
        }
        
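        if (response.ok) {
            console.log('✅ API key is valid');
            return true;
        }

        // Any other status (403, 5xx, ...) is reported rather than silently returning undefined
        console.error(`❌ Unexpected response status: ${response.status}`);
        return false;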
        
    } catch (error) {
        console.error('❌ API connection error:', error);
        return false;
    }
}

CORS Issues in Development: If you’re getting CORS errors when calling the API directly from the browser, you have two options:

  1. Use a proxy server (recommended for production)
  2. Use browser extensions for development only

Rate Limiting Handling:

async function makeApiCallWithRetry(url, options, maxRetries = 3) {
    for (let i = 0; i < maxRetries; i++) {
        try {
            const response = await fetch(url, options);
            
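            if (response.status === 429) {
                // Retry-After arrives as a string; fall back to exponential backoff if it is absent or non-numeric
                const retryAfter = Number(response.headers.get('Retry-After')) || Math.pow(2, i);
                console.log(`Rate limited, retrying after ${retryAfter} seconds`);
                await new Promise(resolve => setTimeout(resolve, retryAfter * 1000));
                continue;
            }
            
            return response;
            
        } catch (error) {
            if (i === maxRetries - 1) throw error;
            await new Promise(resolve => setTimeout(resolve, Math.pow(2, i) * 1000));
        }
    }
    // Every attempt was rate limited: fail loudly instead of returning undefined
    throw new Error(`Request to ${url} was still rate limited after ${maxRetries} attempts`);
}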
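
Per the HTTP specification, `Retry-After` may carry either a number of seconds or an HTTP date. Whether this API sends one, the other, or neither is not confirmed here, so the sketch below handles both and falls back to exponential backoff. `retryDelayMs` is a hypothetical helper name.

```javascript
// Returns how long to wait (in ms) before retry number `attempt` (0-based).
function retryDelayMs(retryAfterHeader, attempt, now = Date.now()) {
  const fallback = Math.pow(2, attempt) * 1000; // 1s, 2s, 4s, ...
  if (!retryAfterHeader) return fallback;

  // Form 1: delay in seconds, e.g. "5"
  const seconds = Number(retryAfterHeader);
  if (Number.isFinite(seconds) && seconds >= 0) return seconds * 1000;

  // Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:10 GMT"
  const date = Date.parse(retryAfterHeader);
  if (!Number.isNaN(date)) return Math.max(0, date - now);

  return fallback;
}

// Usage:
// const delay = retryDelayMs(response.headers.get('Retry-After'), i);
// await new Promise(resolve => setTimeout(resolve, delay));
```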

Document Indexing Issues:

async function validateDocumentUpload(agentId, pageId) {
    const response = await fetch(`https://app.customgpt.ai/api/v1/projects/${agentId}/pages/${pageId}`, {
        headers: {
            'Authorization': `Bearer ${API_KEY}`,
        }
    });
    
    const data = await response.json();
    const page = data.data;
    
    console.log('Document status:', {
        crawlStatus: page.crawl_status,
        indexStatus: page.index_status,
        isFile: page.is_file,
        filename: page.filename,
        filesize: page.filesize
    });
    
    if (page.crawl_status === 'failed' || page.index_status === 'failed') {
        console.error('❌ Document processing failed');
        // Re-upload or contact support
    }
}

Next Steps and Learning Resources

Congratulations! You’ve built your first RAG-powered application. The resources listed at the end of this article are a good place to continue.

Frequently Asked Questions

How long does it take for documents to be indexed after upload?

Indexing time varies based on document size and complexity. Simple text files typically index within 1-5 minutes, while large PDFs or websites may take 10-30 minutes. You can monitor progress using the pages API endpoint. CustomGPT processes documents in parallel, so adding multiple documents simultaneously doesn’t significantly increase total processing time.

Can I test my RAG API without writing code first?

Absolutely! Use the Postman collection to test API endpoints directly. This lets you experiment with prompts, test document uploads, and understand response formats before writing any code. You can also use the CustomGPT dashboard interface to test your agent’s responses interactively.

What happens if I upload duplicate documents?

CustomGPT automatically detects and handles duplicate content. If you upload the same document multiple times, the system will recognize it and avoid creating duplicate entries in your knowledge base. However, if you update a document with new content, you should re-upload it to ensure your agent has the latest information.

How do I handle documents in different formats like PowerPoint or Excel?

CustomGPT supports more than 1,400 document formats, including PowerPoint, Excel, Word, and PDF. The system automatically extracts text content from these files during indexing. If extraction from structured data such as spreadsheets gives poor results, consider converting the file to CSV first.

Can I customize the citation format in responses?

Yes, you can customize citations through API parameters and custom prompting. You can specify whether to include citations, how to format them, and what information to display. The API returns structured citation data that you can format in your frontend according to your design requirements.

What’s the maximum size for uploaded documents?

Individual file uploads are typically limited to 25MB, but this can vary based on your plan. For larger documents, consider splitting them into smaller sections or using URL-based ingestion for websites. Check your specific plan limits for exact specifications.
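
Checking a file’s size before uploading avoids a wasted request. The sketch below uses the 25MB figure mentioned above as a default; the real limit depends on your plan, and treating 1MB as 1024 × 1024 bytes is an assumption. `checkUploadSize` is a hypothetical helper name.

```javascript
const DEFAULT_LIMIT_MB = 25; // typical limit cited above; check your plan for the real value

// Compares a size in bytes against a limit in MB (1MB taken as 1024 * 1024 bytes).
function checkUploadSize(sizeBytes, limitMb = DEFAULT_LIMIT_MB) {
  const bytesPerMb = 1024 * 1024;
  return {
    ok: sizeBytes <= limitMb * bytesPerMb,
    sizeMb: Number((sizeBytes / bytesPerMb).toFixed(2)),
    limitMb,
  };
}

// Node usage:
// const { size } = require('fs').statSync('support-guide.pdf');
// if (!checkUploadSize(size).ok) console.error('File is too large to upload');
```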

How do I handle multilingual content in my knowledge base?

CustomGPT supports 90+ languages natively. You can mix documents in different languages within the same agent, and the system will handle queries in any supported language while searching across your multilingual knowledge base. The AI can respond in the user’s query language while referencing content in other languages.

What’s the difference between streaming and non-streaming responses?

Non-streaming responses return the complete answer at once, which is simpler to implement but can feel slow for longer responses. Streaming responses return text incrementally as it’s generated, providing a much more responsive user experience. Streaming is recommended for production applications but requires more complex frontend handling.

How do I handle errors and edge cases in production?

Implement comprehensive error handling for common scenarios: network timeouts, rate limiting (429 errors), authentication failures (401 errors), and cases where no relevant information is found. Always provide fallback responses and consider graceful degradation—perhaps falling back to general assistance or human support when the RAG system can’t help.
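
One way to organize these cases is a single function that maps an HTTP status to a user-facing fallback and a retry decision. This is a minimal sketch: the message wording and the choice to retry on 5xx responses are assumptions, and `fallbackForStatus` is a hypothetical name.

```javascript
// Maps an HTTP status code to a retry decision and a message that is safe to show users.
function fallbackForStatus(status) {
  if (status === 401) {
    return { retry: false, message: 'The assistant is misconfigured. Please contact support.' };
  }
  if (status === 429) {
    return { retry: true, message: 'The assistant is busy. Retrying shortly.' };
  }
  if (status >= 500) {
    return { retry: true, message: 'The assistant is temporarily unavailable.' };
  }
  return { retry: false, message: 'Sorry, I could not answer that. A human agent can help.' };
}

// Usage:
// if (!response.ok) {
//   const { retry, message } = fallbackForStatus(response.status);
//   addMessageToChat(message, 'bot');
// }
```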

Can I integrate this with my existing chat platform like Slack or Discord?

Yes! CustomGPT provides various integrations including Slack, Discord, and other platforms. You can also use the API to build custom integrations with any chat platform. The developer integrations repository includes examples for popular platforms.

For more RAG API related information:

  1. CustomGPT.ai’s open-source UI starter kit (custom chat screens, an embeddable chat window, and a floating website chatbot), with 9 social AI integration bots and related setup tutorials.
  2. API sample usage code snippets.
  3. The hosted Postman collection for the RAG API, for testing the endpoints in one click.
  4. The Developer API documentation.
  5. API explainer videos on YouTube and a developer-focused playlist.
  6. The bi-weekly developer office hours and recordings of past sessions.

P.S. The API endpoints are OpenAI compatible: replace the API key and endpoint, and any OpenAI-compatible project works with your RAG data.

To experiment with the hosted MCPs, see their documentation.
