Keep your chatbot’s knowledge up-to-date by reindexing specific pages with the CustomGPT RAG API. In this guide, we’ll use the Python SDK to reindex a page in a CustomGPT project. Reindexing is useful when a source document or webpage has changed and you want your bot to ingest the latest content from that page without redoing the entire project.
Before starting, ensure you’ve read Getting Started with CustomGPT.ai for New Developers to have your API key and environment ready.
Notebook Link: CustomGPT Cookbook – SDK_Reindex_page_belonging_to_a_project.ipynb
Introduction
As you update your content (be it a PDF manual or a webpage), you need your CustomGPT chatbot to reflect those changes. The reindex operation tells CustomGPT to re-fetch and re-process a given page source. This is a targeted way to update one page’s content without disturbing the rest of your project. For example, if you have a “Pricing” page that changes frequently, you could schedule a reindex for that page so the bot stays current.
In this tutorial, we’ll:
- Identify a page in a project (using its page ID).
- Call the reindex API via the SDK for that page.
- Confirm that the reindex process was started successfully.
After reindexing, the page’s content in the bot will be updated to the latest version available from the source.
Prerequisites
- CustomGPT API Key – Needed for authentication.
- CustomGPT Python SDK – Ensure customgpt-client is installed and imported.
- Project and Page ID – You should have a project with at least one page. You will need the specific page_id of the page you want to reindex. If you don’t know it, you can retrieve it using the list pages method (see the “List All Pages” tutorial).
- Accessible page source – If the page is a web page, it must still be reachable (not deleted or moved), because reindexing fetches it again. If the page comes from a file you uploaded, reindexing typically re-processes the existing file content; if the file itself has changed, you may need to upload the new version first.
Step-by-Step Guide
Let’s reindex a page step by step:
- Set up the SDK and authenticate.
Start by installing and importing the SDK and setting the API key:
!pip install customgpt-client
from customgpt_client import CustomGPT
CustomGPT.api_key = "YOUR_API_TOKEN"
Get API keys
To get your API key, there are two ways:
Method 1 – Via Agent
- Agent > All Agents.
- Select your agent, go to Deploy, open the API key section, and create an API key.
Method 2 – Via Profile section.
- Go to profile (top right corner of your screen)
- Click on My Profile
- On the screen that opens, click “Create API key”, give the key a name, and copy it.
Please save this secret key somewhere safe and accessible. For security reasons, you won’t be able to view it again through your CustomGPT.ai account. If you lose this secret key, you’ll need to generate a new one.
- With this, we are ready to make API calls.
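Because the key cannot be viewed again and should stay secret, you may prefer not to paste it into the notebook. Here is a minimal sketch that reads it from an environment variable instead. The variable name CUSTOMGPT_API_KEY and the helper load_api_key are assumptions made for this example, not part of the SDK.

```python
import os


def load_api_key(env=None, var_name="CUSTOMGPT_API_KEY"):
    """Return the API key stored in an environment variable.

    CUSTOMGPT_API_KEY is a hypothetical name chosen for this sketch;
    use whatever variable name fits your setup.
    """
    # Allow a plain dict to be passed in, which makes the helper easy to test.
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key
```

You could then write CustomGPT.api_key = load_api_key() in place of the hard-coded string.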
- Create a project (if necessary) and get a page ID.
If you already have a project and know the page ID that needs reindexing, skip to step 3. Otherwise, let’s set up a quick scenario:
# Example setup: create a project with a sitemap to have a page to reindex
project_name = "Example Bot for Reindex"
sitemap_url = "https://adorosario.github.io/small-sitemap.xml"
new_project = CustomGPT.Project.create(project_name=project_name, sitemap_path=sitemap_url)
project_id = new_project.parsed.data.id
# List pages to get one page ID
pages = CustomGPT.Page.list(project_id=project_id)
first_page_id = pages.parsed.data.data[0].id
print("First page ID:", first_page_id)
- We created a project from a sitemap (which will add pages from that site). Then we listed pages and took the first page’s ID as an example target for reindexing. In a real case, determine the page_id of the content you want to refresh (for example, via Page.list or from the dashboard’s URL when viewing a page’s details).
- Call the reindex API for that page.
Now that we have a project_id and a page_id, let’s trigger the reindex:
page_id_to_reindex = first_page_id # or your known page ID
reindex_response = CustomGPT.Page.reindex(project_id=project_id, page_id=page_id_to_reindex)
print(reindex_response)
- The CustomGPT.Page.reindex method makes a request to reindex the specified page of the given project. We print the response to see the result.
- Examine the response.
If the call was successful, the response should indicate that the reindex process started. You might see something like:
{
"status": "success",
"data": {
"updated": true
}
}
- Or possibly a message that the reindex was initiated. The key part is that you get a “success” status (or HTTP 200 OK via the SDK) which means the request was accepted. The actual reindexing happens in the background. The response “updated”: true likely means the page is scheduled for reindex.
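If you want to check the result in code rather than by eye, a small helper can test a payload shaped like the JSON above. This is a sketch that assumes the body is available as a plain dictionary with "status" and "data.updated" keys; reindex_accepted is a hypothetical helper, and the SDK's parsed response object may expose these fields differently.

```python
def reindex_accepted(payload):
    """Return True when a reindex response body reports success.

    Assumes the shape shown above: {"status": "success", "data": {"updated": true}}.
    """
    if not isinstance(payload, dict):
        return False
    data = payload.get("data")
    if not isinstance(data, dict):
        return False
    # Both conditions must hold: a success status and updated == True.
    return payload.get("status") == "success" and data.get("updated") is True
```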
If there was an error (e.g., an invalid page_id or a permission issue), the status might be “error” and an error message will be provided (for instance, “Page not found in project” or similar).
- (Optional) Verify the reindexing.
Since reindexing takes some time to fetch and process content, you won’t get an immediate result with new content in the same call. However, you can verify it by:
- Checking the page’s status via Page.list after a short delay to see if it shows as updating or refreshed.
- Using the conversation API to ask a question specifically from that page’s content to ensure it reflects changes (this requires knowledge of what changed).
- Or looking at the CustomGPT dashboard: the page might show a recent indexed timestamp after it completes.
- For example, one could do:
import time
time.sleep(10) # wait for reindex to complete (time depends on content size)
pages_after = CustomGPT.Page.list(project_id=project_id)
for p in pages_after.parsed.data.data:
    if p.id == page_id_to_reindex:
        print("Page status after reindex:", p.status, "updated at", p.updated_at)
- This is pseudo-code; adjust based on actual fields. Essentially, we wait a bit and then see if the page’s status or update timestamp changed. In many cases for small pages, reindexing is quick.
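A fixed sleep can be too short for large pages and wasteful for small ones. One alternative is to poll until the page's timestamp changes or a timeout expires. The sketch below is independent of the SDK: fetch_updated_at is a hypothetical callable you supply (for example, one that calls Page.list and returns the page's updated_at value, assuming that field exists).

```python
import time


def wait_for_update(fetch_updated_at, previous, timeout=60.0, interval=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll until fetch_updated_at() differs from `previous`.

    Returns the new value, or None if the timeout expires first.
    `sleep` and `clock` are injectable so the helper can be tested quickly.
    """
    deadline = clock() + timeout
    while True:
        current = fetch_updated_at()
        if current != previous:
            return current  # the page reports a newer timestamp
        if clock() >= deadline:
            return None  # gave up waiting
        sleep(interval)
```

Record the page's timestamp before triggering the reindex and pass it as previous; a return value of None means the change was not observed within the timeout, not necessarily that reindexing failed.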
- Use the updated content.
Once reindexing is done, your bot will use the new content of that page for any queries. You have successfully refreshed that part of your chatbot’s knowledge. Repeat this process for any page that requires updates.
That’s the reindex workflow in a nutshell: identify page -> call reindex -> confirm it started -> new content gets integrated.
FAQs
What exactly does “reindexing” a page do in CustomGPT?
Reindexing tells CustomGPT to re-process the content of that page. If the page is a URL, it will fetch the latest version of that webpage and update the internal index (the vector embeddings used for retrieval). If the page is from an uploaded file, CustomGPT will re-process the file you originally uploaded. Note that for file sources, if the file’s content changed, you would need to upload the new file content first (the API has methods to update sources). Reindex is most straightforward for web content, or in cases where CustomGPT can re-fetch the source. Essentially, it ensures the chatbot’s knowledge for that page is up-to-date.
How long does reindexing take?
It depends on the size of the content and the system load. Small webpages or documents might reindex in a few seconds. Larger documents or very long web pages could take longer (tens of seconds or a minute). The API call returns immediately after scheduling the reindex. You don’t get notified when it’s done through the API, but you can poll the page status or just trust that within a short time it’s updated. In practice, you’ll often just initiate reindex and the next user query that needs that content will find it updated by then.
How do I know which page ID corresponds to the content I want to refresh?
You can list pages in your project using the CustomGPT.Page.list method (as shown in the “List All Pages” tutorial). Look for identifiers or URLs/names in that output. Pages from websites will show their URL, so you can match by that. Pages from files will show the file name. Once you identify the correct page, use its id in the reindex call. In the CustomGPT web portal, if you click on a source or page, the URL might contain the page ID as well, which can help you match it.
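Matching by URL can be automated once you have the page listing. The following sketch works on a list of dictionaries or attribute-style objects; the field name page_url is an assumption, so check the actual Page.list output and pass a different url_field if needed. find_page_id is a hypothetical helper, not an SDK method.

```python
def find_page_id(pages, target_url, url_field="page_url"):
    """Return the id of the first page whose URL matches target_url, else None.

    `url_field` is an assumed field name; adjust it to the real listing output.
    """
    wanted = target_url.rstrip("/")
    for page in pages:
        # Support both plain dicts and objects with attributes.
        if isinstance(page, dict):
            url, page_id = page.get(url_field), page.get("id")
        else:
            url, page_id = getattr(page, url_field, None), getattr(page, "id", None)
        # Ignore a trailing slash so both URL spellings match.
        if url and url.rstrip("/") == wanted:
            return page_id
    return None
```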
Will reindexing a page cost additional credits or tokens?
Typically, yes. Reindexing essentially re-ingests content, which likely counts towards your usage (tokenization and embedding operations). If the content is large, it will consume similar resources as when you first added it. Keep this in mind if you schedule frequent reindexes for very large documents. However, for most use cases (like occasional updates of key pages), the cost is minor compared to the value of having up-to-date answers.
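If usage is a concern, one option is to reindex only when the source has actually changed. The sketch below compares a SHA-256 fingerprint of the current page text with the one stored after the previous reindex. How you obtain the page text and where you store the fingerprint are left to you; both function names are hypothetical.

```python
import hashlib


def content_fingerprint(text):
    """Return a stable SHA-256 hex digest of the page text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def needs_reindex(current_text, last_fingerprint):
    """True when the text differs from what was indexed last time."""
    return content_fingerprint(current_text) != last_fingerprint
```

Store the fingerprint after each successful reindex and call needs_reindex before scheduling the next one.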
What if reindexing fails?
If reindexing fails (maybe the source URL was unreachable or returned an error), the API might return a failure status, or the page status might eventually mark as failed. In such cases, check the source:
- If a URL, ensure it’s correct and live. You might try again later if it was a temporary issue.
- If a file, ensure the file content is still present. (In some systems, if you deleted the file source, reindex might not have anything to fetch.)
You can also use the CustomGPT dashboard to see if any error message is associated with the source. Fix the issue (e.g., correct the URL or re-upload content) and try reindexing again.
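For temporary failures, "try again later" can be sketched as a retry loop with exponential backoff. The trigger argument is a hypothetical callable you provide that performs the reindex call and returns True when the request is accepted; the attempt count and delays are illustrative defaults.

```python
import time


def retry_reindex(trigger, attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call trigger() until it returns True or the attempts run out.

    Waits base_delay, then 2x, then 4x, ... between attempts.
    """
    for attempt in range(attempts):
        try:
            if trigger():
                return True
        except Exception:
            # Treat any exception as a failed attempt and retry.
            pass
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return False
```

If this returns False, the problem is probably not temporary, so check the source as described above.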
Related Posts
- SDK: List All Pages of a Project with CustomGPT API – Learn how to get the list of page IDs and details, which is helpful to identify which page to reindex (especially if you have many pages).
- SDK: Add a File to a Project Using CustomGPT API – If the content update involves adding a new document or replacing one, see how to add or update file sources in your project. After adding a new file, you don’t need reindex (it’s indexed on upload), but for replacing content you might use a combination of delete + add.
- SDK: Update Project Settings Using CustomGPT API – Not directly related to reindexing, but if you frequently update content, you might also tweak settings like auto-sync intervals or similar (if CustomGPT provides an auto sync feature for sitemaps, for example). This guide helps in adjusting project configurations.
- Retrieve Messages of a Conversation with CustomGPT API – After reindexing, you might test the bot’s response via the conversation API. This related tutorial shows how to retrieve conversation messages, which can be part of verifying that new content is being used in answers.
Priyansh is a Developer Relations Advocate who loves technology, writes about it, and creates deeply researched content about it.