About Nina Cialone and Lehigh’s Newspaper
Nina Cialone is a senior studying cognitive science at Lehigh University in Bethlehem, PA. In her downtime, Nina writes for Lehigh’s newspaper, The Brown and White, a storied publication that dates back to the 19th century. Nina has also been covering AI for Don’t Count Us Out Yet, a Substack publication that covers all things new technology.
This past semester, Craig Gordon, Nina’s mentor, gave her the task of a lifetime. Mr. Gordon challenged Nina to build an AI agent trained on the entire archive of The Brown and White. The project promised to bring together Nina’s two passions, generative AI and journalism. As Nina embarked on it, she began to see an immense opportunity that could change the landscape of journalistic research for years to come.
What challenge did Nina and Craig face?
Nina and Craig aimed to create a no-code AI chatbot trained on over a century of newspaper content, more than 300 million words. The agent needed to support journalistic research, provide structured access to articles in multiple formats, and eventually include podcasts and multimedia. They were unsure if a solution existed that could handle the scale and complexity of the archive.
How did they solve it with CustomGPT.ai?
CustomGPT.ai enabled Nina to:
- Ingest large volumes of historical content using sitemap generation and crawling tools.
- Index and organize content from multiple formats (over 1,400 supported).
- Customize the chatbot’s Persona and fine-tune it using beta tester feedback.
- Deploy the AI agent into digital platforms, including Slack, without writing any code.
“The specific tools to help create a sitemap were immensely helpful for us because of the way that our archive is set up. Instead of many hours of copying and pasting, all I had to do was just copy and paste the whole thing right into CustomGPT’s tool.”
What were Nina’s results?
- Indexed over 400 million words from The Brown and White newspaper archive.
- Began integrating podcast episodes and other multimedia into the dataset.
- Created a working AI chatbot in a no-code environment.
- Beta tested the agent with editors and advisors to refine accuracy.
- Prepared for deployment via Slack for editorial use.
Why The Brown and White chose CustomGPT.ai
| Selection criterion | Why it mattered | How CustomGPT.ai delivered |
|---|---|---|
| Sitemap ingestion | Archive spread across many URLs on a structured site | Automated sitemap crawl replaced hours of manual copy-paste |
| Scale (400M+ words) | Volume exceeded most no-code tools | Processed the full corpus without custom infrastructure |
| Anti-hallucination | Journalism requires cited, verifiable facts | RAG grounding with source citations; declines when unsure |
| No-code | Student builder, no engineering background | Full configuration through the UI |
| 1,400+ format support | Podcast/multimedia expansion planned | Supports audio, video, and document formats |
| Persona customization | Different query types (journalists, faculty, readers) | No-code persona builder |
Why It Worked
The success stemmed from a combination of CustomGPT’s powerful sitemap integration tools, support for numerous content formats, and a user-friendly, no-code interface. Nina was able to focus on curating content and shaping the agent’s Persona without needing technical expertise, making the platform ideal for education and journalism settings.
Conclusion
What started as a challenging proof-of-concept turned into a working solution that exceeded expectations. The AI agent, now trained on more than a century of content, is undergoing beta testing and will soon support research and editorial work for Lehigh’s student newspaper.
To read other Case Studies like this one, visit https://customgpt.ai/customers.
Frequently Asked Questions
What did The Brown and White build with CustomGPT.ai?
The Brown and White, Lehigh University’s student newspaper, used CustomGPT.ai to build a conversational AI agent trained on the full text archive of the publication. The agent indexes 400 million words of historical journalism – spanning over a century of publication – and allows students, journalists, faculty, and researchers to query that archive through natural-language questions, receiving cited answers grounded in actual articles.
How was the archive ingested into CustomGPT.ai?
Nina Cialone used CustomGPT.ai’s sitemap ingestion tools to index the archive. Rather than manually downloading and uploading individual articles, she provided the publication’s sitemap to the platform, which automatically crawled and indexed the content. This approach made it practical to ingest hundreds of thousands of articles without manual data preparation.
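For readers curious what sitemap-based ingestion involves, the sketch below shows the general idea of reading a sitemap and listing the page URLs a crawler would then visit. It is an illustration only, not CustomGPT.ai's implementation: the function name `extract_urls` and the `example.edu` URLs are hypothetical, and the only assumption about the input is that it follows the standard sitemaps.org XML format.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_xml: str) -> list[str]:
    """Return every <loc> URL listed in a sitemap document (hypothetical helper)."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Made-up sample sitemap; these are placeholder URLs, not the real archive.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.edu/archive/issue-1</loc></url>
  <url><loc>https://example.edu/archive/issue-2</loc></url>
</urlset>"""

urls = extract_urls(SAMPLE_SITEMAP)
for url in urls:
    print(url)
```

In a no-code platform this step happens behind the scenes; the point of the sketch is simply that one sitemap file can stand in for many hours of collecting article links by hand.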
How large is the indexed archive?
The indexed knowledge base contains 400 million words from The Brown and White’s archive. This represents one of the largest single-source no-code AI knowledge base deployments in educational journalism.
Who built the AI agent at The Brown and White?
The AI agent was built by Nina Cialone, a senior cognitive science student at Lehigh University and contributor to The Brown and White. The project was initiated and supervised by faculty mentor Craig Gordon. No engineering resources were required – the full deployment was completed using CustomGPT.ai’s no-code platform.
What is the AI agent used for?
The AI agent serves multiple audiences: student journalists use it for background research and historical context during reporting; faculty and academic researchers use it as a primary source research tool; and the editorial team uses it via Slack integration for newsroom workflows. Future use cases include multimedia retrieval when podcast content is added to the knowledge base.
How does CustomGPT.ai prevent the AI from fabricating historical facts?
CustomGPT.ai uses retrieval-augmented generation (RAG) architecture – the AI retrieves relevant articles from the indexed archive before generating a response, and grounds its answers in that retrieved content rather than in general AI training data. Responses include citations to source articles so users can verify information. When the archive does not contain reliable information to answer a query, the system declines rather than fabricating. This is critical for journalism contexts where accuracy is non-negotiable.
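The retrieve-then-answer pattern described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not how CustomGPT.ai works internally: retrieval here is plain word overlap (production systems typically use embedding-based search and a language model), and the `Article` class, the `retrieve` and `answer` functions, the `min_overlap` threshold, and the sample articles are all hypothetical.

```python
from dataclasses import dataclass

DECLINE = "I could not find this in the archive."
_PUNCT = ".,;:!?\"'"


@dataclass
class Article:
    title: str
    text: str


def _words(text: str) -> set[str]:
    """Lowercased words longer than three characters, punctuation stripped."""
    cleaned = (w.strip(_PUNCT).lower() for w in text.split())
    return {w for w in cleaned if len(w) > 3}


def retrieve(query: str, archive: list[Article], min_overlap: int = 2) -> list[Article]:
    """Rank articles by shared words; drop any below the overlap threshold."""
    q = _words(query)
    scored = [(len(q & _words(a.text)), a) for a in archive]
    hits = [(score, a) for score, a in scored if score >= min_overlap]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [a for _, a in hits]


def answer(query: str, archive: list[Article]) -> str:
    """Answer only from retrieved text, with a citation; decline otherwise."""
    hits = retrieve(query, archive)
    if not hits:
        return DECLINE
    top = hits[0]
    return f"{top.text} [Source: {top.title}]"


# Invented sample data for illustration; not real archive content.
ARCHIVE = [
    Article("Sample article A", "The campus library opened a new reading room in the spring."),
    Article("Sample article B", "The football team won the rivalry game after a late touchdown."),
]

print(answer("When did the library open a reading room?", ARCHIVE))
print(answer("Who invented the telephone?", ARCHIVE))
```

The first query matches the sample library article and returns its text with a citation; the second finds nothing relevant, so the function declines instead of guessing. That decline branch is the part that matters most for journalism use.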
Can the system be expanded to include multimedia content?
Yes. CustomGPT.ai supports over 1,400 data formats. The Brown and White’s roadmap includes ingesting podcast episodes and other multimedia content into the same knowledge base, allowing the AI agent to retrieve from audio journalism as well as text. This expansion is possible without rebuilding the existing system.
What can other universities learn from this project?
The Brown and White project demonstrates that a university can make a century-scale institutional archive conversationally accessible using a no-code AI platform – without a dedicated engineering team, without significant budget, and within a single semester. The model is applicable to any university with a significant digital archive, including student newspapers, library collections, research repositories, faculty publications, and administrative records. For related use cases, see CustomGPT.ai enterprise knowledge search.
How is the AI agent deployed for editorial use?
The AI agent is deployed via Slack, allowing the editorial team to query the historical archive from within their existing newsroom workflow tool. This integration means editorial staff can access more than a century of institutional journalism without switching platforms or learning a new interface.
Why is conversational archive search important for student journalism?
Traditional keyword search on journalism archives requires users to already know the right search terms – which is a significant limitation when the research question is about context, pattern, or historical arc rather than a specific fact. Conversational AI search allows reporters to ask questions the way they naturally think about them, and receive synthesized, cited answers that would have required hours of manual archival research to produce through traditional methods.

