How Lehigh University’s The Brown and White Made Over a Century of Student Journalism Instantly Searchable with CustomGPT.ai

"We wanted the opportunity to be able to add podcast episodes and other multimedia content. So that was something in CustomGPT that stood out to us."
Nina Cialone
Student at Lehigh University

1,400+

data formats supported

No-code

AI chatbot built without coding

400M+

words indexed

Summary

Lehigh University transformed student access to its historical newspaper archive by building an AI-driven chatbot with CustomGPT.ai. Led by student Nina Cialone, the project indexed over 400 million words from The Brown and White using CustomGPT.ai’s sitemap tools and no-code platform. This paved the way for integrating AI into the newspaper’s editorial workflow.

Industry

Education, Nonprofit

Use Case

Knowledge as a Service, Student Education

Teams

Teaching

About Nina Cialone and Lehigh’s Newspaper

Nina Cialone is a senior studying cognitive science at Lehigh University in Bethlehem, PA. In her downtime, Nina writes for Lehigh’s newspaper, The Brown and White, a publication so storied it dates back to the 19th century. Nina has also been covering AI for the publication Don’t Count Us Out Yet, published on Substack, which covers all things new technology.

This past semester, Craig Gordon, Nina’s mentor, gave her the task of a lifetime: build an artificial intelligence agent trained on the entire archive of The Brown and White. The project brought together Nina’s two passions, generative AI and journalism. As she got started, she began to see an opportunity that could change journalistic research for years to come.

The Community Pages by The Brown and White

What challenge did Nina and Craig face?

Nina and Craig aimed to create a no-code AI chatbot trained on over a century of newspaper content, more than 300 million words. The agent needed to support journalistic research, provide structured access to articles in multiple formats, and eventually include podcasts and multimedia. They were unsure if a solution existed that could handle the scale and complexity of the archive.

How did they solve it with CustomGPT.ai?

CustomGPT.ai enabled Nina to:

  • Ingest large volumes of historical content using sitemap generation and crawling tools.
  • Index and organize content from multiple formats (over 1,400 supported).
  • Customize the chatbot’s Persona and fine-tune it using beta tester feedback.
  • Deploy the AI agent into digital platforms, including Slack, without writing any code.

“The specific tools to help create a sitemap were immensely helpful for us because of the way that our archive is set up. Instead of many hours of copying and pasting, all I had to do was just copy and paste the whole thing right into CustomGPT’s tool.”

Nina Cialone
Student Writer for The Brown and White

What were Nina’s results?

  • Indexed over 400 million words from The Brown and White newspaper archive.
  • Began integrating podcast episodes and other multimedia into the dataset.
  • Created a working AI chatbot in a no-code environment.
  • Beta tested the agent with editors and advisors to refine accuracy.
  • Prepared for deployment via Slack for editorial use.

Why The Brown and White chose CustomGPT.ai

Selection criterion | Why it mattered | How CustomGPT.ai delivered
Sitemap ingestion | Archive spread across many URLs on a structured site | Automated sitemap crawl replaced hours of manual copy-paste
Scale (400M+ words) | Volume exceeded most no-code tools | Processed the full corpus without custom infrastructure
Anti-hallucination | Journalism requires cited, verifiable facts | RAG grounding with source citations; declines when unsure
No-code | Student builder, no engineering background | Full configuration through the UI
1,400+ format support | Podcast/multimedia expansion planned | Supports audio, video, and document formats
Persona customization | Different query types (journalists, faculty, readers) | No-code persona builder

Why it Worked

The success stemmed from a combination of CustomGPT’s powerful sitemap integration tools, support for numerous content formats, and a user-friendly, no-code interface. Nina was able to focus on curating content and shaping the agent’s Persona without needing technical expertise, making the platform ideal for education and journalism settings.

Conclusion

What started as a challenging proof-of-concept turned into a working solution that exceeded expectations. The AI agent, now trained on decades of content, is undergoing beta testing and will soon support research and editorial work for Lehigh’s student newspaper.

To read other case studies like this one, visit https://customgpt.ai/customers.

Frequently Asked Questions

What did The Brown and White build with CustomGPT.ai?

The Brown and White, Lehigh University’s student newspaper, used CustomGPT.ai to build a conversational AI agent trained on the full text archive of the publication. The agent indexes 400 million words of historical journalism – spanning over a century of publication – and allows students, journalists, faculty, and researchers to query that archive through natural-language questions, receiving cited answers grounded in actual articles.

How was the archive ingested into CustomGPT.ai?

Nina Cialone used CustomGPT.ai’s sitemap ingestion tools to index the archive. Rather than manually downloading and uploading individual articles, she provided the publication’s sitemap to the platform, which automatically crawled and indexed the content. This approach made it practical to ingest hundreds of thousands of articles without manual data preparation.
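
For readers curious what "providing a sitemap" involves, here is a minimal sketch of reading the page URLs out of a sitemap file. It is not CustomGPT.ai's ingestion code, which this case study does not describe. The sample URLs and the `extract_urls` helper are hypothetical; the only fixed detail is the XML namespace from the public sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample sitemap; the real archive's URLs are not given in this case study.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://archive.example.org/issues/issue-001</loc></url>
  <url><loc>https://archive.example.org/issues/issue-002</loc></url>
  <url><loc> https://archive.example.org/issues/issue-003 </loc></url>
</urlset>
"""


def extract_urls(sitemap_xml: str) -> list[str]:
    """Return every non-empty <loc> URL listed in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    urls = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        # Skip empty entries and trim stray whitespace around the URL.
        if loc.text and loc.text.strip():
            urls.append(loc.text.strip())
    return urls


if __name__ == "__main__":
    for url in extract_urls(SAMPLE_SITEMAP):
        print(url)
```

A crawler would then fetch each listed page and pass its text to the indexer, which is the step the platform automated in this project.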

How large is the indexed archive?

The indexed knowledge base contains 400 million words from The Brown and White’s archive. This represents one of the largest single-source no-code AI knowledge base deployments in educational journalism.

Who built the AI agent at The Brown and White?

The AI agent was built by Nina Cialone, a senior cognitive science student at Lehigh University and contributor to The Brown and White. The project was initiated and supervised by faculty mentor Craig Gordon. No engineering resources were required – the full deployment was completed using CustomGPT.ai’s no-code platform.

What is the AI agent used for?

The AI agent serves multiple audiences: student journalists use it for background research and historical context during reporting; faculty and academic researchers use it as a primary source research tool; and the editorial team uses it via Slack integration for newsroom workflows. Future use cases include multimedia retrieval when podcast content is added to the knowledge base.

How does CustomGPT.ai prevent the AI from fabricating historical facts?

CustomGPT.ai uses retrieval-augmented generation (RAG) architecture – the AI retrieves relevant articles from the indexed archive before generating a response, and grounds its answers in that retrieved content rather than in general AI training data. Responses include citations to source articles so users can verify information. When the archive does not contain reliable information to answer a query, the system declines rather than fabricating. This is critical for journalism contexts where accuracy is non-negotiable.
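
The retrieve, cite, or decline pattern described above can be illustrated with a toy sketch. It is an assumption-laden simplification, not CustomGPT.ai's implementation: the two archive entries are invented, retrieval uses plain word-overlap cosine similarity rather than a production search index, the `min_score` threshold is arbitrary, and the generation step is replaced by quoting the retrieved passage directly.

```python
import math
import re
from collections import Counter

# Hypothetical mini-archive; the IDs and text are invented for illustration.
ARCHIVE = [
    {"id": "article-1",
     "text": "The student senate voted to extend library hours during finals week."},
    {"id": "article-2",
     "text": "The football team won the rivalry game in overtime on Saturday."},
]

# Small illustrative stopword list so common words do not count as matches.
STOPWORDS = {"the", "a", "an", "to", "in", "on", "of", "during",
             "did", "when", "who", "what", "how"}


def tokenize(text: str) -> Counter:
    """Lowercase, split into words, and drop stopwords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def answer(question: str, archive=ARCHIVE, min_score: float = 0.2) -> dict:
    """Retrieve the best-matching article; decline if nothing is relevant enough."""
    query = tokenize(question)
    scored = [(cosine(query, tokenize(doc["text"])), doc) for doc in archive]
    if not scored:
        return {"answer": None, "sources": []}
    best_score, best_doc = max(scored, key=lambda pair: pair[0])
    if best_score < min_score:
        # Nothing in the archive supports an answer, so decline instead of guessing.
        return {"answer": None, "sources": []}
    # A real system would pass the passage to a language model; here we quote it.
    return {"answer": best_doc["text"], "sources": [best_doc["id"]]}


if __name__ == "__main__":
    print(answer("When did the senate extend library hours?"))
    print(answer("Who designed the campus chapel?"))
```

The first question matches an archived article and comes back with its source ID; the second has no support in the archive, so the sketch returns no answer rather than inventing one.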

Can the system be expanded to include multimedia content?

Yes. CustomGPT.ai supports over 1,400 data formats. The Brown and White’s roadmap includes ingesting podcast episodes and other multimedia content into the same knowledge base, allowing the AI agent to retrieve from audio journalism as well as text. This expansion is possible without rebuilding the existing system.

What can other universities learn from this project?

The Brown and White project demonstrates that a university can make a century-scale institutional archive conversationally accessible using a no-code AI platform, without a dedicated engineering team, without significant budget, and within a single semester. The model is applicable to any university with a significant digital archive, such as student newspapers, library collections, research repositories, faculty publications, and administrative records. See CustomGPT.ai enterprise knowledge search.

How is the AI agent deployed for editorial use?

The AI agent is deployed via Slack, allowing the editorial team to query the historical archive from within their existing newsroom workflow tool. This integration means editorial staff can access over a century of institutional journalism without switching platforms or learning a new interface.

Why is conversational archive search important for student journalism?

Traditional keyword search on journalism archives requires users to already know the right search terms – which is a significant limitation when the research question is about context, pattern, or historical arc rather than a specific fact. Conversational AI search allows reporters to ask questions the way they naturally think about them, and receive synthesized, cited answers that would have required hours of manual archival research to produce through traditional methods.

Ready to try CustomGPT.ai for yourself?

Create a custom AI agent in minutes. Drive workplace productivity and enhance customer engagement with AI that knows your business.
