This document provides detailed instructions for demonstrating the RAG Chatbot application.
For the best demo experience, prepare 2-3 sample documents:
Option A: Use sample documents
- Download a few PDF articles or research papers
- Create a TXT file with relevant information
- Examples: Technical documentation, product guides, research papers
Option B: Create custom documents. Create a simple TXT file with content like:
```text
About RAG Systems

Retrieval-Augmented Generation (RAG) is a technique that enhances Large Language Models
by providing them with relevant context from a knowledge base. This allows the model to
generate more accurate and contextual responses.

Key benefits:
- Improved accuracy
- Source attribution
- Domain-specific knowledge
- Reduced hallucinations
```
```bash
# Check backend
curl http://localhost:8000/api/health
# Should return: {"status":"healthy","service":"RAG Chatbot API"}
```

Talking Points:
- "This is a RAG Chatbot that lets you ask questions about your documents"
- "It provides answers with full source attribution"
- "Built with Next.js, FastAPI, and LangChain"
Steps:
- Navigate to the Upload tab
- Click "Choose a file" and select a sample document
- Show the file preview with size information
- Click "Upload Document"
- Point out: Success message showing number of chunks indexed
- Explain: "The document is split into chunks and converted to embeddings"
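The chunk-and-embed step can be sketched roughly as below. This is a minimal illustration only; the real backend presumably uses LangChain's text splitters, and the chunk size and overlap values here are arbitrary:

```python
def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks -- a simplified stand-in for a
    LangChain text splitter. Overlap preserves context across chunk borders."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

document = "RAG grounds LLM answers in your documents. " * 40
chunks = split_into_chunks(document)
print(f"Indexed {len(chunks)} chunks")  # mirrors the success message shown after upload
```

Each chunk would then be passed to an embedding model and stored in the vector database.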
Steps:
- Navigate to the Chat tab
- Ask a simple question: "What is this document about?"
- Show:
  - The thinking indicator while processing
  - The answer appears with proper formatting
  - Sources panel updates automatically
- Click "View X sources" link
- Demonstrate:
  - Source cards with relevance scores
  - Document excerpts
  - Page numbers (for PDFs)
- Ask a more specific question related to the content
- Show: Different sources may be retrieved for different questions
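Different questions retrieving different sources is the heart of semantic retrieval: each question is scored against every stored chunk and the best matches win. A toy sketch using word overlap in place of real embeddings (the actual app uses an embedding model and vector similarity):

```python
import re

def score(question: str, chunk: str) -> float:
    """Toy relevance score: fraction of question words that appear in the chunk.
    The real system compares embedding vectors instead of raw words."""
    q_words = set(re.findall(r"\w+", question.lower()))
    c_words = set(re.findall(r"\w+", chunk.lower()))
    return len(q_words & c_words) / len(q_words)

chunks = [
    "To install the app, run pip install followed by npm install.",
    "You can reset your password from the account settings page.",
]

for question in ["How do I install this?", "How can I reset my password?"]:
    best = max(chunks, key=lambda c: score(question, c))
    print(f"{question!r} -> {best!r}")
```

Each question surfaces a different chunk, which is exactly the behavior to point out in the sources panel.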
Steps:
- Navigate to the Stats tab
- Show:
  - Total documents indexed
  - Collection name
  - Embedding model information
- Explain: "This shows how many document chunks are in the system"
Talking Points:
- "Let me show you what's happening behind the scenes"
- Open browser DevTools Network tab
- Ask another question
- Show:
  - API request to /api/query
  - Request payload with the question
  - Response with answer and sources
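For the DevTools walkthrough it helps to know the rough shape of the traffic. The field names below are illustrative assumptions, not the backend's exact schema -- verify them against your own Network tab:

```python
import json

# Hypothetical request/response shapes for POST /api/query.
# Field names are assumptions for demo narration; check the real payloads.
request_payload = {"question": "What is this document about?"}

response_payload = {
    "answer": "The document explains Retrieval-Augmented Generation...",
    "sources": [
        {
            "excerpt": "RAG enhances LLMs with relevant context...",
            "score": 0.87,
            "page": 1,
        }
    ],
}

print(json.dumps(request_payload))
print(json.dumps(response_payload, indent=2))
```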
- "What are the main features?"
- "How do I install this?"
- "What are the system requirements?"
- "What is the main finding?"
- "What methodology was used?"
- "What are the conclusions?"
- "What problem does this solve?"
- "Who is the target audience?"
- "What are the key benefits?"
- Upload 2-3 related documents
- Ask a question that requires information from multiple sources
- Show how sources from different documents are retrieved
- Ask a question
- Show the answer
- Open the sources panel
- Read the actual text from the source
- Verify the answer matches the source
- Ask a question completely unrelated to the documents
- Show how the system responds with "I don't know"
- Explain: "This prevents hallucinations"
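The "I don't know" behavior is typically enforced in the prompt: the model is instructed to answer only from the retrieved context and refuse otherwise. A sketch of such a prompt template (the wording is illustrative, not the app's actual prompt):

```python
def build_prompt(context_chunks: list[str], question: str) -> str:
    """Assemble a grounded prompt. Telling the model to refuse off-context
    questions is what curbs hallucinations in a RAG pipeline."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using ONLY the context below. "
        "If the context does not contain the answer, reply exactly: I don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

prompt = build_prompt(
    ["RAG grounds answers in retrieved documents."],
    "What is the capital of France?",
)
print(prompt)
```

With an unrelated question like this one, a well-behaved model should fall back to the refusal phrase.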
Setup: Upload product documentation
Demo:
- "How can I reset my password?"
- "What are the warranty terms?"
- Show how support teams can use this
Setup: Upload research papers
Demo:
- "What were the key findings?"
- "Compare the methodologies"
- Show how researchers can quickly extract information
Setup: Upload company policies
Demo:
- "What is the vacation policy?"
- "How do I submit expenses?"
- Show employee self-service use case
- "The system is processing thousands of vectors"
- "In production, this can be optimized with caching"
- "The quality depends on document quality and chunk size"
- "This can be tuned for specific use cases"
- Keep a backup video/screenshots ready
- Explain the architecture while resolving any issue
Expected Questions:
Q: How accurate is it? A: Accuracy depends on document quality and relevance. We use semantic search to find the most relevant chunks, and the LLM generates answers based only on those sources.
Q: Can it work with other languages? A: Yes! Both the embedding model and the LLM support multiple languages. You'd just need to adjust the models.
Q: What about data privacy? A: Documents are stored locally in ChromaDB. For production, you can use self-hosted models or ensure compliance with data policies.
Q: How much does it cost? A: Main costs are Google Gemini API calls (input + output tokens). You can use local models to eliminate API costs.
Q: Can it scale? A: Yes! ChromaDB can be deployed in client-server mode, and the backend can be scaled horizontally.
Q: What file types are supported? A: Currently PDF and TXT. Easy to add more with LangChain's document loaders (DOCX, HTML, CSV, etc.).
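Adding more file types usually comes down to dispatching on the extension to the right loader. A sketch of that dispatch; the class names follow LangChain's community document loaders, but verify the imports against the LangChain version you run:

```python
from pathlib import Path

# Loader class names follow LangChain's community document loaders;
# confirm the actual import paths for your installed version.
LOADERS = {
    ".pdf": "PyPDFLoader",
    ".txt": "TextLoader",
    ".docx": "Docx2txtLoader",
    ".csv": "CSVLoader",
    ".html": "UnstructuredHTMLLoader",
}

def pick_loader(filename: str) -> str:
    """Map a file extension to its loader, case-insensitively."""
    ext = Path(filename).suffix.lower()
    try:
        return LOADERS[ext]
    except KeyError:
        raise ValueError(f"Unsupported file type: {ext}") from None

print(pick_loader("manual.PDF"))
```

Supporting a new format is then one new dictionary entry plus its loader dependency.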
- Practice beforehand: Know your sample documents well
- Clear data between demos: Use the "Clear All Documents" feature
- Prepare for slow responses: Have talking points ready
- Show the code: Briefly show the clean, readable codebase
- Emphasize practical use cases: Connect features to real-world problems
If creating a video demo:
- Set up screen recording at 1080p
- Close unnecessary tabs and applications
- Zoom in on text when showing code or sources
- Use a script but sound natural
- Add captions for key features
- Keep it under 5 minutes to hold viewers' attention
Before starting:
- Backend running and healthy
- Frontend running on localhost:3000
- Sample documents prepared
- Browser cache cleared
- DevTools closed (or ready to open)
- Tested upload and query flow
- Notes/script ready
Good luck with your demo! 🎬