Milestone 3: RAG Pipeline#
1. Deliverable Description#
The goal of this milestone is to design, implement, and present your Retrieval-Augmented Generation (RAG) pipeline. Your team should demonstrate how you integrated retrieval and generation components, how your system handles user queries, and how you evaluated and refined your pipeline.
There are three deliverables:
Group Presentation – Showcase your RAG pipeline (due Oct 30)
Technical Report – Around 500–800 words (2–3 pages) (due Nov 1)
Peer Reviews – Review other groups’ RAG presentations or reports (due Nov 1)
2. Presentation Guidelines#
Time: Maximum 8 minutes + 2 minutes for Q&A
Slides: Maximum 6 slides
Participation: All team members must contribute
Content:
Present your RAG pipeline diagram (data flow, retriever, generator)
Explain the query transformation techniques you applied
Show examples of prompt templates used in your pipeline
Demonstrate your retrieval and generation process using LangSmith traces or logs
Reflect on challenges and design decisions
3. Suggested 6-Slide Template#
Slide 1: Title & Team#
Slide 2: RAG Pipeline Overview#
High-level workflow diagram (a minimal code sketch of this flow appears after this list):
Document ingestion → Embedding model → Vector store → Retriever → LLM prompt → Response
Indicate tools used (e.g., LangChain, MongoDB Atlas Vector Search, Gemini/OpenAI API)
Highlight any unique design choices
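For orientation, here is a minimal sketch of the workflow above in LangChain-style Python. It is an illustration only: it assumes the langchain-openai, langchain-community, and langchain-text-splitters packages, and it uses an in-memory FAISS store and an OpenAI chat model as stand-ins, so substitute your own embedding model, vector store (e.g., MongoDB Atlas), and LLM.

```python
# Minimal RAG flow sketch: ingest -> embed -> store -> retrieve -> prompt -> respond.
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import ChatPromptTemplate
from langchain_text_splitters import RecursiveCharacterTextSplitter

raw_texts = ["TRU released an AI ethics policy in 2024.", "Another ingested document."]
chunks = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50).create_documents(raw_texts)

vector_store = FAISS.from_documents(chunks, OpenAIEmbeddings())   # embed + store
retriever = vector_store.as_retriever(search_kwargs={"k": 4})     # top-k retriever

prompt = ChatPromptTemplate.from_template(
    "Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
)
llm = ChatOpenAI(model="gpt-4o-mini")

question = "What policies did TRU release on AI ethics in 2024?"
context = "\n\n".join(d.page_content for d in retriever.invoke(question))
response = llm.invoke(prompt.format_messages(context=context, question=question))
print(response.content)
```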
Slide 3: Query Transformation Techniques#
Describe techniques used to improve retrieval (e.g., MultiQuery, RAG-Fusion, Query Decomposition); a MultiQuery sketch appears after this list
Show examples of transformed queries
Explain why your team chose these techniques
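One concrete option, if your team chose MultiQuery, is LangChain's MultiQueryRetriever, which asks the LLM to rewrite the user query into several variants and merges the unique documents retrieved for each. The `vector_store` and `llm` objects below are assumed to come from your existing pipeline.

```python
# MultiQuery sketch: the LLM rewrites the user query into several variants,
# each variant is retrieved separately, and the unique documents are merged.
from langchain.retrievers.multi_query import MultiQueryRetriever

multi_retriever = MultiQueryRetriever.from_llm(
    retriever=vector_store.as_retriever(search_kwargs={"k": 4}),
    llm=llm,
)

docs = multi_retriever.invoke("What policies did TRU release on AI ethics in 2024?")
for d in docs:
    print(d.metadata.get("source"), "->", d.page_content[:80])
```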
Slide 4: Prompt Engineering#
Show your prompt templates and variable placeholders (an example template appears after this list)
Explain how you format retrieved context into prompts
Discuss prompt refinement and lessons learned
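As one possible starting point (the placeholder names `context` and `question` are a convention, not a requirement), a LangChain ChatPromptTemplate version might look like this:

```python
# Prompt template sketch with explicit variable placeholders.
from langchain_core.prompts import ChatPromptTemplate

rag_prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You are a helpful assistant for TRU policy questions. "
     "Answer ONLY from the provided context; if the context is insufficient, say so."),
    ("human",
     "Context:\n{context}\n\nQuestion: {question}"),
])

# Retrieved documents are typically joined into one string before filling {context}.
messages = rag_prompt.format_messages(
    context="Doc 1: ...\n\nDoc 2: ...",
    question="What policies did TRU release on AI ethics in 2024?",
)
```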
Slide 5: Example Query & “Under the Hood” View#
Example user query (e.g., “What policies did TRU release on AI ethics in 2024?”)
Show how it is transformed → retrieved → generated
Screenshot from LangSmith trace showing:
Steps of retrieval
Prompt construction
Model response
Interpret what happened and how your RAG pipeline performed
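To produce those trace screenshots, LangSmith tracing is normally switched on through environment variables before your chain runs; the project name below is a placeholder.

```python
# Enable LangSmith tracing so every retrieval and LLM call is recorded as a run.
import os

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = "milestone3-rag"   # placeholder project name
# os.environ["LANGCHAIN_API_KEY"] = "..."            # usually set in your shell, not in code

# Any LangChain invocation made after this point (retriever.invoke, llm.invoke, chain.invoke)
# will appear as a trace in the LangSmith UI, where you can screenshot the retrieval steps,
# the constructed prompt, and the model response for your slides.
```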
Slide 6: Reflection & Next Steps#
Key insights and takeaways
Challenges (e.g., noisy retrievals, hallucinations, latency)
Planned improvements (e.g., rerankers, prompt optimization, hybrid retrieval)
4. Technical Report Components#
4.1. Introduction#
Briefly describe your project’s goal and how RAG fits into it
Define the purpose of your pipeline (e.g., Q&A, summarization, recommendation)
4.2. Pipeline Architecture#
Include a system diagram and explain each component:
Data ingestion
Embedding generation
Vector storage (MongoDB Atlas)
Retriever configuration
LLM generation
Mention models and libraries used
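As one illustration of retriever configuration (the connection string, database, collection, and index names below are placeholders), a MongoDB Atlas Vector Search store can be wired to LangChain roughly as follows; check the import path against the langchain-mongodb version you installed.

```python
# Retriever configuration sketch for MongoDB Atlas Vector Search (placeholder names).
from pymongo import MongoClient
from langchain_mongodb import MongoDBAtlasVectorSearch
from langchain_openai import OpenAIEmbeddings

client = MongoClient("mongodb+srv://<user>:<password>@<cluster>/")   # placeholder URI
collection = client["rag_db"]["documents"]                           # placeholder names

vector_store = MongoDBAtlasVectorSearch(
    collection=collection,
    embedding=OpenAIEmbeddings(),
    index_name="vector_index",        # must match the Atlas search index you created
)

retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 4},           # top-k documents passed to the LLM
)
```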
4.3. Query Transformation#
Detail the transformation techniques you implemented
Include sample code or screenshots showing before-and-after queries (a RAG-Fusion fusion-step sketch appears after this list)
Justify your design decisions
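If RAG-Fusion is one of your techniques, the fusion step itself is compact enough to include verbatim. Below is a plain-Python sketch of reciprocal rank fusion over the ranked lists returned for several transformed queries (the constant k=60 is a common default, not a requirement).

```python
# Reciprocal rank fusion (RRF) sketch: merge ranked document lists produced by
# several transformed queries into one fused ranking.
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k=60):
    """ranked_lists: list of lists of document IDs, one list per transformed query."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] += 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Example: three query variants returned overlapping but differently ordered results.
fused = reciprocal_rank_fusion([
    ["doc_a", "doc_b", "doc_c"],
    ["doc_b", "doc_a", "doc_d"],
    ["doc_c", "doc_b", "doc_e"],
])
print(fused)  # doc_b ranks first because it appears high in every list
```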
4.4. Prompt Engineering#
Provide your base prompt(s) and explain how they evolved
Discuss context window management (e.g., truncation, top-k docs)
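A simple way to make the context-window discussion concrete is a budgeted join of the top-k documents. The sketch below uses an arbitrary character budget; a token-based budget (e.g., with tiktoken) is more precise.

```python
# Context budgeting sketch: keep the top-k retrieved documents, then trim the joined
# context to a rough character budget so the prompt stays inside the model's window.
def build_context(docs, k=4, max_chars=6000):
    selected = docs[:k]                              # retriever already returns ranked docs
    context = "\n\n".join(d.page_content for d in selected)
    return context[:max_chars]                       # hard truncation as a last resort

# context = build_context(retriever.invoke(question))
```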
4.5. Example Walkthrough (with LangSmith)#
Choose one representative query
Show the trace or log of its processing steps:
User query
Transformed query
Retrieved documents (with metadata)
Prompt sent to LLM
Model output
Comment on performance, accuracy, and relevance
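In addition to UI screenshots, the langsmith Python client can retrieve the same run data programmatically, which is convenient for embedding trace details in a notebook report; the project name is a placeholder and the available fields depend on your SDK version.

```python
# Sketch: fetch recent runs from a LangSmith project to document the walkthrough.
# Requires your LangSmith API key in the environment.
from langsmith import Client

client = Client()
runs = client.list_runs(project_name="milestone3-rag", limit=5)   # placeholder project
for run in runs:
    print(run.name, run.run_type, run.start_time)
```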
4.6. Reflection#
What worked well in your pipeline?
What limitations did you observe?
How could it be improved (e.g., reranker, metadata filtering, evaluation metrics)?
4.7. References#
Cite any frameworks, tutorials, or models used
4.8. Format#
Around 500–800 words
Write in Markdown or Jupyter Notebook format (.md or .ipynb)
Include diagrams, screenshots, and traces where appropriate
5. Peer Review Activity#
Each student will review 2 RAG presentations or reports. Provide half a page of feedback for each using GitHub Issues. Peer review accounts for 5% of your grade.
Submission Steps:#
Visit the group’s GitHub repository.
Create a GitHub Issue using the peer review template.
Add your comments and submit.