Milestone 1: Dataset and proposal#
1. Deliverable Description#
The goal of this milestone is to define your project goals and direction. There are three deliverables:
Group Presentation – Introduce your dataset and project plan (due Sep 18)
Written Proposal – Detailed plan of around 1200 words (due Sep 19)
Peer Reviews – Review other groups’ proposals individually (due Sep 23)
2. Presentation Guidelines#
Time: Maximum 8 minutes + 2 minutes for Q&A
Slides: Maximum 5 slides
Participation: All team members must contribute, either in presenting or answering questions
Content:
Introduce and describe your dataset
Summarize your project plan as outlined in your written proposal
2.1. Suggested 5-Slide Template#
Slide 1: Title & Team#
Project title
Team member names
Course name & date
Optional: a visual/logo or thematic image for your dataset
Slide 2: Dataset Overview#
Name and source of the dataset
Size (# of records, features, etc.)
Type of data (structured, unstructured, nested, etc.)
Why this dataset is interesting or relevant for your project
Slide 3: Problem / Research Question#
The main problem your project will address
Key questions you aim to answer
Optional: any hypotheses you have
Slide 4: Proposed Approach#
Brief description of the methods/tools you will use (e.g., MongoDB, PySpark, RAG, ML models)
Steps or workflow of your analysis/project
Expected challenges and how you might address them
Slide 5: Expected Outcomes & Impact#
Expected results or insights from the project
Potential applications or implications
How success will be measured
Tips for Presentation#
Assign 1–2 slides per team member to present, so everyone participates
Keep text minimal—use visuals, charts, or diagrams where possible
Practice timing: ~1.5 minutes per slide to stay within 8 minutes
3. Proposal Components#
Your proposal should contain the following sections, presented in this order.
3.1. Introduction#
The introduction should start broad, introducing the question being asked or the problem to be solved, and explaining why it is important. Include any information needed to understand the question or problem and its importance. State the question or problem in your own words; this shows your professor and classmates that you understand it.
Next, you should refine the big-picture problem into tangible objectives that are directly addressable by data science techniques.
Finally, describe the final data product to be delivered to the partner. Example components of this product might include (but are not limited to) one or more of the following:
A data pipeline
Documentation
A dashboard
3.2. Data Science Techniques#
Describe how you will use data science techniques in the project. Discuss how well the data suits the proposed techniques, as well as any difficulties the data might pose.
You should include a description of the data (variables/features and observational units) and some examples/snippets of what the data looks like (as a table or a visualization).
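One way to produce such a snippet is sketched below, assuming the data is available as CSV text. It reports the number of records and features and renders the first few rows as a Markdown table that can be pasted into the proposal. The function name `preview_as_markdown` and the sample data are hypothetical and only illustrate the idea; adapt them to your own dataset and tools.

```python
import csv
import io


def preview_as_markdown(csv_text, n_rows=3):
    """Return (record_count, column_names, markdown_table) for CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = list(reader.fieldnames or [])
    header = "| " + " | ".join(columns) + " |"
    divider = "| " + " | ".join("---" for _ in columns) + " |"
    # Only the first n_rows records go into the preview table.
    body = ["| " + " | ".join(row[c] for c in columns) + " |" for row in rows[:n_rows]]
    return len(rows), columns, "\n".join([header, divider, *body])


# Hypothetical sample data, for illustration only.
SAMPLE = """station_id,date,temp_c
A1,2024-01-01,3.5
A1,2024-01-02,4.1
B2,2024-01-01,-1.0
B2,2024-01-02,0.2
"""

count, columns, table = preview_as_markdown(SAMPLE, n_rows=2)
print(f"{count} records, {len(columns)} features: {', '.join(columns)}")
print(table)
```

Here each row is one observational unit and each column is one variable; a short table like this, together with the record and feature counts, covers the description asked for above.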
Explain how your project uses concepts or tools from other ADSC courses this term.
Include clear success criteria for the project, stating which evaluation metrics you will use and why.
3.3. Timeline#
Indicate a rough timeline of the project, including the milestones you hope to achieve.
3.4. References#
A list of references with a citation style of your choice.
3.5. Format#
Around 1200 words
Write the proposal in Markdown or Jupyter Notebook format and save it as a .md or .ipynb file in your group’s GitHub repository.
Use clear headings, bullet points, and formatting to enhance readability.
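To check a draft against the target of around 1200 words, a rough count can be scripted. The sketch below is one possible approach, not a required tool: the function names are hypothetical, a "word" is taken to be any whitespace-separated token containing a letter or digit (so Markdown markers such as `#` or `-` are skipped), and for notebooks only Markdown cells are counted, on the assumption that code cells should not count toward the length.

```python
import json
import re


def count_words_markdown(text):
    """Count whitespace-separated tokens that contain a letter or digit."""
    return sum(1 for token in text.split() if re.search(r"[^\W_]", token))


def count_words_notebook(notebook_json):
    """Count words in the Markdown cells of a notebook given as a JSON string."""
    notebook = json.loads(notebook_json)
    total = 0
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", "")
        # The notebook format stores source as a string or a list of strings.
        if isinstance(source, list):
            source = "".join(source)
        total += count_words_markdown(source)
    return total


# Hypothetical inputs, for illustration only.
md_text = "# Introduction\n\n- We study **bike** trips.\n"
nb_text = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Data\n", "Two tables."]},
        {"cell_type": "code", "source": "print(1)"},
    ]
})

print(count_words_markdown(md_text))
print(count_words_notebook(nb_text))
```

For a real file, read its contents first, for example with `pathlib.Path("proposal.md").read_text()`, where the file name is a placeholder for your own.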
4. Peer-review activity#
Each student will be assigned 2 proposals to review.
For each proposal, provide about half a page of feedback as a GitHub Issue in that group’s repository.
Peer review accounts for 5% of your grade.
This activity is to be completed individually.
Your feedback will be graded based on:
Preparedness: The reviewer has read the proposal carefully, shows a deep understanding of it, and applies critical thinking.
Constructiveness: The suggestions provided are highly constructive, actionable, and framed in a positive manner.
Professionalism: Feedback is professional, respectful, and written in a supportive tone.
SUBMISSION INSTRUCTIONS:
Navigate to the repository of the group you are reviewing.
Create a GitHub Issue using the peer feedback template.
Fill out your feedback in the issue.
Submit the issue.
