Capstone Project

Advanced Data Analytics • Individual Projects • 100% of Final Grade

2025

Individual ML project • 8-10 page report • Source code & dataset • 15-min video

Key Information

  • Individual Work: All projects are solo
  • Important Dates: Proposal Nov 10, Final Dec 21
  • Grading: 100% based on project
  • Deliverables: Report, Source Code, Dataset, Video
  • Proposal: 200 words by Nov 10
  • Video Length: 15 minutes max

Report Requirements

Your report must be 8-10 pages (excluding references) in SIAM conference style:

⚠️ Important Format Requirements
  • 8-10 pages (minimum 8, excluding references)
  • SIAM conference style (required format)
  • Submit via email: Report (PDF) + Code + Dataset + Video
  • Include all 8 mandatory sections
  • Due December 21, 2025 at 23:59

Project Scope & Guidelines

📌 Project Requirements

Your project must be a classical machine learning project using real-world data (not pre-solved Kaggle competitions or overused tutorial datasets like Titanic/MNIST). Apply techniques learned in the course: regression, classification, clustering, or deep learning with proper methodology including data cleaning, exploratory analysis, model selection, and evaluation.
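As one illustration of the "model selection and evaluation" step, the sketch below compares a trivial mean-prediction baseline against a one-feature least-squares model using k-fold cross-validation. It is a minimal sketch under stated assumptions, not a required workflow: the function names (`k_fold_indices`, `fit_line`, `cross_validate`) are hypothetical, only the Python standard library is used, and the synthetic data is a placeholder that you would replace with your own real-world dataset.

```python
import random
import statistics


def k_fold_indices(n, k, seed=0):
    """Shuffle row indices once and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return (sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)) ** 0.5


def cross_validate(xs, ys, k=5, seed=0):
    """Return mean held-out RMSE for (mean baseline, linear model) over k folds."""
    baseline_scores, model_scores = [], []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        x_train = [xs[i] for i in train]
        y_train = [ys[i] for i in train]
        y_test = [ys[i] for i in held_out]

        # Both models are fitted on the training rows only, so the
        # held-out fold never influences the fitted parameters.
        baseline = statistics.fmean(y_train)
        slope, intercept = fit_line(x_train, y_train)

        baseline_scores.append(rmse(y_test, [baseline] * len(y_test)))
        model_scores.append(
            rmse(y_test, [slope * xs[i] + intercept for i in held_out])
        )
    return statistics.fmean(baseline_scores), statistics.fmean(model_scores)


# Placeholder data (assumption): synthetic points standing in for a real dataset.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [3.0 * x + 5.0 + rng.gauss(0, 1.0) for x in xs]

baseline_rmse, model_rmse = cross_validate(xs, ys)
print(f"baseline RMSE: {baseline_rmse:.2f}, linear model RMSE: {model_rmse:.2f}")
```

Reporting a simple baseline next to each model is a design choice worth keeping in a real project: it shows how much of the score comes from the model rather than from the data alone.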

⚠️ Academic Integrity

Plagiarism will result in immediate failure. All projects are individual: no code sharing between students. You must cite all external code sources, list any AI tools used (e.g., ChatGPT, Copilot) in the appendix, properly reference the origin of your dataset, and write the report in your own words.

💡 What Makes a Good Project

Choose a problem with real impact: predict housing prices with feature engineering, classify customer churn, analyze financial sentiment, forecast energy consumption, or tackle a domain-specific classification problem. Your project should demonstrate both technical skill and practical insight.

❌ What to Avoid

Do not submit projects using overused datasets (Titanic, Iris, MNIST without innovation), Kaggle competitions with public solutions, purely theoretical work without implementation, or direct replications of course exercises. Projects must go beyond basic statistics to include genuine machine learning techniques.

Project Structure

All projects are individual. You must propose and carry out a data science project that demonstrates mastery of course content.

Mandatory Report Sections

📊

Research Paper Structure

8 Required Sections

Your report must contain these 8 mandatory sections, plus an appendix, in SIAM conference style:

Required Sections
  1. Abstract: Summary of the project and findings
  2. Introduction: Context and motivation
  3. Research Question & Literature: Problem statement and relevant work
  4. Methodology: ML/statistical methods applied
  5. Data Description: Source, cleaning, preprocessing (dataset must be submitted)
  6. Implementation: Code discussion and architecture
  7. Results: Findings and analysis
  8. Conclusion: Summary and future work
  9. Appendix (in addition to the 8 sections): List of AI and helper tools used (ChatGPT, Copilot, etc.)

Important Deadlines & Submission

⚠️ DEADLINE

Proposal Submission

Monday, November 10, 2025

Submit a 200-word proposal describing your project idea to the TAs for approval.

Requirements: A brief description of the data source, methodology, and expected outcomes

⚠️ DEADLINE

Final Submission

Sunday, December 21, 2025 at 23:59

Submit all deliverables via email to the TAs. No late submissions will be accepted!

Required Deliverables:
  • PDF report (8-10 pages in SIAM style)
  • Source code (all Python files)
  • Dataset used for analysis
  • 15-minute video presentation

✓ POLICY

No Significant Results Policy

Full marks still possible

Projects can receive full marks even if the hypothesis fails, as long as the story is well told and the methodology is sound.

Note: We encourage honest, scholarly reporting over results obtained from rigged data.