Individual Project Rulebook
Your journey to mastery through data science
Advanced Programming · Fall 2025 · HEC Lausanne
Your Final Project = 100% of Your Grade
Build a machine learning or data analysis project that demonstrates your programming and analytical skills.
Project Philosophy
This is your opportunity to apply what you've learned to a real problem. Most projects will focus on machine learning and data analysis: comparing models, analyzing datasets, and presenting findings.
Core Principles:
- Working Code First: If we can't run python main.py, we can't grade it
- Reproducibility: Same code → same results, every time
- Clear Documentation: README explains setup, REPORT explains findings
- Clean Structure: Organized code that others can understand
Project Focus: Machine Learning
Most projects should follow this pattern:
Research Question → Data → Models → Evaluation → Report
Example: "Which model best predicts customer churn: Random Forest, Logistic Regression, or XGBoost?"
Recommended Project Types
- Classification: Predict categories (spam detection, sentiment analysis, disease diagnosis)
- Regression: Predict values (house prices, sales forecasting, stock returns)
- Clustering: Find patterns (customer segmentation, anomaly detection)
- Comparison Studies: Compare 3+ models on the same task
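A comparison study along these lines could be sketched as follows. This is a minimal illustration that assumes scikit-learn is installed. The Iris dataset, the three candidate models, and the helper name `compare_models` are choices made for this example, not course requirements.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

SEED = 42  # one fixed seed, reused everywhere, for reproducibility


def compare_models(seed: int = SEED) -> dict[str, float]:
    """Return the mean 5-fold cross-validated accuracy of each candidate model."""
    X, y = load_iris(return_X_y=True)

    # Hypothetical candidate set; swap in the models your research question needs.
    models = {
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000, random_state=seed)
        ),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "k_nearest_neighbors": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    }

    # Every model is scored on the same shuffled, stratified folds.
    folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    return {
        name: float(cross_val_score(model, X, y, cv=folds, scoring="accuracy").mean())
        for name, model in models.items()
    }


if __name__ == "__main__":
    for name, accuracy in compare_models().items():
        print(f"{name}: {accuracy:.3f}")
```

Scoring every model on identical folds keeps the comparison fair, and fixing the seed means a second run gives the same numbers.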
Custom Projects
Want to do something different? As long as you can demonstrate:
- Data manipulation with pandas/NumPy
- Some form of analysis or modeling
- Clear methodology and results
- Reproducible code
...then propose it!
Deliverables
1. Project Proposal (Due: November 3)
- Length: 300-500 words
- Format: PROPOSAL.md in your repository
- Content: What you'll build, what data you'll use, what question you'll answer
2. GitHub Repository (Due: January 11)
Required Structure:
your-project/
├── README.md            # Setup and usage instructions
├── PROPOSAL.md          # Your project proposal
├── environment.yml      # Conda dependencies
├── requirements.txt     # Pip dependencies (alternative)
├── main.py              # Entry point - THIS MUST WORK
├── src/                 # Source code modules
│   ├── __init__.py
│   ├── data_loader.py   # Data loading and preprocessing
│   ├── models.py        # Model definitions
│   └── evaluation.py    # Evaluation and visualization
├── data/
│   └── raw/             # Original data (or instructions to download)
├── results/             # Output figures and metrics
└── notebooks/           # Jupyter notebooks (optional, for exploration)
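One way to wire this structure together is sketched below. It is a minimal illustration, not the course's reference implementation. The functions `load_data`, `train_models`, and `evaluate` are hypothetical stand-ins for what would live in `src/data_loader.py`, `src/models.py`, and `src/evaluation.py`. The toy data and toy models are placeholders, and the `metrics.json` file name is an assumption.

```python
"""main.py: single entry point that runs the whole pipeline."""
import json
import random
from pathlib import Path
from statistics import mean

SEED = 42  # fixed seed so every run produces the same output


def load_data(seed: int) -> list[tuple[float, int]]:
    """Stand-in for src/data_loader.py: toy (feature, label) rows."""
    rng = random.Random(seed)
    rows = []
    for _ in range(200):
        x = rng.uniform(0.0, 1.0)
        rows.append((x, int(x > 0.5)))
    return rows


def train_models(train_rows):
    """Stand-in for src/models.py: fit two toy models, return predict functions."""
    labels = [label for _, label in train_rows]
    majority = int(sum(labels) * 2 >= len(labels))
    mean_0 = mean(x for x, label in train_rows if label == 0)
    mean_1 = mean(x for x, label in train_rows if label == 1)
    threshold = (mean_0 + mean_1) / 2  # midpoint between the class means
    return {
        "majority_baseline": lambda x: majority,
        "midpoint_threshold": lambda x: int(x > threshold),
    }


def evaluate(models, test_rows) -> dict[str, float]:
    """Stand-in for src/evaluation.py: accuracy of each model on held-out rows."""
    return {
        name: sum(predict(x) == label for x, label in test_rows) / len(test_rows)
        for name, predict in models.items()
    }


def main(results_dir: Path = Path("results")) -> dict[str, float]:
    rows = load_data(SEED)
    train_rows, test_rows = rows[:150], rows[150:]  # simple hold-out split
    metrics = evaluate(train_models(train_rows), test_rows)

    # Write metrics where the required structure expects outputs.
    results_dir.mkdir(parents=True, exist_ok=True)
    (results_dir / "metrics.json").write_text(json.dumps(metrics, indent=2))
    for name, accuracy in metrics.items():
        print(f"{name}: accuracy = {accuracy:.3f}")
    return metrics


if __name__ == "__main__":
    main()
```

In a real project the three stand-in functions would be imported from the `src` package, so that `main.py` stays a short script that only orchestrates the steps.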
Key Requirements:
- python main.py must run without errors
- All dependencies listed in environment.yml or requirements.txt
- Data included or clear download instructions
- Use random_state for reproducibility
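In practice, reproducibility usually means defining one seed and passing it to every source of randomness. The sketch below is one assumption about how that might look, not a prescribed pattern. `SEED`, `set_global_seed`, and `sample_indices` are hypothetical names, and the NumPy call is skipped when NumPy is not installed.

```python
import random

SEED = 42  # single source of truth for every random operation


def set_global_seed(seed: int = SEED) -> None:
    """Seed the global random number generators this project relies on."""
    random.seed(seed)
    try:
        import numpy as np
    except ImportError:
        return  # NumPy is not installed, so there is nothing more to seed
    np.random.seed(seed)


def sample_indices(n_rows: int, n_samples: int, seed: int = SEED) -> list[int]:
    """Draw a reproducible sample of row indices with a local generator."""
    rng = random.Random(seed)  # local generator, unaffected by other code
    return rng.sample(range(n_rows), n_samples)
```

With scikit-learn, the same seed would be passed as `random_state=SEED` to every estimator and splitter that accepts it, for example `train_test_split` and `RandomForestClassifier`.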
3. Technical Report (Due: January 11)
- Length: ~10 pages (min 8 pages excluding references)
- Format: PDF (use provided Markdown or LaTeX template)
- Structure (8 required sections, aligned with professor's requirements):
- Abstract: Summary of project (~200 words)
- Introduction: Research question and motivation
- Research Question & Literature: Context, related work, why it matters
- Methodology: Data, algorithms, models, evaluation approach
- Implementation: Key technical decisions, challenges solved (can be brief)
- Codebase & Reproducibility: How to run it, dependencies (can be brief - 2-3 sentences)
- Results: Tables, figures, findings, interpretation
- Conclusion: Summary, limitations, future work
- Appendix: AI tools used (ChatGPT, Copilot, etc.) - required if applicable
Download templates from the course website:
- Markdown template → PDF with pandoc
- LaTeX template → PDF with pdflatex
4. Presentation Video (Due: January 11, Required)
- Format: Recorded video (MP4, YouTube link, or Loom)
- Duration: 10-15 minutes (not longer than 15!)
- Content:
- Problem motivation (2-3 min)
- Technical approach and methodology (3-4 min)
- Demo of your code running (5-6 min)
- Results and learnings (2-3 min)
Optional: Live presentation (TBA) instead of video, with Q&A
Technical Requirements
Must Have
| Component | Requirement |
|---|---|
| Language | Python 3.10+ |
| Entry Point | python main.py runs successfully |
| Dependencies | Listed in environment.yml or requirements.txt |
| Reproducibility | random_state set everywhere |
| Documentation | README with setup instructions |
| Report | ~10 page PDF with methodology and results |
Nice to Have (But Not Required)
- Unit tests
- CI/CD pipeline
- Docker containerization
- Type hints
- 70% test coverage
Focus on making your project work well, not on adding features you don't need.
Code Quality
- Formatting: Use consistent style (black/autopep8 recommended)
- Naming: Clear, descriptive variable and function names
- Structure: Modular code in src/ directory
- Comments: Explain why, not what
Timeline
| Date | Milestone |
|---|---|
| Nov 3 | Project Proposal Due |
| Nov 10 | Proposal Feedback |
| Nov 24 | Feature Freeze - stop adding, start polishing |
| Dec 8 | Documentation Week - finalize README and report |
| Jan 11 | Final Submission |
| TBA | Presentations (optional) |
Development Strategy
- Week 1-2: Set up structure, get data, basic pipeline working
- Week 3-4: Implement models, run experiments
- Week 5-6: Analyze results, create visualizations
- Week 7: Write report, polish documentation
What We're Looking For
The #1 Grading Criterion
Does python main.py work on our machines?
If we can't run your code, we can't grade it. Test on a fresh environment before submitting!
Evaluation Focus
- Working Code: Runs without errors, produces output
- Clear Methodology: We understand what you did and why
- Meaningful Results: You actually answered your research question
- Professional Presentation: Clean code, clear report, good documentation
- Reproducibility: We get the same results when we run it
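A quick self-check of the first and last points might look like the sketch below. It runs the entry point twice in a subprocess and compares the printed output. The helper names are hypothetical. Comparing standard output is only a proxy that assumes `main.py` prints its metrics, and it does not replace testing in a fresh environment.

```python
import subprocess
import sys
from pathlib import Path


def run_entry_point(project_dir: Path, timeout_s: int = 600) -> subprocess.CompletedProcess:
    """Run `python main.py` inside project_dir and capture its output."""
    return subprocess.run(
        [sys.executable, "main.py"],
        cwd=project_dir,
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )


def check_runs_and_reproduces(project_dir: Path) -> bool:
    """Return True if main.py exits cleanly twice and prints identical output."""
    first = run_entry_point(project_dir)
    second = run_entry_point(project_dir)
    if first.returncode != 0 or second.returncode != 0:
        # Surface the error message so the failure can be diagnosed.
        print(first.stderr or second.stderr, file=sys.stderr)
        return False
    return first.stdout == second.stdout


if __name__ == "__main__":
    ok = check_runs_and_reproduces(Path.cwd())
    print("OK" if ok else "FAILED: main.py crashed or printed different output")
```

If the two runs print different numbers, a likely cause is a model or data split that was created without a fixed `random_state`.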
AI Tools Policy
We encourage using AI tools (ChatGPT, Copilot, Claude) as learning aids.
Good uses:
- Debugging help
- Learning new libraries
- Code review suggestions
- Documentation writing
Not acceptable:
- Submitting code you don't understand
- Having AI write your entire project
Required: Create AI_USAGE.md documenting how you used AI tools.
Common Questions
Q: Can I use a dataset from Kaggle?
A: Yes! Just cite your source and include download instructions.
Q: Do I need to write tests?
A: Not required, but encouraged if you have time.
Q: What if my model doesn't perform well?
A: That's fine! Explain why in your discussion section. Negative results are still results.
Q: Can I work with a partner?
A: No, all projects are individual work.
Q: What if I can't finish everything?
A: Focus on having a working main.py and a complete report. A smaller project done well beats an ambitious project that doesn't run.
Getting Started
- Fork the example project from the course repository
- Replace the Iris example with your own data and research question
- Follow the same structure - it works!
- Test on a fresh environment before submitting
See the examples/iris-classification/ folder in the course repository for a complete working example.
How to Submit
Email your submission to all TAs by January 11, 23:59:
- anna.smirnova@unil.ch
- francesco.brunamonti@unil.ch
- zhongshan.chen@unil.ch
Subject line: [AP2025] Final Project - Your Name
Email body:
Name:
Student ID:
Project Title:
GitHub Repository: [link]
Video/Presentation: [link - YouTube, Loom, or Google Drive]
Checklist:
- [ ] python main.py runs without errors
- [ ] requirements.txt or environment.yml included
- [ ] Report (~10 pages PDF) attached OR in repository
- [ ] Video is 10-15 min OR signed up for live presentation (Dec 15)
Note: Report can be attached to email OR included in your repository as report.pdf - both work!
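Part of this checklist can be automated before sending the email. The sketch below only checks that the files named in this rulebook exist. The function name `find_missing_items` and the exact file lists are assumptions drawn from the required structure above, and passing this check says nothing about whether the code actually runs.

```python
from pathlib import Path

# Files this rulebook names as required (AI_USAGE.md per the AI Tools Policy).
REQUIRED_FILES = ["README.md", "PROPOSAL.md", "main.py", "AI_USAGE.md"]
# At least one of these dependency files must be present.
DEPENDENCY_FILES = ["environment.yml", "requirements.txt"]


def find_missing_items(repo_root: Path) -> list[str]:
    """Return a description of every required item missing from repo_root."""
    missing = [name for name in REQUIRED_FILES if not (repo_root / name).is_file()]
    if not any((repo_root / name).is_file() for name in DEPENDENCY_FILES):
        missing.append("environment.yml or requirements.txt")
    if not (repo_root / "src").is_dir():
        missing.append("src/")
    return missing


if __name__ == "__main__":
    problems = find_missing_items(Path.cwd())
    if problems:
        print("Missing:", ", ".join(problems))
    else:
        print("All required files found.")
```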
Live Presentation Option
Present live on December 15 instead of recording a video!
- Benefit: No video submission required
- Sign up by: December 8
- Not graded: Just an opportunity to practice presenting
Sign up link: [TBA]
Need Help?
- Discord: Quick questions and peer help
- Office Hours: In-depth project discussions
- Email: Project-specific issues
We want you to succeed! Start early, ask questions, and make something you're proud of.
*Last updated: December 2025 | Deadline: January 11, 2026*