Individual Project Rulebook
Your journey to mastery through data science
Advanced Programming · Fall 2025 · HEC Lausanne
Your Final Project = 100% of Your Grade
Build a machine learning or data analysis project that demonstrates your programming and analytical skills.
Project Philosophy
This is your opportunity to apply what you've learned to a real problem. Most projects will focus on machine learning and data analysis - comparing models, analyzing datasets, and presenting findings.
Core Principles:
- Working Code First: If we can't run python main.py, we can't grade it
- Reproducibility: Same code → same results, every time
- Clear Documentation: README explains setup, REPORT explains findings
- Clean Structure: Organized code that others can understand
Project Focus: Machine Learning
Most projects should follow this pattern:
Research Question → Data → Models → Evaluation → Report
Example: "Which model best predicts customer churn: Random Forest, Logistic Regression, or XGBoost?"
Recommended Project Types
- Classification: Predict categories (spam detection, sentiment analysis, disease diagnosis)
- Regression: Predict values (house prices, sales forecasting, stock returns)
- Clustering: Find patterns (customer segmentation, anomaly detection)
- Comparison Studies: Compare 3+ models on the same task
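A comparison study of this kind could be sketched as follows. This is only a minimal illustration, not a reference solution: it assumes scikit-learn is installed, uses a synthetic dataset from make_classification in place of real data, and swaps in GradientBoostingClassifier for XGBoost to avoid an extra dependency. The function name compare_models and the seed value 42 are illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

SEED = 42  # illustrative value; any fixed integer works


def compare_models(seed: int = SEED) -> dict[str, float]:
    """Return mean 5-fold cross-validated accuracy for three classifiers."""
    # Synthetic stand-in for a real dataset (assumption for this sketch).
    X, y = make_classification(
        n_samples=300, n_features=10, n_informative=5, random_state=seed
    )
    models = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "random_forest": RandomForestClassifier(n_estimators=50, random_state=seed),
        "gradient_boosting": GradientBoostingClassifier(
            n_estimators=50, random_state=seed
        ),
    }
    # Reuse the same folds for every model so the comparison is like-for-like.
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    return {
        name: float(cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean())
        for name, model in models.items()
    }


if __name__ == "__main__":
    for name, score in compare_models().items():
        print(f"{name}: {score:.3f}")
```

Passing the same seed to the data generator, the models, and the fold splitter is what makes two runs return identical scores.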
Custom Projects
Want to do something different? As long as you can demonstrate:
- Data manipulation with pandas/NumPy
- Some form of analysis or modeling
- Clear methodology and results
- Reproducible code
...then propose it!
Deliverables
1. Project Proposal (Due: November 3)
- Length: 300-500 words
- Format: PROPOSAL.md in your repository
- Content: What you'll build, what data you'll use, what question you'll answer
2. GitHub Repository (Due: December 21)
Required Structure:
your-project/
├── README.md              # Setup and usage instructions
├── PROPOSAL.md            # Your project proposal
├── environment.yml        # Conda dependencies
├── requirements.txt       # Pip dependencies (alternative)
├── main.py                # Entry point - THIS MUST WORK
├── src/                   # Source code modules
│   ├── __init__.py
│   ├── data_loader.py     # Data loading and preprocessing
│   ├── models.py          # Model definitions
│   └── evaluation.py      # Evaluation and visualization
├── data/
│   └── raw/               # Original data (or instructions to download)
├── results/               # Output figures and metrics
└── notebooks/             # Jupyter notebooks (optional, for exploration)
Key Requirements:
- python main.py must run without errors
- All dependencies listed in environment.yml or requirements.txt
- Data included or clear download instructions
- Use random_state for reproducibility
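As a rough sketch of how these requirements fit together, a minimal main.py might look like the one below. It is a hypothetical skeleton, not the course template: load_data and evaluate are stand-ins for what would normally live in src/data_loader.py, src/models.py, and src/evaluation.py, the toy data is generated rather than loaded, and the seed value and the metrics.json file name are assumptions.

```python
"""main.py: hypothetical minimal entry point (a sketch, not the course template)."""
import json
import random
from pathlib import Path

SEED = 42  # illustrative; fix one seed and pass it everywhere


def load_data(seed: int) -> list[tuple[float, int]]:
    """Stand-in for src/data_loader.py: generates toy (feature, label) rows."""
    rng = random.Random(seed)  # local generator, independent of global state
    rows = []
    for _ in range(200):
        x = rng.uniform(-1.0, 1.0)
        noise = rng.gauss(0.0, 0.3)
        rows.append((x, int(x + noise > 0)))
    return rows


def evaluate(rows: list[tuple[float, int]]) -> dict[str, float]:
    """Stand-in for src/models.py and src/evaluation.py: scores a threshold rule."""
    correct = sum(int(x > 0) == label for x, label in rows)
    return {"accuracy": correct / len(rows)}


def main(results_dir: str = "results", seed: int = SEED) -> dict[str, float]:
    rows = load_data(seed)
    metrics = evaluate(rows)
    out = Path(results_dir)
    out.mkdir(parents=True, exist_ok=True)  # a fresh clone may lack results/
    (out / "metrics.json").write_text(json.dumps(metrics, indent=2))
    print(f"Saved metrics to {out / 'metrics.json'}: {metrics}")
    return metrics


if __name__ == "__main__":
    main()
```

Because every source of randomness is derived from one seed, running python main.py twice writes the same metrics file.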
3. Technical Report (Due: December 21)
- Length: ~10 pages (excluding code appendix)
- Format: PDF (use provided Markdown or LaTeX template)
- Structure:
  - Abstract (200 words): Summary of project
  - Introduction: Research question and motivation
  - Literature Review: Brief context and related work
  - Methodology: Data, preprocessing, models, evaluation metrics
  - Results: Tables, figures, findings
  - Discussion: Interpretation, limitations
  - Conclusion: Summary and future work
  - References: Cited sources
Download templates from the course website:
- Markdown template → PDF with pandoc
- LaTeX template → PDF with pdflatex
4. Presentation (December 16-20, Optional)
- Duration: 8-10 minutes + Q&A
- Content: Problem, approach, demo, results, learnings
Technical Requirements
Must Have
| Component | Requirement |
|---|---|
| Language | Python 3.10+ |
| Entry Point | python main.py runs successfully |
| Dependencies | Listed in environment.yml or requirements.txt |
| Reproducibility | random_state set everywhere |
| Documentation | README with setup instructions |
| Report | ~10 page PDF with methodology and results |
Nice to Have (But Not Required)
- Unit tests
- CI/CD pipeline
- Docker containerization
- Type hints
- 70% test coverage
Focus on making your project work well, not on adding features you don't need.
Code Quality
- Formatting: Use consistent style (black/autopep8 recommended)
- Naming: Clear, descriptive variable and function names
- Structure: Modular code in src/ directory
- Comments: Explain why, not what
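To illustrate the naming and commenting points, here is a small hypothetical helper. The function name remove_outliers and the default threshold of 3 standard deviations are assumptions chosen for the example, not a prescribed preprocessing step.

```python
from statistics import mean, stdev


def remove_outliers(values: list[float], max_z: float = 3.0) -> list[float]:
    """Drop values more than max_z sample standard deviations from the mean."""
    # A "what" comment such as "loop over values and compare" would only
    # restate the code. The comments below record reasons instead.

    # Why: stdev() needs at least two points, and when all values are equal
    # the z-score is undefined (division by zero), so nothing is filtered.
    if len(values) < 2:
        return list(values)
    center, spread = mean(values), stdev(values)
    if spread == 0:
        return list(values)

    # Why: extreme values would otherwise dominate models that minimise
    # squared error.
    return [v for v in values if abs(v - center) / spread <= max_z]
```

The descriptive names (center, spread, max_z) carry the "what", which leaves the comments free to explain the decisions.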
Timeline
| Date | Milestone |
|---|---|
| Nov 3 | Project Proposal Due |
| Nov 10 | Proposal Feedback |
| Nov 24 | Feature Freeze - stop adding, start polishing |
| Dec 8 | Documentation Week - finalize README and report |
| Dec 16-20 | Presentations (optional) |
| Dec 21 | Final Submission |
Development Strategy
- Week 1-2: Set up structure, get data, basic pipeline working
- Week 3-4: Implement models, run experiments
- Week 5-6: Analyze results, create visualizations
- Week 7: Write report, polish documentation
What We're Looking For
The #1 Grading Criterion
Does python main.py work on our machines?
If we can't run your code, we can't grade it. Test on a fresh environment before submitting!
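One way to rehearse this check is to build a throwaway virtual environment, install only the listed dependencies, and run the entry point inside it. The script below is a sketch under stated assumptions: it covers the pip route (requirements.txt and main.py in the project root), the function names fresh_env_commands and check_fresh_run are hypothetical, and a conda-based project would need the equivalent steps through environment.yml instead.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def fresh_env_commands(project_dir: Path, env_dir: Path) -> list[list[str]]:
    """Build the commands that create a clean venv, install deps, run main.py."""
    # venv puts the interpreter in Scripts/ on Windows and bin/ elsewhere.
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    exe = "python.exe" if os.name == "nt" else "python"
    env_python = str(env_dir / bin_dir / exe)
    requirements = str(project_dir / "requirements.txt")
    return [
        [sys.executable, "-m", "venv", str(env_dir)],
        [env_python, "-m", "pip", "install", "-r", requirements],
        [env_python, str(project_dir / "main.py")],
    ]


def check_fresh_run(project_dir: Path) -> None:
    """Run each step in order; raises CalledProcessError on the first failure."""
    project_dir = project_dir.resolve()  # absolute paths survive the cwd change
    with tempfile.TemporaryDirectory() as tmp:
        for cmd in fresh_env_commands(project_dir, Path(tmp) / "venv"):
            subprocess.run(cmd, check=True, cwd=project_dir)
```

Calling check_fresh_run(Path.cwd()) from the project root runs all three steps. A failure at the install or run step usually points to a dependency that is installed on your machine but missing from requirements.txt.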
Evaluation Focus
- Working Code: Runs without errors, produces output
- Clear Methodology: We understand what you did and why
- Meaningful Results: You actually answered your research question
- Professional Presentation: Clean code, clear report, good documentation
- Reproducibility: We get the same results when we run it
AI Tools Policy
We encourage using AI tools (ChatGPT, Copilot, Claude) as learning aids.
Good uses:
- Debugging help
- Learning new libraries
- Code review suggestions
- Documentation writing
Not acceptable:
- Submitting code you don't understand
- Having AI write your entire project
Required: Create AI_USAGE.md documenting how you used AI tools.
Common Questions
Q: Can I use a dataset from Kaggle?
A: Yes! Just cite your source and include download instructions.
Q: Do I need to write tests?
A: Not required, but encouraged if you have time.
Q: What if my model doesn't perform well?
A: That's fine! Explain why in your discussion section. Negative results are still results.
Q: Can I work with a partner?
A: No, all projects are individual work.
Q: What if I can't finish everything?
A: Focus on having a working main.py and a complete report. A smaller project done well beats an ambitious project that doesn't run.
Getting Started
- Fork the example project from the course repository
- Replace the Iris example with your own data and research question
- Follow the same structure - it works!
- Test on a fresh environment before submitting
See the examples/iris-classification/ folder in the course repository for a complete working example.
Need Help?
- Discord: Quick questions and peer help
- Office Hours: In-depth project discussions
- Email: Project-specific issues
We want you to succeed! Start early, ask questions, and make something you're proud of.
*Last updated: November 2025 | Deadline: December 21, 2025*