Week 11: Final Project Workshop

Common Mistakes (Continued)

No clear entry point:

# Which file do we run?
analysis.py
test.py
final_version_v2_FINAL.py

Clear entry point (main.py):

# main.py - Always run this
def main():
    ...  # load data, train models, evaluate

if __name__ == "__main__":
    main()
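To make the single-entry-point idea concrete, here is a minimal sketch of what a complete main.py might look like. The helper names (load_data, train, evaluate) and the synthetic data are assumptions made for illustration only. In a real project these steps would presumably live in separate modules such as the src/data_loader.py, src/models.py and src/evaluation.py files named in the outline, and main.py would import them.

```python
# main.py - hypothetical sketch of a single entry point (standard library only)
import random
import statistics


def load_data(n=100, seed=42):
    """Stand-in for a real data loader: returns synthetic (x, y) values."""
    rng = random.Random(seed)  # fixed seed so every run gives the same data
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0, 0.5) for x in xs]
    return xs, ys


def train(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def evaluate(model, xs, ys):
    """Return the mean squared error of the fitted line on (xs, ys)."""
    slope, intercept = model
    return statistics.fmean(
        (y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)
    )


def main():
    """Run the whole pipeline in order: load, train, evaluate, report."""
    xs, ys = load_data()
    model = train(xs, ys)
    mse = evaluate(model, xs, ys)
    print(f"slope={model[0]:.2f} intercept={model[1]:.2f} mse={mse:.3f}")
    return mse


if __name__ == "__main__":
    main()
```

With this layout, a grader only needs to run `python main.py`. Everything else is called from main(), so there is no guessing about which file to start with.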

No report:

# Only code, no documentation of findings

Complete report:

# project_report.pdf with methodology, results, discussion
# Use the provided template (Markdown or LaTeX)

Today's Journey

Why This Matters

Part 1: Research Question → Full Repository

Example Research Question

What Files Do We Need?

Step 1: Set Up Environment (Conda on Nuvolos)

Your environment.yml File

Optional: Local Python (venv/uv)

Step 2: Create Project Structure

Step 3: Write src/data_loader.py

Step 4: Write src/models.py (1/2)

Step 4: Write src/models.py (2/2)

Step 5: Write src/evaluation.py

Step 6: Write main.py (1/2)

Step 6: Write main.py (2/2)

Step 7: Write README.md

README.md (continued)

Test It!

Checkpoint: What We Built

Part 2: Debug Your ML Project

Common ML Bug 1: Shape Mismatches

Common ML Bug 2: Wrong Data Types

Common ML Bug 3: Missing Values (NaN)

Common ML Bug 4: Not Scaling Features

Common ML Bug 5: Data Leakage

Common sklearn Error Messages (1/3)

Common sklearn Error Messages (2/3)

Common sklearn Error Messages (3/3)

Reproducibility Problem

Reproducibility Fix: random_state

Overfitting: What It Looks Like

Overfitting: How to Fix

ML Debugging Checklist

Part 3: Write Your Report

Why Write a Report?

Report Structure & Templates

Converting to PDF

What to Include in Each Section

What to Include (continued)

Report Tips

Where Does the Report Go?

Download the Templates

Part 4: Submission Checklist

The #1 Grading Criterion

Submission Checklist

Test on Fresh Environment

Common Submission Mistakes

Common Mistakes (Continued)

Final Project Structure

Quick Reference: Conda Commands

Quick Reference: Common sklearn Fixes

Resources

Summary

Questions?