Weekly Materials
Lecture slides, week 12
Professor's lecture slides (PDF)
Discussion of the previous exercises; Finger exercises (with TA)
Hands-on tutorials and practice exercises
Exercise 8
Distribution of Exercise sheet 8
Code Examples
References & Resources
Finger Exercises
Additional Notes
Week 12: Clustering
Learning Objectives
- Understand unsupervised learning and clustering fundamentals
- Master the k-means clustering algorithm and its variants
- Learn Gaussian Mixture Models (GMMs) and Expectation-Maximization (EM)
- Explore hierarchical clustering methods
- Apply density-based clustering techniques like DBSCAN
- Compare different clustering algorithms and choose appropriate methods
Topics Covered
- k-Means Clustering: Centroid-based clustering algorithm
- Gaussian Mixture Models (GMM): Probabilistic clustering approach
- Expectation-Maximization (EM): Iterative algorithm for estimating GMM parameters
- Hierarchical Clustering: Agglomerative and divisive methods
- Density-Based Clustering: DBSCAN and related algorithms
- Cluster Validation: Metrics and techniques for evaluating clusters
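For reference, the EM updates for a GMM with K components are commonly written as below. This is a sketch in a notation that is an assumption here and may differ from the lecture slides: n data points x_i, mixing weights \pi_k, means \mu_k, covariances \Sigma_k, and responsibilities \gamma_{ik}.

```latex
% E-step: responsibility of component k for data point x_i
\gamma_{ik} = \frac{\pi_k \, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)}
                   {\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_i \mid \mu_j, \Sigma_j)}

% M-step: re-estimate the parameters from the responsibilities
\begin{aligned}
N_k      &= \sum_{i=1}^{n} \gamma_{ik}, \\
\pi_k    &= \frac{N_k}{n}, \\
\mu_k    &= \frac{1}{N_k} \sum_{i=1}^{n} \gamma_{ik} \, x_i, \\
\Sigma_k &= \frac{1}{N_k} \sum_{i=1}^{n} \gamma_{ik} \, (x_i - \mu_k)(x_i - \mu_k)^{\top}.
\end{aligned}
```

The two steps alternate until the log-likelihood stops improving. k-means can be seen as a limiting case in which every responsibility is forced to be 0 or 1.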
Schedule
- Lecture: Monday, December 1, 2025 (10:15 - 12:00)
- Practice Session: Monday, December 1, 2025 (16:30 - 18:00)
- TA Session: Discussion of exercises and clustering implementations
Key Concepts
- Unsupervised Learning: Learning patterns without labeled data
- Distance and Similarity Measures: Euclidean distance, Manhattan distance, cosine similarity
- Cluster Quality: Intra-cluster vs inter-cluster distances
- Initialization Methods: k-means++, random initialization
- Model Selection: Choosing the number of clusters
- Convergence: When and why algorithms converge
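The distance and similarity measures listed above can be sketched in a few lines of NumPy. The function names and the two example vectors are illustrative choices, not part of the course material.

```python
import numpy as np

def euclidean(a, b):
    """L2 distance: straight-line distance between two points."""
    return float(np.linalg.norm(a - b))

def manhattan(a, b):
    """L1 distance: sum of absolute coordinate differences."""
    return float(np.sum(np.abs(a - b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors (1 = same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Two orthogonal unit vectors (made-up example data).
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])

d_euclid = euclidean(a, b)       # sqrt(2)
d_manhattan = manhattan(a, b)    # 2
sim = cosine_similarity(a, b)    # 0, because the vectors are orthogonal
```

Cosine similarity measures direction rather than magnitude, so it is a similarity and not a distance. A common way to turn it into a dissimilarity is 1 minus the similarity.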
Clustering Algorithms
- k-Means: Lloyd’s algorithm, k-means++, mini-batch k-means
- GMM: Multivariate Gaussian distributions, soft clustering
- Hierarchical: Ward linkage, complete linkage, single linkage
- DBSCAN: Density-based spatial clustering, noise detection
- Mean Shift: Mode-seeking algorithm for clustering
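To make the k-means entry above concrete, here is a minimal sketch of Lloyd's algorithm in NumPy. It assumes random initialization from the data points and Euclidean distance; the function name `lloyd_kmeans`, its parameters, and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np

def lloyd_kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with random initialization (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids as k distinct data points chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid moves to the mean of its points;
        # an empty cluster keeps its previous centroid.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # centroids are stable, so assignments will not change
        centroids = new_centroids
    return centroids, labels

# Two well-separated synthetic blobs (made-up data for the demo).
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(50, 2)),
])
centroids, labels = lloyd_kmeans(X, k=2)
```

Lloyd's algorithm only reaches a local optimum of the within-cluster sum of squares, which is why initialization schemes such as k-means++ and multiple restarts matter in practice.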
Practical Applications
- Customer Segmentation: Marketing and business applications
- Image Segmentation: Computer vision clustering tasks
- Gene Expression: Bioinformatics clustering analysis
- Market Segmentation: Financial and economic clustering
Evaluation Metrics
- Internal Metrics: Silhouette score, Davies-Bouldin index
- External Metrics: Adjusted Rand index, normalized mutual information
- Visual Assessment: Scatter plots, dendrograms, cluster visualization
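The internal and external metrics above are all available in scikit-learn. The following sketch computes them for a k-means clustering of synthetic data; the dataset parameters are arbitrary assumptions made for the demo.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    adjusted_rand_score,
    davies_bouldin_score,
    normalized_mutual_info_score,
    silhouette_score,
)

# Synthetic data with known ground-truth labels (demo assumption).
X, y_true = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Internal metrics: need only the data and the predicted labels.
sil = silhouette_score(X, labels)       # in [-1, 1], higher is better
dbi = davies_bouldin_score(X, labels)   # >= 0, lower is better

# External metrics: compare predicted labels with ground-truth labels.
ari = adjusted_rand_score(y_true, labels)            # 1.0 = identical partitions
nmi = normalized_mutual_info_score(y_true, labels)   # in [0, 1]

print(f"silhouette={sil:.3f}  DBI={dbi:.3f}  ARI={ari:.3f}  NMI={nmi:.3f}")
```

External metrics require ground-truth labels, which are usually unavailable in real clustering tasks, so internal metrics and visual assessment often carry most of the weight.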
Assignments
- Exercise 8 (distributed this week): Clustering algorithm implementations
- Compare different clustering methods on various datasets
- Implement k-means and GMM from scratch
- Apply clustering to real-world datasets
Tools and Implementation
- scikit-learn for standard clustering algorithms
- Custom implementations for educational purposes
- Visualization tools for cluster analysis
- Performance comparison frameworks
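A simple comparison framework with scikit-learn might look like the sketch below. It runs four of the algorithms from this week on the two-moons toy dataset and scores each with the adjusted Rand index. The dataset and all hyperparameters (for example `eps=0.3` for DBSCAN) are assumptions picked for this toy example, not recommended defaults.

```python
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Non-convex toy data with known labels; standardize before clustering.
X, y_true = make_moons(n_samples=400, noise=0.05, random_state=0)
X = StandardScaler().fit_transform(X)

models = {
    "k-means": KMeans(n_clusters=2, n_init=10, random_state=0),
    "GMM": GaussianMixture(n_components=2, random_state=0),
    "Ward": AgglomerativeClustering(n_clusters=2, linkage="ward"),
    "DBSCAN": DBSCAN(eps=0.3, min_samples=5),
}

results = {}
for name, model in models.items():
    labels = model.fit_predict(X)   # DBSCAN marks noise points with label -1
    results[name] = adjusted_rand_score(y_true, labels)

for name, ari in results.items():
    print(f"{name:8s} ARI = {ari:.3f}")
```

Centroid-based and Gaussian models assume roughly convex clusters, so on data shaped like this a density-based method often has an advantage. The ranking can change completely on other datasets, which is the point of running such a comparison.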