This laboratory course teaches fundamental concepts in computational science and machine learning based on matrix factorization,
X ≈ A · B,
where a data matrix X is (approximately) factorized into two matrices A and B. Depending on the choice of approximation quality measure and the constraints on A and B, the method provides a powerful framework of numerical linear algebra that encompasses many important techniques, such as dimension reduction, clustering and sparse coding.
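As an illustration of factorizing a data matrix X into two matrices A and B, here is a minimal sketch in Python with NumPy (the course itself uses Matlab). It assumes the Frobenius norm as the quality measure and no constraints on A and B, in which case a truncated singular value decomposition gives the best rank-k approximation. The function name `truncated_svd_factorization` and the synthetic data are hypothetical and not part of the course material.

```python
import numpy as np


def truncated_svd_factorization(X, k):
    """Return A (m x k) and B (k x n) such that A @ B is the best rank-k
    approximation of X in the Frobenius norm (Eckart-Young theorem)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    A = U[:, :k] * s[:k]  # first k left singular vectors, scaled by singular values
    B = Vt[:k, :]         # first k right singular vectors, stored as rows
    return A, B


# Synthetic data (made up for illustration): a rank-2 matrix plus small noise.
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 5))
X += 0.01 * rng.standard_normal(X.shape)

A, B = truncated_svd_factorization(X, k=2)
rel_error = np.linalg.norm(X - A @ B) / np.linalg.norm(X)
print("shapes:", A.shape, B.shape)
print("relative approximation error:", rel_error)
```

Other choices of quality measure or constraints (for example non-negativity or sparsity of A or B) lead to different algorithms; this sketch covers only the unconstrained least-squares case.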
|20.02||Remember to bring a laptop with Matlab running to the tutorial sessions!|
|10.02||Website is now running.|
|8||Introduction to the course||lecture01.pdf||
|9||Principal Component Analysis||lecture02.pdf||
|10||Singular Value Decomposition||lecture03.pdf||
Some of the material is password protected for copyright reasons. Please send an email to CILAB to obtain it. Note that we can only provide access to ETH students.
|Lectures||Tue 08 - 10||HG E 5|
|Exercises||Thu 16 - 18||CAB G 61|
|Exercises||Fri 08 - 10||CAB G 11|
|Presence hour||Mon 11 - 12||CAB H 53|
Each exercise session presents a pen-and-paper problem and discusses its solution. These problems help solidify the theory presented in the lecture and identify gaps in your understanding.
Assignments are larger problems that you work on in groups of two or three students. For each assignment, you develop and implement an algorithm that solves one of the four application problems. Submitting your Matlab code provides you with feedback in terms of correctness, accuracy, speed and efficiency.
Based on the implementations you developed during the semester, you create a novel solution for one of the application problems, by extending or combining previous work. You write up your methodology and an experimental comparison to baselines in the form of a short scientific paper. You submit your novel solution to the online ranking system for competitive comparison.
Projects are due on Friday 21 June.
The examination is written and lasts 120 minutes. The language of examination is English. As written aids, you may bring two A4 pages (i.e. one A4 sheet of paper), either handwritten or typed in a font size of at least 11 points.
You need to satisfy two requirements to pass this course:
Therefore, your final grade will be:
This lab course has a strong focus on practical assignments. Students work in groups to develop solutions to four application problems.
For solving assignments, you...
You can find more details on the Submission System pages.
If you experience problems with the submission system, consult the instructions. The same page also explains how to report problems with the submission system.
Your semester project is a group effort. It consists of four parts:
If you don't belong to any group so far, please write to Gabriel Krummenacher by 25 May 2013.
As your final programming assignment, you develop a novel solution to one of the four application problems. You are free to exploit any idea you have, provided it is not identical to any other group submission or existing Matlab implementation of an algorithm on the internet.
Two examples for developing a novel solution:
You compare your novel algorithm to at least two baseline algorithms. For the baselines, you can use the implementations you developed as part of the programming assignments.
You submit your novel algorithm to the online ranking system.
The document should be a maximum of 4 pages.
There are two different types of grading criteria applied to your project, with the corresponding weights shown in brackets.
The following criteria are scored based on your rank in the submission system tables in comparison with the rest of the class. Ranking is performed only over the projects; assignments are not included. Only your very last submission counts.
The ranks will then be converted on a linear scale into a grade between 4 and 6.
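A linear conversion from rank to grade could look like the following minimal sketch. The exact formula used for the course is not stated here, so this is only an assumption: the best-ranked project (rank 1) maps to 6, the last-ranked project maps to 4, and ranks in between are interpolated evenly. The function name `rank_to_grade` and its parameters are hypothetical, and ties are not handled.

```python
def rank_to_grade(rank, num_groups, best=6.0, worst=4.0):
    """Map a rank in 1..num_groups linearly onto the grade range [worst, best].

    Assumption (not confirmed by the course page): rank 1 receives `best`
    and rank `num_groups` receives `worst`.
    """
    if not 1 <= rank <= num_groups:
        raise ValueError("rank must lie between 1 and num_groups")
    if num_groups == 1:
        return best  # a single project cannot be interpolated
    fraction = (num_groups - rank) / (num_groups - 1)  # 1.0 for the best rank, 0.0 for the worst
    return worst + fraction * (best - worst)


# Example with a hypothetical class of 11 groups.
for rank in (1, 6, 11):
    print(rank, rank_to_grade(rank, num_groups=11))
```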
The following criteria are graded based on an evaluation by the teaching assistants.
We are grateful to Microsoft Research for providing us with their Conference Management Toolkit to manage the submission and review of project reports.
To submit your report, please go to https://cmt.research.microsoft.com/CIL2013, register and follow the instructions given there. You can resubmit any number of times until the deadline passes.
For a successful submission please follow these steps:
Here is a list of additional material if you want to read up on a specific topic of the course.
Chapter 1.2 "Probability Theory" in: Christopher M. Bishop (2006). Pattern Recognition and Machine Learning. Springer.
Larry Wasserman (2003). All of Statistics. Springer.
Gene Golub and Charles Van Loan (1996). Matrix Computations. The Johns Hopkins University Press.
Lloyd Trefethen and David Bau (1997). Numerical Linear Algebra. SIAM.
David Austin. We Recommend a Singular Value Decomposition. (SVD tutorial)
Michael Mahoney. Randomized algorithms for matrices and data. (Recent review paper)
Yehuda Koren, Robert Bell and Chris Volinsky (2009). Matrix Factorization Techniques for Recommender Systems. IEEE Computer.
Chapter 9 "Mixture Models and EM" in: Christopher M. Bishop (2006). Pattern Recognition and Machine Learning. Springer.
Mario Frank, Andreas Streich and Joachim M. Buhmann (2012). Multi-Assignment Clustering for Boolean Data. JMLR.
Chapter 1 "Sparse Representations" in: Stephane Mallat (2009). A Wavelet Tour of Signal Processing - The Sparse Way. Academic Press.
Chapter 6 "Wavelet Transforms", pp. 244-276; in: Andrew S. Glassner (1995). Principles of Digital Image Synthesis, Vol. 1. Morgan Kaufmann Publishers, Inc.
Chapter 13 "Fourier and Spectral Applications", pp. 699-716; in: William H. Press, Saul A. Teukolsky, William T. Vetterling and Brian P. Flannery (2007). Numerical Recipes. The Art of Scientific Computing. Cambridge University Press.
Aharon, Elad and Bruckstein (2005). K-SVD: Design of Dictionaries for Sparse Representation. Proceedings of SPARS.
Richard Baraniuk (2007). Compressive sensing. IEEE Signal Processing Magazine.
We maintain a forum at the VIS board, which we check regularly. Please post questions there, so others can see them and share in the discussion.
If you have questions which are not of general interest, please don't hesitate to contact us directly.
The main email point of contact for the course is CILAB (David Balduzzi).