Overview
Machine learning is poised to change how people design, operate, and analyze computer systems. This course introduces the emerging area of learning-based systems, with the goal of providing hands-on experience in applying learning to system design and preparing students for research in this field. Topics include automatic optimization of system parameters, learning-enhanced data structures and algorithms (e.g., indexes, sketches, compression, caching, scheduling), and core techniques (e.g., reinforcement learning, bandit algorithms, deep learning) together with their applications to systems and networking. The course will include lectures, invited talks by experts, a semester-long project and paper, and hands-on labs designed to give experience with the topics covered.
Enrollment is limited. Classes are hybrid, but we strongly encourage you to attend in person.
The course web site is http://dsg.csail.mit.edu/6.887
This course is intended for master's and PhD students and for undergraduates in their final year.
Course Structure
Lectures are held twice a week, on Mondays and Wednesdays from 2:30 to 4:00 in 32-124. Most lectures will feature an external speaker from industry or academia who is an expert in the topic under discussion. Some of the external speakers will be virtual. However, we will always live-stream their talks to 32-124, and the lecturers will be there in person.
For each lecture we will read 1-2 papers, and discuss them in class. You will need to write a short review of the required reading by noon the day of each class. You are also expected to answer questions and participate in discussions. This class only works if you come prepared to discuss the papers in detail, which is why 15% of your grade is for in-class participation.
In addition, there will be a semester-long project and about 4 labs.
Topics Covered
Potential topics include:
Fundamentals/Background
- Systems
- Storage systems / Key-value stores (Log-Structured Merge Trees, Concurrency, …)
- Databases (Query Optimization, Query Execution, Cardinality Estimation, Cost Models, Storage Design, …)
- Network design (Congestion control, Routing, Caching, Applications (e.g., video), …)
- Cloud Computing (SaaS/IaaS, Resource management, Scheduling, Fairness, …)
- Algorithms
- Sketches (Count-min, HyperLogLog, …)
- Caching (LRU, …)
- Index structures (B+-Trees, Skip lists, ART, Bloom filters, Range filters, …)
- Multi-dimensional indexes (KD-Tree, R-Tree, Z-order encoding, …)
- Scheduling
- Other selected algorithms: sorting, join algorithms
- ML/Optimization
- Regression
- Mixture of Experts
- Deep Learning
- Bandit Algorithms
- Reinforcement Learning
- Offline Policy Optimization
- Bayesian Optimization
- Generative Adversarial Networks
- Auto-tuning of parameters
- Learning-enhanced sketches
- Learned Indexes
- Learned Bloom Filters
- Learned Hashing
- Learned Query Optimization
- Learned Scheduling
- Learned Multi-dimensional Indexes and Storage Layouts
- Learned Sorting
- Learned Compression
- Learned Caching
- Learned Congestion Control
- Learned Video Compression
Prerequisites
- Programming: 6.009 (Fundamentals of Programming)
- Data Structures: 6.006 (Introduction to Algorithms) or equivalent
- Machine Learning: 6.008 (Introduction to Inference), 6.036 (Introduction to Machine Learning), 6.034 (Artificial Intelligence) or equivalent
- Systems: 6.033 or equivalent
Collaboration Policy
For labs, you are allowed to discuss your answers with other students, but please write up your own answers and list your collaborators. Copying solutions from other students is never allowed. For the group project you will work in teams and hand in only one written report. Note that we will use software to detect copying of lab and homework assignments.
Units
3-0-9.
Grading
Grades are assigned based on labs, class participation, paper summaries, and the final project. The grading breakdown is as follows:
Final Project: 40%
Labs: 40%
Participation: 15%
Paper summaries: 5%
Each student is allowed five "late days", each of which may be used to turn in one lab one day (a 24-hour period) later than it is due without penalty. After all five late days are used, assignments will be docked one letter grade for each day they are late. If you have a note from Student Support Services, please contact the instructors. For all other circumstances (interview trips, sporting events, performances, overwork, etc.) you may use your late days. If these days are not enough, please contact the instructors.
Late days may not be used for the final project.