CS66: Machine Learning

(Spring 2019)


Course Information

Course: MWF 10:30–11:20, Science Center 181
Professor: Sara Mathieson
Office: Science Center 249
Office hours: Monday 12:30-2pm and Friday 1-3pm
Piazza: CS66 Q&A forum

The prerequisite for this course is CS35. Machine Learning as a field has grown considerably over the past few decades. In this course, we will explore both classical and modern approaches, with an emphasis on theoretical understanding. There will be a significant math component (statistics and probability in particular), as well as a substantial implementation component (as opposed to using high-level libraries). However, during the last part of the course we will use a few modern libraries such as TensorFlow and Keras. By the end of this course, you should be able to form a hypothesis about a dataset of interest, use a variety of methods and approaches to test your hypothesis, and be able to interpret the results to form a meaningful conclusion. We will focus on real-world, publicly available datasets, not generating new data.

The language for this course is Python 3.


We will primarily be using the book An Introduction to Statistical Learning by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani. It is free and available online.

See the Schedule for each week's reading assignment, which will often be supplemented with other material and optional research papers. The schedule is tentative and subject to change throughout the semester.

Schedule (Tentative)


Jan 21


Introduction to Machine Learning

  • Machine learning terminology
  • Notation
  • K-nearest neighbors




Lab 1: K-nearest neighbors

Jan 23


Jan 25


Jan 28


Decision Trees

  • Decision trees






Lab 2: Decision trees

Jan 30

Out of town - NO CLASS

Feb 01

Out of town - NO CLASS

Drop/add ends


Feb 04


Linear Regression

  • Linear regression


  • ISL: Sections 3.1, 3.2, 3.5
  • (optional) ISL: Sections 3.3, 3.4
  • (optional) Simple linear regression by James Kirchner (2001)

Feb 06


Feb 08


Feb 11


Probabilistic Models 1

  • Introduction to probability
  • Logistic regression





Lab 3: Regression

Feb 13


Feb 15


Feb 18


Probabilistic Models 2

  • Naive Bayes





Lab 4: Probabilistic Models

Feb 20


Feb 22


Feb 25


Evaluation Metrics

  • Confusion matrices
  • Precision and recall
  • ROC curves
  • Relationship to probabilistic models
  • Cross-validation





In-lab Midterm 1

Feb 27


Mar 01


Mar 04


Ensemble Methods

  • Bagging
  • Random forests
  • Boosting





Lab 4: (cont)

Mar 06


Mar 08


Mar 11

Spring Break

Mar 13

Mar 15


Mar 18


Support Vector Machines

  • Perceptron
  • Support vector machines





Lab 5: Ensemble methods

Mar 20


Mar 22


Mar 25


SVMs (continued)

  • Lagrange multipliers
  • SVM optimization problems
  • Kernels


  • (see previous week)




Lab 6: Support vector machines

Mar 27


Mar 29

CR/NC/W Deadline


Apr 01


Topics in Deep Learning 1

  • Introduction to neural networks
  • Fully connected architectures





Lab 6: (cont)

Apr 03


Apr 05


Apr 08


Topics in Deep Learning 2

  • Convolutional neural networks (CNNs)
  • Generative adversarial networks (GANs)





Lab 7: Neural Networks

Apr 10


Apr 12


Apr 15


Unsupervised Learning

  • K-means clustering
  • Gaussian mixture models
  • Hierarchical clustering
  • Dimensionality reduction
  • Principal components analysis


  • ISL: Sections 10.1-10.3




Project: Proposal
Lab 8: (optional) Unsupervised Learning

Apr 17


Apr 19


Apr 22


Midterm Review and Special Topics

  • Midterm 2 review
  • Guest Lecture by Prof. Matt Zucker




In-lab Midterm 2

Apr 24


Apr 26


Apr 29


Special Topic: Machine Learning and Ethics

  • Deep learning in biology
  • Learning from biased datasets





Project: Presentation

May 01


May 03


Grading Policies

Grades will be weighted as follows:
35% Lab assignments
40% In-class midterms (20% each)
15% Final Project


There will be two midterms, given in lab, as shown on the Schedule. Let me know as soon as possible if you have a conflict with one of the exams.

We will not have a final exam, but we will be using the final exam slot for project presentations. The final exam slot will be released later in the semester.

Lab Policy

Our labs are on Wednesdays, and lab assignments will generally be due the following Tuesday at midnight. Lab attendance is required, and missing labs will quickly affect your participation grade. Note that Tuesday is my research day, so I will be off campus and unable to answer lab questions. Make use of office hours on Monday and Piazza anytime.

Weekly Lab Sessions
CS66 A 1:15–2:45pm Wednesdays Mathieson Clothier 016
CS66 B 3–4:30pm Wednesdays Mathieson Clothier 016

Handing in labs: Lab assignments are submitted electronically and managed using git. You may submit your assignment multiple times, but each submission overwrites the previous one and only the final submission will be graded. Most of the programming/lab assignments will be in pairs. There may also be some written assignments that will have specific instructions for handing in.

Late Policy: Each individual will be given 2 late days for the semester. A late day is a 24-hour extension from the original deadline. You can use one day on two assignments or both days on one assignment. This will encompass any reason - illness, interviews, many midterms in the same week, etc. Past these days, late assignments will not be accepted. You should budget your days to account for future illnesses or assignment deadlines for other courses. Even if you do not fully complete a lab assignment, you should submit what you have done to receive partial credit. Late days count against both partners in a group lab.
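As an informal illustration of the late-day arithmetic above, here is a minimal Python 3 sketch. The function names, the example deadline, and the reading of "midnight" as 11:59pm on Tuesday are assumptions made for illustration only; they are not part of the course policy. The sketch also assumes that any partial day past the deadline uses up a whole late day.

```python
from datetime import datetime, timedelta

LATE_DAYS_PER_SEMESTER = 2  # from the late policy: 2 late days per individual


def extended_deadline(deadline, late_days_used):
    """Return the deadline after applying whole 24-hour late days."""
    if not 0 <= late_days_used <= LATE_DAYS_PER_SEMESTER:
        raise ValueError("late_days_used must be between 0 and 2")
    return deadline + timedelta(hours=24 * late_days_used)


def late_days_needed(deadline, submitted):
    """Return how many whole late days a submission time would consume.

    Assumption: any fraction of a 24-hour period counts as a full late day.
    """
    if submitted <= deadline:
        return 0
    overdue = submitted - deadline
    # Ceiling division: floor-divide the negated duration, then negate.
    return -(-overdue // timedelta(hours=24))


# Hypothetical deadline: Tuesday, Feb 5, 2019 at 11:59pm.
example_deadline = datetime(2019, 2, 5, 23, 59)
print(extended_deadline(example_deadline, 1))
print(late_days_needed(example_deadline, example_deadline + timedelta(hours=3)))
```

A result greater than 2 from late_days_needed would mean the submission falls outside the late-day allowance described above.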

For extensions beyond these 2 late days (in the case of an emergency or ongoing personal issue), please contact your Class Dean. If your Class Dean notifies me of the issues, then we can arrange an accommodation.

Academic Integrity

Academic honesty is required in all your work. Under no circumstances may you hand in work done with (or by) someone else under your own name. Your code should never be shared with anyone; you may not examine or use code belonging to someone else, nor may you let anyone else look at or make a copy of your code. This includes, but is not limited to, obtaining solutions from students who previously took the course or code that can be found online. You may not share solutions after the due date of the assignment.

Discussing ideas and approaches to problems with others on a general level is fine (in fact, we encourage you to discuss general strategies with each other), but you should never read anyone else's code or let anyone else read your code. All code you submit must be your own with the following permissible exceptions: code distributed in class, code found in the course textbook, and code worked on with an assigned partner. In these cases, you should always include detailed comments that indicate which parts of the assignment you received help on, and what your sources were.
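One possible way to write such attribution comments in a Python 3 lab file is sketched below. The file header layout, the function names, and the sources named in the comments are all hypothetical and only illustrate the idea; the course does not prescribe this exact format.

```python
"""
Hypothetical lab file header.
Authors: <your name>, <assigned partner's name>

Sources and help received (illustrative entries only):
- euclidean_distance: adapted from code distributed in class
- majority_vote: written together with my assigned partner
"""
import math
from collections import Counter


def euclidean_distance(a, b):
    # Source: adapted from code distributed in class (hypothetical attribution).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def majority_vote(labels):
    # Help received: worked on with assigned partner (hypothetical attribution).
    return Counter(labels).most_common(1)[0][0]
```

Placing a summary in the file header and a short note next to each affected function makes it easy to see both what help was received and where it applies.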

Failure to abide by these rules constitutes academic dishonesty and will lead to a hearing of the College Judiciary Committee. According to the Faculty Handbook: "Because plagiarism is considered to be so serious a transgression, it is the opinion of the faculty that for the first offense, failure in the course and, as appropriate, suspension for a semester or deprivation of the degree in that year is suitable; for a second offense, the penalty should normally be expulsion."

The spirit of this policy applies to all course work, including code, homework solutions (e.g., proofs, analysis, written reports), and exams. Please contact me if you have any questions about what is permissible in this course.


Piazza

This semester we’ll be using Piazza, an online Q&A forum for class discussion, help with labs, clarifications, and announcements. You should have received an email invitation to join CS66 on Piazza. If you didn't, please let me know.

Piazza is meant for questions outside of regular meeting times such as office hours, class, and lab. Please do not hesitate to ask and answer questions on Piazza, but keep in mind the following guidelines:

  1. Piazza should be used for ALL content and logistics questions outside of class, lab, and office hours. Please do not email me your code or questions about the assignments.
  2. If there is a personal issue that relates only to you, please email me.
  3. We encourage non-anonymous posts, but you may post anonymously (to your classmates, not the instructors).
  4. Do NOT post long blocks of code on Piazza - if you can distill the problem to 1-2 lines of code and an error message, that’s fine, but try to avoid giving out key components of your work.
  5. By the same token, when answering a question, try to give some guiding help but do not post code fixes or explicit solutions to the problem.
  6. Posting on Piazza counts toward your participation grade, both asking and answering!

Academic Accommodations

If you believe that you need accommodations for a disability, please contact the Office of Student Disability Services (Parrish 113W) or email studentdisabilityservices at swarthmore.edu to arrange an appointment to discuss your needs. As appropriate, the Office will issue students with documented disabilities a formal Accommodations Letter. Since accommodations require early planning and are not retroactive, please contact the Office as soon as possible. For details about the accommodations process, visit the Student Disability Services website.

To receive an accommodation for a course activity, you must have an Accommodation Authorization letter from the Office of Student Disability Services and you need to meet with me to work out the details of your accommodation at least one week prior to the activity.

You are also welcome to contact me privately to discuss your academic needs. However, all disability-related accommodations must be arranged through the Office of Student Disability Services.

Links

Python style guide from Prof. Tia Newhall
Official Python style guide
Python 3.5 Documentation
Atom editor
Remote access with atom