"Unsupervised Machine Learning"
Fall 2016
This course will begin with a brief introduction to artificial intelligence (AI) and how the material in this course fits into the overall field of AI. We will discuss the difference between supervised and unsupervised learning, starting with a few key methods in supervised learning. Then we will move on to the main focus of the course. Unsupervised learning seeks to uncover underlying structure in a dataset or system, without the use of labeled data. We will explore unsupervised learning methods from a variety of angles, including theory, implementation, application, existing software, and recent literature. Throughout the course we will investigate a variety of datasets, with an emphasis on learning from "big data" (e.g. natural language and biological datasets).
Class meetings will be a combination of interactive lecture, mini-labs, oral presentations by the students, and discussion of research papers. Homeworks will be a mix of programming assignments, readings, and pencil-and-paper exercises. There will be a mid-semester oral presentation (15-20 minutes) and a mid-semester written literature review in an area of the student's interest. Building on this work, during the last third of the course, students will explore a topic of their choice, culminating in a final oral presentation and a written report. In all aspects of the course, there will be a focus on effective communication of ideas and questions in a multidisciplinary context.
The programming aspects of assignments will generally be in Python, but any language is welcome for the final project. Homeworks will be submitted online through Moodle.
PDF and associated datasets available online:
We will be using Piazza for online class discussion, homework help, announcements, clarifications, etc. Our class page is:
Questions about course content that apply to the whole class should be posted (non-privately) on Piazza. Individual questions about projects or presentations are fine over email.
Do not email me or post long blocks of code on Piazza. If you can distill the problem to 1-2 lines of code and an error message, post on Piazza.
Each student may take a 3-day extension on one assignment throughout the semester (except for presentations). No other late work will be accepted. The only exceptions to this policy are:
Electronic devices may be used in class as long as they are directed towards course material (taking notes, in-class lab, etc.).
Two class meetings may be missed without affecting your participation grade.
Collaboration is encouraged in this course, especially because different backgrounds and skill sets are necessary for making progress in a research setting. Additionally, for this capstone course, one of the goals is to learn where to look for information and how to use available resources. However, code and written work should be produced and understood by each individual student. For each assignment, please cite your classmate collaborators, books, and online resources, as per the Smith College honor code:
"Smith College expects all students to be honest and committed to the principles of academic and intellectual integrity in their preparation and submission of course work and examinations. All submitted work of any kind must be the original work of the student who must cite all the sources used in its preparation."
Additional Resources