CSC 390: Topics in Artificial Intelligence

Lab 5: Gaussian Mixture Models

in-class

In this lab we will be visualizing a Gaussian Mixture Model (GMM) with different covariance matrices. If your project involves k-means and the results are not very interpretable, GMM could be a great option for comparison.

For this lab, please work with your randomly assigned partner, with code on one computer. Make sure to email each other the code afterwards.

Credit: based on this Iris flower clustering exercise.

Step 1: Load the data

First, download lab5.py and look at the code. The starter code is similar to Lab 3 with the Iris flower dataset, except we will be plotting the first 2 features (out of 4) instead of the first two principal components. Load the data to obtain X and y, the same way we did in Lab 3. Run the code and make sure the scatter plot shows up.
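
If you want a sanity check, here is a minimal sketch of this step using sklearn's built-in Iris loader (your lab5.py starter code may load and plot the data slightly differently):

from sklearn import datasets
import matplotlib.pyplot as plt

iris = datasets.load_iris()
X = iris.data      # shape (150, 4): four measurements per flower
y = iris.target    # shape (150,): species labels 0, 1, 2

# scatter plot of the first two features, colored by species
plt.scatter(X[:, 0], X[:, 1], c=y)
plt.xlabel(iris.feature_names[0])
plt.ylabel(iris.feature_names[1])
plt.show()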

Step 2: Gaussian mixture model

The next step is to create the Gaussian mixture model and fit it to our data. Here is the GMM documentation.

Create an instance and then use "fit" with our features (X). Use 3 components and the argument:


covariance_type='spherical' # use spherical covariances to start
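
Putting that together, one possible sketch (assuming the variable name gaussian_mixture_model, which is also used in Step 4):

from sklearn.mixture import GaussianMixture

# 3 components, one per Iris species; spherical covariances to start
gaussian_mixture_model = GaussianMixture(n_components=3,
                                         covariance_type='spherical')
gaussian_mixture_model.fit(X)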

Step 3: Model parameters

After fitting the model, print the means and covariances. Make sure the dimensions make sense given the number of components and the number of features.
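
For example, continuing with the gaussian_mixture_model name assumed above:

print(gaussian_mixture_model.means_)        # shape (n_components, n_features)
print(gaussian_mixture_model.covariances_)  # shape depends on covariance_type;
                                            # 'spherical' gives one variance per component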

Step 4: Plot the covariances

Using the provided make_ellipses function, we can visualize the covariances. To call it, pass in your GMM object as well as the axes for your graph.


make_ellipses(gaussian_mixture_model, ax) # pass in your model object
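
One way to wire this up (assuming make_ellipses comes from the starter code and draws on top of the same scatter plot of the first two features):

fig, ax = plt.subplots()
ax.scatter(X[:, 0], X[:, 1], c=y)
make_ellipses(gaussian_mixture_model, ax)  # helper provided in the starter code
plt.show()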

Make sure you can visualize the results. Since this is an unsupervised learning approach, the cluster colors may not line up with the true species colors (you could run it a few times until they do). Does the plot agree with our notion of a spherical covariance matrix?

Step 5: Different covariances

Now change the covariance type to 'diag' (diagonal covariance matrices) and re-run the code, as sketched below. Does this look like a better or worse fit? Do the same for 'full' and 'tied' and comment on the results.
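
One hypothetical way to structure the comparison is a loop over the covariance types, reusing the imports and variables from the earlier steps (you can also just edit the covariance_type string and re-run):

for cov_type in ['spherical', 'diag', 'full', 'tied']:
    gmm = GaussianMixture(n_components=3, covariance_type=cov_type)
    gmm.fit(X)
    fig, ax = plt.subplots()
    ax.set_title(cov_type)               # label each plot with its covariance type
    ax.scatter(X[:, 0], X[:, 1], c=y)
    make_ellipses(gmm, ax)               # ellipse-drawing helper from the starter code
    plt.show()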

Step 6: Discuss projects

If you have time at the end, tell each other about your final project topic and what you plan to do first with the data. Provide some constructive feedback to each other.

Step 7: PCA

If you have even more time, run PCA on the data first, then redo the covariance experiments using the transformed data for visualization (think about whether to fit the GMM on the original data or the transformed data).
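
Here is a sketch of one option, fitting the GMM on the PCA-transformed data; whether that is the right choice is part of the question above:

from sklearn.decomposition import PCA

# project the data onto its first two principal components
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)

# one option: fit on the transformed data and visualize in PCA space
gmm = GaussianMixture(n_components=3, covariance_type='spherical')
gmm.fit(X_pca)

fig, ax = plt.subplots()
ax.scatter(X_pca[:, 0], X_pca[:, 1], c=y)
make_ellipses(gmm, ax)  # same helper from the starter code
plt.show()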