Homework 7: Measures of sequence diversity
Due: Tuesday, Nov. 24, 11:59pm on Moodle
The goal of this assignment is to implement the measures of sequence diversity that we've been discussing in class. Computing these statistics is a common first step when you encounter a new dataset. This assignment is shorter than a normal homework so that you also have time to work on your final projects.
Here is a template Python file to start out with: hw7.py. You can modify the arguments of the functions as necessary. For the questions below, you don't need to submit a separate file. Your answers can go in the code file, as long as it is clear what each part is doing and what is being computed.
Datasets: (right-click to download)
Compute and report the number of segregating sites (SNPs) for each of these two datasets.
Compute and report π for each dataset. You can either divide by the sequence length or not (these two datasets have the same sequence length so π is comparable either way).
Compute and report the folded SFS for each dataset.
One of these datasets has a constant population size and one has undergone recent population growth. Which is which and why is that the case?
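As a reminder of the three definitions used above, here is a minimal sketch on a made-up toy alignment. It is not the hw7.py template: the function names, the list-of-strings input, and the toy sequences are all assumptions made for illustration, and reading the actual dataset files is left out. Sites with more than two alleles are simply skipped in the folded SFS, which is also an assumption.

```python
from collections import Counter
from itertools import combinations


def segregating_sites(seqs):
    """Number of alignment columns that contain more than one allele."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def pairwise_diversity(seqs, per_site=False):
    """Average number of differences between all pairs of sequences (pi)."""
    n = len(seqs)
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    pi = total_diffs / (n * (n - 1) / 2)
    # Optionally normalize by the sequence length.
    return pi / len(seqs[0]) if per_site else pi


def folded_sfs(seqs):
    """Folded SFS: entry i-1 counts sites whose minor allele appears i times."""
    n = len(seqs)
    sfs = [0] * (n // 2)
    for column in zip(*seqs):
        counts = Counter(column)
        if len(counts) == 2:  # assumption: only biallelic sites are counted
            minor_count = min(counts.values())
            sfs[minor_count - 1] += 1
    return sfs


# Hypothetical toy alignment: 4 sequences of length 5.
toy = ["AACGT", "AACGA", "TACGA", "TATGA"]
print(segregating_sites(toy))    # 3
print(pairwise_diversity(toy))   # 10 differences over 6 pairs
print(folded_sfs(toy))           # [2, 1]
```

In the toy alignment, three columns vary, the six sequence pairs differ at 10 positions in total (so π = 10/6 without dividing by length), and the folded SFS has two singleton sites and one site where the minor allele appears twice.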