CSC 111: Intro to Computer Science through Programming

Lab 11

Due: Friday, Apr. 28 at 11:59pm on Moodle

This lab will focus on our last topic: sorting and search. When dealing with large datasets, the ability to search for specific information quickly is essential, and we use these algorithms every day through online search engines.

For this lab, first find your randomly assigned partner. Introduce yourselves - the person with the first name that comes last alphabetically should begin as the "driver", with the other partner as the "navigator". The driver will have the code open, and the navigator will have these instructions open.

At the end of the lab, email the code (finished or not) and transcript to the person who started as the navigator. If you do not finish during lab you have two options:

(1) Arrange to meet before Friday and finish the lab together.

(2) Continue the code separately and denote the part you did on your own with a comment.

Note: it is not an option for one person to complete the code on their own and then send the finished code to their partner to submit. Any code that you submit should be either written by you, or written by you and your partner while you were pair programming. Both partners should submit their code on Moodle.

Part A: Sorting

In the first part of this lab, we will write a function to sort the elements in a list. After sorting, it will be much easier to find a specific element quickly without going through every element in the list. Start from the file lab11.py, which contains two global lists for testing. Then follow the steps below to write a sorting function.
  1. Function input/output

    Start by writing a function called sort, which will take as input one list, and return a new list which contains all the elements of the input list, but sorted. There are many different algorithms for sorting, but we will implement the one described in the next step.

  2. Sorting algorithm

    The idea behind this sorting algorithm is to first find the minimum element in the list (you may use the built-in function min(lst)). Then add this element on to a new list (this will be our sorted list). Finally, remove the minimum element from the input list. Test out the following code in the shell to see how remove works:

    remove

    Repeat this process (find min, add to new list, remove from old list) until each element has been transfered from the input list to the new list.

  3. Test

    After you've written the sort function (and returned the new list), test your code in main. I would recommend printing the new list as you're building it up to make sure your code is working properly. Sort both test lists and print out the results in main. The first test (and output in blue) is shown below:

    sort

    sort

Switch Driver and Navigator here!


Part B: Searching

In the next part of the lab we will search through a sorted list to find the closest element to our input. For example, if I Google "compuetr science", I'll probably still get results back about "computer science", even though I spelled it wrong. We'll use the same idea here - if we can't find the specific element, we will return the closest result.

Refer to the class notes from Wednesday for more details about the algorithm we will use, which is called binary search. This is a recursive algorithm, so the steps below will detail the base case and recursive call.

  1. Function input/output

    Write a function called binary_search, which will take in two arguments: an item to search for, and a sorted list. This function should return the element in the list that is closest to the given item.

  2. Base case

    Think about the case when we have a list with only one element, called x. So lst = [x]. If I search for an item called k, then k is not in the list, but I'll return x anyway because it is the closest element to k. This is the idea behind the base case. After using recursion to search through smaller and smaller lists, eventually we'll end at a list with one element, and we must return that element.

  3. Recursive call

    If the list has more than one element, (2 or more), we will recurse. Here, we will first need to find the middle element of the list. For example, if our list is: [ham, ram, yam], then the middle element is "ram". Think about how to use the length of the list to obtain the middle element. If the length of the list is even, the middle element can be right before or right after the midpoint, either way is fine.

    Then we want to compare the given item (what we're searching for) to the middle element. If the item is less than the middle element, we want to make a recursive call using the first half of the list. If the item is greater than the middle element, we want to make a recursive call using the last half of the list. The idea is that we're narrowing down the window where the item could be. Think about how to use slicing to obtain the desired portion of the list.

    If the item is equal to the middle element, you can return the middle element directly.

    Switch Driver and Navigator here!


  4. Testing 1

    At the very top of your binary_search function (very first line), I would recommend printing the list. As the function is called recursively, the list size should decrease by a factor of 2 each time. For example, here is a test on the first list with the item "slam". This is not in the list, but we will return the closest element (either "scam" or "spam" would be fine to return here).

    sort

    sort

    We can break this example down step by step:

    • In this example we can see that initially, the item "slam" is greater than the middle element "lamb". So we choose to call our recursive function on the last half of the list (from "lamb" on).

    • In this new list, the middle element is "spam" (could be either "scam" or "spam", but I chose "spam"). Since "slam" is less than "spam", we'll recurse on the first half of the list, which is ["lamb", "ram", "scam"].

    • In this new list, the middle elmeent is "ram". Since "slam" is greater than "ram", we'll recurse on the second half of the list.

    • We continue this process until we have a list with only one element: ["scam"], so we'll return "scam".

    Perform a few more tests as shown below to make sure your recursive function is working correctly.

    sort

  5. Testing 2

    Now test on the second list (the list of students in 111). Design at least three test names and perform these test, printing the results in a similar fashion to the last tests.

When your program is run, the main function should print out both sorted lists and all your testing. Save just this final output into a plain text file called transcript.txt and submit that with your code.

Extensions

  • In addition to this binary search function, also write a linear search function that will take the same types of input (an item to search for an a sorted list). Step through the elements of the list one by one until you get to the "closest" element to the given item. It should return a similar result to your binary search function (off by one for items between two elements is okay).

  • Now that you have a binary search function and a linear search function, we can compare them to see which one is better. For each function, devise a way of counting the number of comparisons performed (i.e. when you use the greater than or less than sign). For a long list (list of students works well), count the number of comparisons for each function. How can you interpret the results?

Submit

Make sure that both you and your partner have a copy of all the code written during the lab period. Both partners should submit the files:

  • lab11.py

  • transcript.txt

on Moodle. If you did not finish, you can either meet to finish or finish separately. All labs must be submitted by Friday night. If you finish early, I encourage you to work on the final project, but you are also welcome to depart.