Lab 5: Sorting

The goals of this week’s lab are to gain experience with

  • Java’s Comparator interface, and
  • sorting by various criteria.

PRE-LAB: Step 0

Due Tuesdau, 3/10 by 5pm.

  • Partner preference form. We will assign you a partner for this lab. If you would like to opt out of certain pairings (up to 3), fill out the following Google Form. If you are happy to be partnered with any classmate, you need not fill out the form.

PRE-LAB: Step 1

Before lab, please do the following:

  • Design document: As in prior labs, you should develop a design document before coming to lab. Although you will be working with a partner for this lab, you should create an independent design document so that you and your partner can compare approaches. It is OK if your final design deviates from your design document.
  • Read the assigned pages from Chapter 6 (see the Readings page), and bring your questions to class.

Lab Assignment

In class, you will see how to implement the Comparable interface and how to write a compareTo method to compare two objects of the same class. While this is sufficient in many cases, sometimes we want to sort the same data in different ways. For example, we may wish to sort Student records by name in some situations and by grade point average in other situations. A single compareTo method is not sufficient to do this because compareTo only provides one way to sort such records.

In order to sort objects in multiple ways, the structure5 package uses the Java Comparator interface. A Comparator object’s sole purpose is to compare two objects of a given type and return a value indicating which one is “smaller”—in other words, which object comes before the other in the order imposed by the comparator. We can then sort in different ways by using different kinds of Comparator objects in the sorting algorithm. See Bailey Chapter 6.8—6.9 for a discussion of Comparators and how to use them.

In this lab we will develop an extension of Vector, called MyVector, that includes a method called sort. This sort method orders the elements of the Vector with the help of a Comparator. Here are the basic steps for implementing this new class:


Step 1: Get the code

Clone your team’s private repository cs136lab05_sorting-USERNAMES. This repository contains starter files for MyVector.java and Student.java, plus newphonebook.txt and testphonebook.txt for the last part. The phone book is a few years old, so don’t expect to find yourself in it! Also, this file is not for public distribution. Note that testphonebook.txt is a smaller phone book to make testing easier during development.


Step 2: Implement MyVector.java

Fill in the class, MyVector, which is declared to be an extension of the structure5 Vector class. [footnote: 1] Since we are using generic structures, the class declaration for MyVector should begin:

public class MyVector<E> extends Vector<E>

You should write a default constructor for this class that simply calls super();. This will force the constructor in Vector (the parent/super-class of MyVector) to be called. This, in turn, will initialize the protected fields of the parent class.


Step 3: Implement sort

Inside MyVector.java, construct a new method called sort. It should have the following declaration:

// pre: c is a valid comparator
// post: sort this vector in the order determined by c
public void sort(Comparator<E> c)

Comparator is a Java interface, which, essentially, looks like:

public interface Comparator<E> {
    /*
     * Returns:  < 0  if a is smaller than b
     *             0  if a equals b
     *           > 0  if a is larger than b
     */
    int compare(E a, E b);
}

You will develop some classes that implement the Comparator interface. Comparator is parameterized by the type E of object that it compares. In the case of the sort method, the type of c is Comparator<E>—that is, c must implement a comparator for the type of data (E) stored in the vector. This sort method then uses the Comparator object c to perform comparisons of the values in MyVector. You may use any sorting algorithm you want in your sort implementation, although it may be best to start with a simple algorithm like selection sort. You can be more ambitious once the basic assignment is working.

When writing new comparators, you will specify what type they are defined for. For example, we would define a CardComparator class to compare Card objects as follows:

import java.util.Comparator;

public class CardComparator implements Comparator<Card> {
    ...
}

The compare method in that class would then take two Card objects as parameters.

Be sure to test MyVector thoroughly before going on to the next part. MyVector inherits a toString() method from Vector, which should be handy for printing out the contents of your vectors during testing.


Step 4: Reading the Phone Book

Now, write a program that reads Williams College phone data into a MyVector and answers some questions by sorting it with the appropriate Comparator applied. The file newphonebook.txt contains student entries, represented by three lines, and separated by a line of dashes:

Iluv C Science
Poker Flats B5
4135973427 3334 5406394821
-----------------
Jeannie Albrecht
Thompson Chemistry Lab 304
4135974251 1234 4134581234
-----------------
...

The first line is the name of the student, the second is their campus address, and the third contains the campus telephone number, SU box number, and home (or cell) phone number. You will need to create a Student class to represent all the information for a single student.

Once you have your Student class working, you will write code to read in the data file of student information and create a MyVector of Student objects. Your code should then perform whatever operations are necessary to answer the questions in step 5 below. [footnote: 2]

Note: you should read in phone numbers as long data (using the Scanner’s nextLong() method) rather than int data, because integer variables cannot store numbers much greater than 2 billion due to their internal representation.


Step 5: Answer Phone Book Questions

Your program should print out answers to the following questions. Include the answers to these questions in your PROBLEMS.md file.

  1. Which student appears first in a printed phone book if names are printed as they appear in the data file (i.e., first name first)?
  2. Which student has the smallest SU box number? The largest?
  3. Which student has the greatest number of vowels in their full name? You may ignore “y”s when counting vowels.
  4. Which address is shared by the most students, and what are their names? You may find it useful to build a second vector that stores an Association between an address and the number of students living there. A special Comparator can then be used to sort that vector by comparing the number of students at each address. Once the most common address is known, you can consult the original vector of students to print out the people living at that address.

    Treat dorm rooms as unique addresses. For example, if two students shared a double in Morgan 13, they would have one address; however, those students would not share an address with any student living in Morgan 17.

    Note that some students have the address UNKNOWN because they are abroad, on leave, etc. These students should be ignored for this question. Other student entries with strange formatting should also be ignored (but please let your instructor know if you find any malformed entries).

  5. What are the ten most common area codes for student home phone numbers in decreasing order? A phone number of -1 indicates that information is not available. For this question, you should disregard students without a known home phone number.

Step 6: Textbook Questions

In addition, answer the following questions in a file called PROBLEMS.md, and submit it with the rest of your code this week.

  1. Problem 6.3 (refer to bubbleSort implementation in the text)
  2. Problem 6.4 (ignore the average case)
  3. Problem 6.13 (for those sorts that are not stable, explain why)
  4. Provide a brief, high-level description of the purpose of the code in each submitted Java file.
  5. Finally, demonstrate that your program answers the phone book questions in step 5 by pasting in your program’s output. Please format this nicely so that it is easy for us to see that you correctly answered each question!

Lab Deliverables

By the start of lab, you should see a new private repository called cs136lab05_sorting-{USERNAMES} in your GitHub account (where USERNAMES is replaced by your usernames).

For this lab, please submit the following:

cs136lab05_sorting-{USERNAMES}/
    README.md
    PROBLEMS.md
    MISTAKES.md
    MyVector.java
    Student.java
    newphonebook.txt
    testphonebook.txt

The MyVector.java and Student.java files contain starter code, and you should write all of your functions there. Recall in the previous lab that you had a TestLinkedList.java file that contained a convenient main method pre-populated with a variety of helpful tests. It is always a good practice to create a small set of tests to facilitate development, and you are encouraged to do so here.

As with all labs, you will be graded on design, documentation, style, and correctness. Be sure to document your program appropriately: include pre/post conditions and assertions where appropriate. We will also be looking at how well you organize your code. Whenever you see yourself duplicating functionality, consider moving that code to a helper method. There are several opportunities in this lab to simplify your code by using helper methods. Think carefully!


Submitting Your Lab

As you complete portions of this lab, you should commit your changes and push them. Commit early and often. When the deadline arrives, we will retrieve the latest version of your code. If you are confident that you are done, please use the phrase "Lab Submission" as the commit message for your final commit. If you later decide that you have more edits to make, it is OK. We will look at the latest commit before the deadline.

  • Be sure to push your changes to GitHub.
  • Verify your changes on GitHub. Navigate in your web browser to your private repository on GitHub. It should be available at https://github.com/williams-cs/cs136lab05_sorting-USERNAMES. You should see all changes reflected in the files that you push. If not, go back and make sure you have both committed and pushed.

We will know that the files are yours because they are in your git repository. Do not include identifying information in the code that you submit. We grade your lab programs anonymously to avoid bias. In your README.md file, please cite any sources of inspiration or collaboration (e.g., conversations with classmates). We take the honor code very seriously, and so should you. Please include the statement "We are the sole authors of the work in this repository." in the comments at the top of your Java files.


The squeaky wheel gets the grease

We are always looking to make our labs better. Please submit answers to the following questions using the anonymous feedback form for this class:

  1. How difficult was this assignment on a scale from 1 to 5 (1 = super easy, …, 5 = super hard).
  2. What made you first decide that you wanted to take a programming class?

Bonus: Mistakes

Did you find any mistakes in this writeup? If so, add a file called MISTAKES.md to your GitHub repository and describe the mistakes using bullets. For example, you might write

* Where it says "bypass the auxiliary sensor" you should have written "bypass the primary sensor".
* You spelled "college" wrong ("collej").
* A quadrilateral has four edges, not "too many to count" as you state.

You will receive 1 bonus point on this assignment for each mistake we are able to validate.


Lab notes:

[1] Avoid the statement import java.util.* for this assignment because java.util also provides a Vector class. This lab utilizes the Vector class from structure5. Instead, write import java.util.Comparator to import just Comparator.

[2] While you can write all of your code in the main method of your Student class, we consider this poor form. Instead, create a new class that will be responsible for reading in the data and performing each operation.

  • CSCI 136, Spring 2020