CSCI 136 :: Spring 2021

Data Structures & Advanced Programming

Home | Schedule | Labs | Handouts | Links | CS@Williams

Lab 6: Comparators & my Favorite Sort of Vectors

The goals of this week's lab are to gain experience with
• Java's Comparator interface, and
• to use Comparator objects to sort by various criteria.

Important note: This week you may again work with a partner.

Pre-lab: Step 0

Before lab, please do the following:

Lab Assignment

In class, we saw how to implement the Comparable interface and write a compareTo method to compare two objects of the same class. While this is sufficient in many cases, especially when there is a "natural" way to order objects of a given class, other times we will want to sort the same data types in different ways. For example, we may wish to sort Student records by name in some situations and by grade point average in other situations. A single compareTo method is not sufficient to do this, because compareTo only provides one way to sort such records.

In order to sort objects in multiple ways, the structure5 package uses the Java Comparator interface. A Comparator object's sole purpose is to compare two objects of some given type and return a value indicating which one is "smaller"—that is, which comes before the other in the order imposed by the comparator. We can then sort in different ways by using different kinds of Comparator objects in the sorting algorithm. See Bailey Chapter 6.8–6.9 for a discussion of Comparators and how to use them.

In this lab we will develop an extension of Vector, called MyVector, that includes a new method sort that will (with the help of a Comparator) order the elements of the Vector. Here are the basic steps for implementing this new class:

Step 1: Get the Code

Clone your team's private repository. This repository contains starter files for MyVector.java and Student.java, plus phonebook.txt for the last part. The phone book is a few years old, so don't expect to find yourself in it (although you might recognize some of the names that you do find)! Note: this file is not for public distribution.

Step 2: MyVector.java

Fill in the class, MyVector, which is declared to be an extension of the structure5.Vector class[1]. Since we are using generic structures, the class header for MyVector will be:

              public class MyVector<E> extends Vector<E>
    

You should write a default constructor for this class that simply calls super();. This will force the constructor in Vector (the parent/super-class of MyVector) to be called. This, in turn, will initialize the protected fields of the parent class.

Step 3: sort

Inside MyVector.java, construct a new method called sort. It should have the following declaration:


	/**
	* @param c a comparator that is used to order objects in this Vector
	* @pre c is non-null
	* @post this vector's contents are ordered as determined by c
	*/
	public void sort(Comparator<E> c) { ... }
    

Comparator is a Java interface, which, essentially, looks like:


	public interface Comparator<E> {
	  /**
	  * @param a an object to compare
	  * @param b an object to compare
	  * @return  (< 0) if a is smaller than b; (0) if a equals b; and
	  * (> 0) if a is larger than b.
	  */
	  int compare(E a, E b);
	}
    

You will develop some classes that implement the Comparator interface. Comparator is parameterized by the type E of object that it can compare: in the case of the sort method, the type of c is Comparator<E>—that is, the c object must implement a comparator for the type of data (<E>) stored in the vector. This sort method then uses the Comparator object c to actually perform any needed comparisons of the values in MyVector. You may use any sort algorithm, although it may be good to start with a straightforward approach, such as selection sort. You can be more ambitious once the basic assignment is working.

When writing new comparators, you will specify what type they are defined for. For example, we would define a CardComparator class to compare Card objects as follows:


        import java.util.Comparator;

        public class CardComparator implements Comparator<Card> {
          ...
        }
    

The compare method in that class would then take two Card objects as parameters.

Be sure to test MyVector thoroughly before going on to the next part. MyVector inherits a toString() method from Vector, which should be handy for printing out the contents of your vectors during testing.

Step 4: Reading The Phonebook

You are now going to write a program that reads Williams College phone data into a MyVector and then answers some questions by sorting it with Comparators applied to the student entries. The file phonebook.txt contains student entries, represented by three lines, and separated by a line of dashes:


        Iluv C Science
        Poker Flats B5
        4135973427 3334 5406394821
        -----------------
        Steve Freund
        Thompson Physics Lab 302
        4135974260 1234 4134581234
        -----------------
        .
        .
        .
    

The first line is the name of the student, the second is their campus address, and the third contains the campus phone, su box, and home (or cell) phone. You should create a Student class which represents all the information for a single student.

Once you have your Student class working, you will write code to to read in the data file of student information and create a MyVector of Student objects; after this your code should implement logic to answer the questions listed below. Here are some implementation hints and details:

Step 5: Phonebook Questions

The ultimate goal of your program is to print out answers to the following questions. When we run your program, we expect this output to be presented in a clear, easy-to-read way. This means you should include context with each answer (e.g., rather than printing "7", you should print "There are 7 total Bill's in the student phone book.") and you should make sure that each answer is differentiable (i.e., we should easily be able to tell where one answer ends and the next begins!). In addition to printing these answers as program output, please include the answers to these questions for studentphonebook.txt in your PROBLEMS.md file.

  1. Which student appears first in a printed phone book (i.e., a phone book that has entries that are sorted by name), comparing names in the format in which they appear in the data file (i.e., first name first)?
  2. Which student has the smallest SU box? Largest?
  3. Which student has the most vowels in his or her full name? (You may ignore "Y"s when counting vowels.)
  4. Which unique address is shared by the most students, and what are their names? (Note: you should not consider "Greylock" as an address; individual dorm rooms, houses, or coops have unique addresses.) You may find it useful to build a second vector storing Associations between each address and the number of students living there. A special comparator can then be used to sort that vector by comparing the number of students at each address. Once the most common address is known, you can consult the original vector of Students and print those living at that address.
    Some students have address UNKNOWN because they are abroad, on leave, etc. These students should be ignored for this question. Any other student entries with strange formatting, should also be ignored. (But please let your instructor know if you find any weird entries.)
  5. What are the ten most common area codes for student home phone numbers, in decreasing order? Some phone numbers are -1 to indicate that the actual information is not available. You should simply disregard—for this question—students without a known home phone number.

When designing your solution, it is a good idea to reuse your vector of student data. Answering each question will likely involve sorting the vector; once you have answered a question, you can resort the vector to answer the next question. It is wasteful to create a new vector by reading in the phonebook repeatedly. If you think that you DO need auxiliary data to answer a question, that is fine. Just avoid duplicate data when you can.

checkstyle requirements:

For this lab, we will be not be adding any new checkstyle rules to the set of rules that we have used in the previous week. In fact, solutions to this lab are likely going to be less modular than past last due to the structure of the task: we are writing code that answers a specific question. So try to use good design principles when possible (e.g., move code that implements a concrete task into a helper method, apply public/private/protected visibility modifiers as appropriate, use abstraction to hide the details of your implementation, etc.), but also recognize that we may be limited by the specificity of the task.

We STRONGLY ENCOURAGE you to run checkstyle early and often when developing your code, and try to program in a way that minimizes WARNING messages. The checkstyle rules that we use in this course are based on real-world style guides; internalizing good style practices will help us write more readable code.

In total, checkstyle will enforce the following guidelines:

Step 6: Textbook Questions

In addition, answer the following questions in your PROBLEMS.md file, and submit those answers with the rest of your code this week.

Lab Deliverables

For this lab, please submit the following:

If you worked with a partner, submit one folder with your collaborative solution.

As in all labs, you will be graded on design, documentation, style, and correctness. Be sure to document your program appropriately: include pre/post conditions and assertions where appropriate. We will also be looking at how well you organize your code. Whenever you see yourself duplicating functionality, consider moving that code to a helper method. There are several opportunities in this lab to simplify your code by using helper methods. Think carefully!

Submitting Your Lab

As you complete portions of this lab, you should commit your changes and push them. Commit early and often. When the deadline arrives, we will retrieve the latest version of your code. If you are confident that you are done, please use the phrase "Lab Submission" as the commit message for your final commit. If you later decide that you have more edits to make, it is OK. We will look at the latest commit before the deadline.

We will know that the files are yours because they are in your git repository. Do not include identifying information in the code that you submit. We grade your lab programs anonymously to avoid bias. In your README.md file, please cite any sources of inspiration or collaboration (e.g., conversations with classmates). We take the honor code very seriously, and so should you. Please include the statement "I am the sole author of the work in this repository." in the comments at the top your Java files.

 

Lab notes:

  1. Avoid the statement "import java.util.*" because java.util also provides a Vector class, which might lead to unanticipated results. Instead explicitly import java.util.Comparator