CSCI 136

CSCI 136 :: Spring 2021

Data Structures & Advanced Programming

Lab 10: Exam Scheduling

This week, you will write a program (that could be used) to schedule final exams for the Registrar so that no student has two exams at the same time. The goals of this lab are to:
  • Gain experience using basic graph building and traversal operations.
  • Develop a fairly sophisticated algorithm requiring several coordinated data structures.

You will use a greedy algorithm to determine an assignment of classes to exam slots such that:
  • No student is enrolled in two courses assigned to the same exam slot.
  • Any attempt to combine two exam slots into one slot would violate rule 1.
The second requirement ensures that we do not gratuitously waste exam slots (students would like to start their breaks as soon as possible, after all).

Program Input

Before describing the solution strategy, let's examine the data. The input to your program will be a text file containing (fictitious) student class information. For example:

            Bill Lenhart
            CSCI 136
            MATH 251
            ENGL 201
            PHIL 101
            Bill Jannen
            PSYC 212
            ENGL 201
            HIST 301
            CSCI 136
            Sam McCauley
            SOCI 201
            CSCI 136
            MATH 251
            PSYC 212

For each student, there will be five consecutive lines. The first line is a student's name, and the next four are the courses for that student. In the above example:

Bill Lenhart is taking CSCI 136, MATH 251, ENGL 201, and PHIL 101;
Bill Jannen is taking PSYC 212, ENGL 201, HIST 301, and CSCI 136; and
Sam McCauley is taking SOCI 201 CSCI 136, MATH 251, and PSYC 212.

We will provide small, medium, and large input files in your starter repository so you can test your program.

Note that whenever you process data in the "real world", your code should carefully handle inputs that are not properly formatted. For the purpose of this lab, however, you can assume that all files are properly formatted and that all students take exactly four courses.

Program Output

The output of your program should be a schedule that satisfies two constraints:

No student is enrolled in two courses assigned to the same exam slot (we can't be in two places at once, after all).
Any attempt to combine two exam slots into one slot would violate rule 1.

This schedule should be provided as a list of time slots with the courses whose final will be given at that slot. For example, the output below describes a valid schedule for the student data above [1]:

            Slot 1: HIST 301 PHIL 101 SOCI 201
            Slot 2: CSCI 136
            Slot 3: ENGL 201
            Slot 4: MATH 251
            Slot 5: PSYC 212

An Overview of the Algorithm

The key to doing this assignment is to build a graph as you read in the file of students and their schedules, where:

Each node of the graph is a course taken by at least one student in the college.
An edge is drawn between two nodes if there is at least one student taking both courses.
The label of an edge could be the number of students with both classes (although we don't really need the weights for this program).

Thus if there are only the three students listed above, the graph would be as given below (edges without a weight label have weight 1).

A greedy algorithm to find an exam schedule satisfying our two constraints would work as follows. Choose a course (say, PHIL 101) and stick it in the first time slot. Search for a course to which it is not connected. If you find one (e.g., HIST 301), add it to the time slot. Now try to find another which is not connected to any of those already in the time slot. If you find one (e.g., SOCI 201), add it to the time slot. Continue until all nodes in the graph are connected to at least one element in the time slot. When this happens, no more courses can be added to the time slot (why?). By the way, the final set of elements in the time slot is said to be a maximal independent set in the graph. Once you have this set, you could remove them from the graph, or you could mark them as visited and then ignore them in future passes. Below we'll assume that we've removed them from the graph.

If there are remaining nodes in the graph (there should be!), pick one and make it part of a new time slot, then try adding other courses to this new slot as described above. Continue adding time slots for remaining courses until all courses are taken care of. Then print the exam schedule. For the graph shown, here is one solution:

            Slot 1: PHIL 101, HIST 301, SOCI 201
            Slot 2: MATH 251
            Slot 3: CSCI 136
            Slot 4: ENGL 201
            Slot 5: PSYC 212

Notice that no pair of time slots can be combined without creating a time conflict with a student. Unfortunately, this is not the minimal schedule as one can be formed with only four time slots. (See if you can find one!) Thus, if a greedy algorithm of this sort gives you a schedule with x slots, no two of which can be combined, a different selection of courses in slots may result in fewer than x slots. Any schedule satisfying our constraints will be acceptable (although see below for extension ideas to compute the optimal solution).

More Details

We recommend that you use a list-based graph implementation rather than a matrix-based one. (Why does that make the most sense for this application? Think about the relative strengths of list-based and matrix-based graph representations as you implement the lab.)

In your GraphListUndirected graph, Vertex labels should be the course names. It likely makes sense to use Integer values for Edge labels, but depending on your implementation, there is no need to keep the Edge labels up-to-date.

You should develop a Student class that stores information about each student. This likely includes their name and the list of courses they are taking (although there are always four courses, you might want a dynamic list, like a Vector to allow for the potential of students with variable courseloads).

Here is one possible way to find a collection of maximal independent sets from the graph (a more concrete description of the greedy algorithm described above): represent each "slot" by some sort of a list (or, if you're implementing some of the extensions, consider a binary search tree). To find a maximal independent set for a slot, pick any vertex of the graph and add it to the list. Then, iterate through all other vertices of the graph; if a vertex is not connected to any of the vertices already in the slot, add that vertex to the slot. Continue until you have tried all vertices. Now delete from the graph all vertices that you added to the slot[2]. Continue to fill successive slots in the same way until there are no vertices left in the graph.

Think carefully about the tasks in this lab. Identify the concrete steps. You should make helper methods that perform each of those concrete steps. In your ExamScheduler.java file, good helper methods will likely be static methods that are passed Vectors, Graphs, or other data structures as parameters. The reason for this design is that there isn't really a strong notion of a logical "ExamScheduler" object, so defining local variables rather global member variables will mean that you have more control over the scope and lifetime of your data structures.

Note that you are required to submit one of the extensions described in the next section. Read through the list of extensions before you design your schedule-building algorithm: The extension(s) you choose to implement may benefit from the choices of data structures that make here!

Extensions

For full credit on this lab, please complete one of the extensions to the program described below. You can choose whichever of the extensions below that you wish. If you would like to explore the problem at a deeper level or apply other concepts we have learned so far this semester, we encourage you to try more than one of the extensions, but you are only required to choose one. Some of these extensions are easier than others, but in our experience, the more challenging extensions are also more interesting!

Print out a final exam schedule ordered by course name/number (ie, AFR 100 would be first, and WGST 999 would be last, if such courses are offered this semester). For each course, print out which slot it is scheduled in.
Print out a final exam schedule for each student, listing students in alphabetical order. For each student, print out each of their courses, and which slot that course is scheduled in.
Randomize! To find a good solution for large files, repeatedly use the greedy algorithm on random orderings of the nodes. After running for a while, you may get lucky and find a schedule close to the optimal, even if you are not guaranteed to find the true optimal. Feel free to explore other approaches as well. As output, list the largest and smallest solution found in a given run.
Arrange the time slots in an order which tries to minimize the number of students who must take exams in three consecutive time slots. This is trickier than the other options!

Feel free to add other features as well. Be sure sure to indicate in the heading of your program (in comments) which extension(s) you have included.

checkstyle requirements:

For this lab, we will be not be adding any new checkstyle rules to the set of rules that we have used in previous weeks.

We STRONGLY ENCOURAGE you to run checkstyle early and often when developing your code, and try to program in a way that minimizes WARNING messages. The checkstyle rules that we use in this course are based on real-world style guides; internalizing good style practices will help us write more readable code.

In total, checkstyle will enforce the following guidelines:

All class variables that are not final must be declared private or protected (i.e., no public member variables unless they are constants). (We don't expect this to be an issue this week.)
All public methods must include a “Javadoc” comment (starts with /** and ends with */; it should include descriptions of the function at the top, descriptions of return values after a @return tag, descriptions of each argument after a @param tag, and pre/post conditions after the @pre or @post tags).
No methods should be longer than 30 lines (excluding whitespace and single-line comments).

To run checkstyle, you would type the following command at the terminal:


            $ ./checkstyle

The ./ is peculiar to Unix: it tells the terminal to look for the checkstyle program in the current directory. This command will run checkstyle on every Java program in your directory. To run checkstyle on a specific Java file, type:


            $ ./checkstyle SomeFile.java

Lab Deliverables

Please complete and submit the following items in your repository:

Your well-documented source code for Student.java and ExamScheduler.java (plus any additional files that you might wish to create for optional extensions).
The files README.md and Rubric.md (there are no thought questions this week, so just be thoughtful when completing your lab!).

Submitting Your Lab

As you complete various milestones, you should commit your changes and push them. Commit early and often. When the deadline arrives, we will retrieve the latest version of your code. If you are confident that you are done, please include "Lab Submission" as the commit message for your final commit. If you later decide that you have more edits to make, it is OK. We will look at the latest commit before the deadline.

Be sure to push your changes to GitLab
Verify your changes on GitLab. Navigate in your web browser to your team's private repository. You should see all changes reflected in the various files that you submit. If not, go back and make sure you committed and pushed.

We will know that the files are yours because they are in your git repository. Do not include identifying information in the code that you submit. Our goal is to grade the programs anonymously to avoid any bias. However, in your README.md file, please cite any sources of inspiration or collaboration (e.g., conversations with classmates). We take the honor code very seriously, and so should you. Please include the statement "I am the sole author of the work in this repository." in the comments at the top your java files.

Footnotes

[1] Different implementations of the algorithm may lead to different results!
[2] Or just mark them as visited and ignore them in future passes. Can you think of a reason why this might be better or worse than deleting them?