Writing Adaptable Code

I was recently running experiments for a research paper that I'm writing, and I needed to report the mean and the standard deviation for a series of test runs (Good science demands reproducible results!). Let's write a simple program to calculate the average of a group of numbers.

Simple Average: Not Elegant

public class Average {
    public static void main(String args[]) {
        int nums[] = new int[] { 1, 1, 2, 3, 5 };
        double sum = 0.0;
        for (int i = 0; i < nums.length; i++) {
            sum += nums[i];
        }
        System.out.println("The average is :" + sum / nums.length);
    }
}

We have a main method that first creates an array of integers. We declare double sum = 0.0; to create our "accumulator". We add all of the values in the array into this accumulator, and then we divide by the array length to calculate the average. It is important that sum is a double, because dividing two int values truncates the result (truncation discards the fractional part, which for positive values has the effect of rounding down).
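
To see why the accumulator's type matters, here is a small sketch that divides the same sum both ways. The class name DivisionDemo and its two helper methods are made up for this illustration.

```java
class DivisionDemo {
    // Integer division: both operands are int, so the fractional part is dropped.
    public static int intAverage(int sum, int count) {
        return sum / count;
    }

    // Floating-point division: the double operand causes the int operand
    // to be promoted to double before dividing.
    public static double doubleAverage(double sum, int count) {
        return sum / count;
    }

    public static void main(String[] args) {
        System.out.println(intAverage(12, 5));     // prints 2
        System.out.println(doubleAverage(12, 5));  // prints 2.4
        System.out.println(intAverage(-7, 2));     // prints -3 (truncated toward zero)
    }
}
```

The last line shows that truncation is not the same as rounding down once negative values are involved.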

Aside: the way that int nums[] is declared and initialized may be new to some of us. This is an example of "inline array initialization". I rarely use this syntax myself, but I wanted to show you that you can declare an array and assign it initial values all in one line.

This program is not elegant. What if I want to average a different set of numbers?

Adding an Average Function: Still Not Quite Elegant

I want my code to be modular and reusable. Let's move the task that calculates the array average into a function so that it can be reused elsewhere in my code.

public class Average {
    public static double arrayAverage(int nums[]) {
        double sum = 0.0;
        for (int i = 0; i < nums.length; i++) {
            sum += nums[i];
        }
        return sum / nums.length;
    }
    public static void main(String args[]) {
        int nums[] = new int[] { 1, 1, 2, 3, 5 };
        System.out.println("The average is :" + arrayAverage(nums));
    }
}

One thing you may notice is that the arrayAverage() function is declared static. What does it mean for a function to be static?

We use the static keyword to describe methods or fields that are associated with a class, rather than associated with an instance of the class.
- static variables can be accessed using the class name, e.g., Integer.MAX_VALUE or Math.PI.
- static methods are invoked using the class name, e.g., Integer.parseInt(...) or Math.abs(...).

In the Average class, there is nothing about the method arrayAverage that relies on an Average object's internal state. In other words, it does not need to access any fields or methods that would be reached through the keyword this. All of the state that arrayAverage needs to do its work is either provided as a parameter or created inside the scope of the method itself. It is completely self-contained.
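
As a rough illustration of the difference, the sketch below puts one instance method and one static method side by side. The Counter class and its methods are hypothetical and are not part of Average.java.

```java
class Counter {
    private int count = 0;   // per-object state

    // Instance method: reads and updates this object's own field.
    public int increment() {
        this.count++;
        return this.count;
    }

    // Static method: uses only its parameters, so no Counter object is needed.
    public static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        // Called through the class name.
        System.out.println(Counter.add(2, 3));   // prints 5

        // Called on an instance, because the result depends on that instance's state.
        Counter c = new Counter();
        c.increment();
        System.out.println(c.increment());       // prints 2
    }
}
```

Like arrayAverage, the add method could be moved to any other class without changing its behavior, because it depends on nothing but its arguments.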

Although this is an improvement over our first version of Average.java, it still averages a hard-coded set of numbers stored in our array.

More Flexibility: Command-line Input

public class Average {
    public static double arrayAverage(int nums[]) {
        double sum = 0.0;
        for (int i = 0; i < nums.length; i++) {
            sum += nums[i];
        }
        return sum / nums.length;
    }
    public static double arrayAverage(String nums[]) {
        double sum = 0.0;
        for (int i = 0; i < nums.length; i++) {
            sum += Integer.valueOf(nums[i]);
        }
        return sum / nums.length;
    }
    public static void main(String args[]) {
        System.out.println("The average is :" + arrayAverage(args));
    }
}

Now, instead of creating an array inside my program, which would require me to recompile each time my numbers change, I read my values from the command line and convert each String input to an Integer. Note that Java lets me overload my functions. I created a new function, also called arrayAverage, except my new function accepts an array of String. Java looks at the name, the number of parameters, and the types of the parameters, and it calls the appropriate function (if one exists). To overload, however, my functions cannot be ambiguous: their parameter lists must differ.
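
The following sketch shows overload selection in isolation. The class OverloadDemo and the describe methods are invented for this example; only the overloading rules themselves come from Java.

```java
class OverloadDemo {
    // Chosen when the argument is an int array.
    public static String describe(int[] nums) {
        return "int[] with " + nums.length + " elements";
    }

    // Same name, different parameter type: chosen for a String array.
    public static String describe(String[] nums) {
        return "String[] with " + nums.length + " elements";
    }

    public static void main(String[] args) {
        System.out.println(describe(new int[] { 1, 2, 3 }));     // prints int[] with 3 elements
        System.out.println(describe(new String[] { "1", "2" })); // prints String[] with 2 elements

        // describe(null) would not compile: null matches both overloads,
        // so the compiler reports the call as ambiguous.
    }
}
```

The compiler makes this choice from the declared types of the arguments, so the decision is fixed before the program runs.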

To handle bad inputs, I should do some exception handling. I can tell Java to try to convert from String to int, but if I get a bad input and Integer.valueOf throws a NumberFormatException, I can gracefully skip that case by "catch"ing the exception.

    try {
        sum += Integer.valueOf(nums[i]);
    } catch (NumberFormatException e) {
        System.out.println("Bad input! " + nums[i]);
    }
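
One detail the snippet above leaves open: a skipped input still counts toward nums.length, so the divisor would be too large. Here is one possible way to put the pieces together, counting only the values that parsed. The class name SafeAverage, the method name averageOfValid, and the choice to use Integer.parseInt are assumptions made for this sketch.

```java
class SafeAverage {
    // Averages the entries that parse as integers.
    // Entries that do not parse are reported and skipped.
    public static double averageOfValid(String[] nums) {
        double sum = 0.0;
        int valid = 0;   // how many inputs were successfully converted
        for (int i = 0; i < nums.length; i++) {
            try {
                sum += Integer.parseInt(nums[i]);
                valid++;   // reached only if parsing succeeded
            } catch (NumberFormatException e) {
                System.out.println("Bad input! " + nums[i]);
            }
        }
        // With no valid inputs, 0.0 / 0 evaluates to NaN rather than throwing.
        return sum / valid;
    }

    public static void main(String[] args) {
        System.out.println("The average is: " + averageOfValid(args));
    }
}
```

Running java SafeAverage 1 x 3 would report x as a bad input and average only the 1 and the 3.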

Things Get Tricky: Scanner Input

Recall my motivation for this task. I wanted to calculate the average of the performance numbers in an experiment that I am running. I want to be able to run a bunch of experiments, save the results to a file, and calculate the average later. I can use a Scanner to do that.

File `Average.java`

import java.util.Scanner;
public class Average {
    public static double arrayAverage(int nums[]) {
        double sum = 0.0;
        for (int i = 0; i < nums.length; i++) {
            sum += nums[i];
        }
        return sum / nums.length;
    }
    public static void main(String args[]) {
        Scanner scan = new Scanner(System.in);
        int nums[] = new int[5];
        int i = 0;
        while (scan.hasNextInt()) {
            nums[i] = scan.nextInt();
            i++;
        }
        System.out.println(arrayAverage(nums));
    }
}

File `data.txt`

1 2 3 4 5

To read the contents of a file and send those contents to System.in, I can use the < less-than sign at the terminal. This is called input redirection, and it is handled by the shell rather than by Java. We go into the low-level details of < in CSCI 432: Operating Systems. Since OS is a 400-level course and it has more prerequisites than any course in the major, we will skip the details for now :)

 ~/:> javac Average.java
 ~/:> java Average < data.txt
 3.0

Hooray! This works.

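But what if I don't have exactly 5 numbers? With more than 5, I exceed the array capacity and my program crashes with an ArrayIndexOutOfBoundsException. With fewer than 5, the unused slots stay at 0 and I still divide by 5, so the average is silently wrong. Doh!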

What I really want is something that acts like an array in almost every way. Except I want that thing to grow and shrink as I add and remove data. What I want is an extensible array!
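
To make the idea concrete, here is a hand-rolled sketch of an array that grows on demand. It is not the structure5 implementation; the class name GrowableIntArray, the starting capacity of 2, and the doubling policy are all assumptions chosen for illustration.

```java
import java.util.Arrays;

class GrowableIntArray {
    private int[] data = new int[2];  // backing array; starting capacity chosen arbitrarily
    private int size = 0;             // number of slots actually in use

    // Appends a value, enlarging the backing array first if it is full.
    public void add(int value) {
        if (size == data.length) {
            // Copy the existing elements into an array twice as large.
            data = Arrays.copyOf(data, data.length * 2);
        }
        data[size] = value;
        size++;
    }

    // Returns the element at index i, rejecting indices beyond the used portion.
    public int get(int i) {
        if (i < 0 || i >= size) {
            throw new ArrayIndexOutOfBoundsException(i);
        }
        return data[i];
    }

    public int size() {
        return size;
    }

    public static void main(String[] args) {
        GrowableIntArray nums = new GrowableIntArray();
        for (int i = 1; i <= 7; i++) {
            nums.add(i);   // the backing array grows from 2 to 4 to 8 along the way
        }
        System.out.println(nums.size());   // prints 7
    }
}
```

Because size tracks only the slots in use, an average computed over size() elements is not skewed by empty slots, which was the problem with the fixed array of 5.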

Vectors

The java.util.Vector class has very comprehensive Javadoc. That code supports many advanced features that are beyond the scope of what we will do in this course (for instance, java.util.Vector is thread-safe... what's a thread?). In this course, we will be using the Vector class that is part of the structure5 package included with the textbook. It has the same interface and general behaviors as the standard Java Vector implementation, but the structure5 code has two added advantages: it is easy to read, and we can modify it and experiment with it on our own systems.

Here is a "fixed" version of our code that uses vectors of integers to store our numbers.

import structure5.*;
import java.util.Scanner;
public class Average {
    public static double arrayAverage(Vector nums) {
        double sum = 0.0;
        for (int i = 0; i < nums.size(); i++) {
            sum += (Integer) nums.get(i);
        }
        return sum / nums.size();
    }
    public static void main(String args[]) {
        Scanner scan = new Scanner(System.in);
        Vector nums = new Vector(5);
        while (scan.hasNextInt()) {
            nums.add(scan.nextInt());
        }
        System.out.println(arrayAverage(nums));
    }
}

Notes:

- At the top of the program, you see import structure5.*;. This tells my program to import all classes in the structure5 package, which includes Vector.
- Instead of the length field, the Vector class has a public method named size().
- Instead of the bracket syntax [i], we use get(int i) and set(int i, Object o). In addition, Vector supports add(Object o), which appends an object to the end.
- We called the Vector constructor with a parameter (5). This sets the newly created Vector's initial capacity, not its size: the vector starts out empty, and it grows as data is added. Alternatively, you can call the 0-parameter constructor to create a vector with the default initial capacity.
- We needed to cast objects to retrieve them from our vector. This is dangerous: if the object is not actually an Integer, the cast fails at run time with a ClassCastException.
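
Here is a small sketch of how that cast can go wrong. So that it runs without the textbook code, it uses java.util.Vector instead of the structure5 Vector, which is an assumption of this example; the class RawVectorDemo and the method castFails are also made up.

```java
import java.util.Vector;

@SuppressWarnings({ "rawtypes", "unchecked" })
class RawVectorDemo {
    // Returns true if retrieving element i as an Integer fails at run time.
    public static boolean castFails(Vector v, int i) {
        try {
            Integer n = (Integer) v.get(i);   // the compiler trusts this cast
            return false;
        } catch (ClassCastException e) {
            return true;                      // the element was not an Integer
        }
    }

    public static void main(String[] args) {
        Vector v = new Vector();
        v.add(42);
        v.add("forty-two");   // compiles: a raw Vector accepts any Object

        System.out.println(castFails(v, 0));   // prints false
        System.out.println(castFails(v, 1));   // prints true
    }
}
```

Nothing stopped the String from going into the vector; the mistake only shows up later, at the line that casts.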

Safer Coding with Generics

You may have noticed if you compiled the code above that you do not get any errors, but you do get a message from the compiler:

Note: Average.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

If we follow those directions (which you always should), we get the following output:

 ~/:> javac Average.java -Xlint:unchecked
 Average.java:17: warning: [unchecked] unchecked call to add(E) as a member of the raw type Vector
         nums.add(scan.nextInt());
   where E is a type-variable:
     E extends Object declared in class Vector
 1 warning

This seems dangerous. One of the nice things about Java being statically typed is that the compiler can give us useful feedback before we run our program. If we accidentally use a String where we meant to use an integer, we would like to find out at compile time rather than waiting until run time to see our error.

Generics are a way we can give the compiler extra information about the type of the values that we are storing in our Vector. When using a generic class, we tell the compiler this type information in two places:

- When we declare the variable: Vector<Integer> nums
- When we call the constructor: new Vector<Integer>(5);

Now, when we call add, set, and get, for example, the compiler makes sure we are storing and retrieving Integer values. If not, we get an error and can fix it before running our code. Here is a version of the same code with all of the generic type information included.

import structure5.*;
import java.util.Scanner;
public class Average {
    public static double arrayAverage(Vector<Integer> nums) {
        double sum = 0.0;
        for (int i = 0; i < nums.size(); i++) {
            sum += nums.get(i);
        }
        return sum / nums.size();
    }
    public static void main(String args[]) {
        Scanner scan = new Scanner(System.in);
        Vector<Integer> nums = new Vector<Integer>(5);
        while (scan.hasNextInt()) {
            nums.add(scan.nextInt());
        }
        System.out.println(arrayAverage(nums));
    }
}

Note that the value returned by nums.get(i) no longer needs a cast. Casting is dangerous and must be performed with care. Letting the compiler check our types for us is a huge advantage of generics. There are other advantages that we will discuss later in the course. We will also see how to create a class that uses generics.
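
As a minimal sketch of the compile-time check, the example below shows a line that the compiler would reject. It assumes java.util.Vector in place of the structure5 Vector so that it runs on its own, and the class GenericVectorDemo and its sum method are hypothetical.

```java
import java.util.Vector;

class GenericVectorDemo {
    // Adds up the elements; get returns an Integer, so no cast is needed.
    public static int sum(Vector<Integer> nums) {
        int total = 0;
        for (int i = 0; i < nums.size(); i++) {
            total += nums.get(i);   // the Integer is unboxed to int automatically
        }
        return total;
    }

    public static void main(String[] args) {
        Vector<Integer> nums = new Vector<Integer>();
        nums.add(1);
        nums.add(2);

        // nums.add("three");   // rejected by the compiler: a String is not an Integer

        System.out.println(sum(nums));   // prints 3
    }
}
```

Uncommenting the add("three") line turns a potential run-time failure into a compile-time error, which is the trade the generic type parameter buys us.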