Category Archives: Programming

Debugging JDBC Connections with SQuirreL

Most Java applications use JDBC drivers to connect to a persistent store or database. Different databases however use different JDBC drivers and connection URL formats, and it can be very useful in case of problems to have a way to test that you are using the correct type and version of JDBC driver for your database and the correct connection parameters (connection URL, login credentials, etc) without having to write any code. Squirrel SQL is a universal Java database client that can be used for this purpose.

Squirrel SQL is open source, entirely written in Java, and can be used to connect to any database (SQL or NoSQL) for which there exists a JDBC driver. In this article we will explore how quick and simple it is to test a JDBC connection with Squirrel SQL.

Installation

The first step is to download the Squirrel SQL installer which can be found here:

http://squirrel-sql.sourceforge.net/

Note: if the URL has changed just type “Squirrel SQL client” on your favourite search engine to find it.

The installer is an executable JAR file, therefore you need to have a Java Runtime Environment installed on your machine. To run the installer, double-click on the JAR file or type on the command line:

$ java -jar squirrel-sql-<version>-install.jar

Follow the setup wizard to select the installation directory, keeping the defaults on the plugin selection screen. I also recommend selecting the option to create a desktop shortcut.

Squirrel SQL shortcut icon

Setup

The installation directory will contain the squirrel-sql startup script (.bat in Windows or .sh in Linux). Run the script for your platform or double click the desktop icon to run Squirrel SQL.

Squirrel SQL main screen

Squirrel SQL main screen

We will now use Squirrel SQL to connect to an Oracle database. The same steps can be followed to connect to any type of database, SQL or NoSQL, for which there is a JDBC driver.

Of course you need to know some essential details about the database you are running such as version, the host name or IP address and the port number. If you don’t know how to collect such information consult your database documentation.

The database I am going to connect to in this example is an Oracle Database 11g Express Edition Release 11.2.0.2.0 running on localhost. The Oracle instance identifier (SID) is “xe” listening to port 1521.

Knowing the version of the database you are running will help you identify the JDBC driver you need. Normally database vendors provide a JDBC driver download website. Oracle’s is located at the following address:

http://www.oracle.com/technetwork/database/features/jdbc/index-091264.html

The site lists the drivers for the various Oracle versions. In this case I need to download the driver for version 11g release 2. I will select the JDBC thin driver (type 4) because it is platform independent as it is entirely written in Java and use Java sockets to connect directly to the database. If you are connecting to one specific database the Type 4 thin driver is normally the recommended choice.

Download the JDBC driver for your database and save it locally. The driver I have downloaded is a Java Archive (JAR) file named ojdbc6.jar.

JDBC driver file

Installing the driver

Connecting to a database with Squirrel SQL is quite simple. First you install the JDBC driver for your database, then you create an “alias” where you configure the parameters to connect to a specific instance. Once connected you can browse the database objects or issue SQL commands.

To install the JDBC driver, click on the Drivers tab on the left hand side of Squirrel’s main screen. You will see a list of drivers appearing in alphabetical order, where you need to locate the driver for your database. In this example, we will select (with a double click) “Oracle Thin Driver”.

Squirrel SQL Driver Setup

Squirrel SQL Driver Setup

The following dialog displays showing the driver’s name, example URL and class name:

Driver Setup Dialog

Driver Setup Dialog

Now select the “Extra Class Path” tab, and click on the [Add] button. Browse to the location where you saved the JDBC driver you downloaded in the previous step, and click [Open]. The driver will be listed. Click OK to complete the installation. You will be returned to the driver’s list, and if all is OK a check mark will appear next to the installed driver.

Making the connection

Now that the JDBC driver is installed, the last step is to create an alias to connect to the database instance you are running.

To accomplish this task select the “Aliases” tab on the left-hand side of Squirrel’s main screen, then click on the blue + (plus sign) icon. The “Add Alias” dialog will display:

Coonection Setup Dialog

Coonection Setup Dialog

From the driver’s drop-down list select the JDBC driver you installed in the previous step. Give the alias an arbitrary name that describes the instance you are connecting to, fill the missing details in the JDBC URL (hostname, port number, database name), enter user name and password, then click [Test] to test the connection. Adjust the parameters if necessary until the test succeeds. You now have the correct driver and connection parameters to use in your application or to configure a data source in your application server.

A Brief History of Objects

Most modern programming languages have support for Object-Oriented programming (OOP) but the OOP journey started back in 1967 with the appearance of Simula, the first programming language to use objects and classes.

As the name suggests, Simula was designed to perform simulations. Indeed it is useful to think of OOP as a way to simulate real-world entities (objects) in software.

Most real-world entities can be represented through their state and behaviour, and this is exactly what OOP does. For instance a Person class of objects will have state properties such as name and date of birth, and behaviour (or methods) such as celebrateBirthday.

In 1972 Smalltalk, based on Simula, was created for educational purposes at Xerox and in 1980 it became the first commercial release of an OOP environment.

In 1979 Bjarne Stroustrup began working on OOP enhancements to the C programming language, and in 1985 the C++ language was born bringing OOP into mainstream software development of large-scale commercial systems.

The complexity of C++ motivated a development team at Sun Microsystems to create a pure OOP language that was cross-platform and had automatic memory management, leading to the first release of Java to be announced in 1995.

And the rest, as they say, is history.

How to Implement a Simple HashSet in Java

In the first part of this post we discussed the design of hash-based collections and why classes must obey the general contract stipulated by the equals() and hashCode() methods of the java.lang.Object class. Now we will put in practice what we have discussed by writing a simple implementation of a HashSet in Java.

Our simple HashSet will implement the following interface:

import java.util.Iterator;

public interface SimpleSet {

    // Adds the specified element to this set if it is not already present
    boolean add(Object element);
    // Removes the specified element from this set if it is present
    boolean remove(Object element);
    // Returns true if this set contains the specified element
    boolean contains(Object element);
    // Returns an iterator over the elements in this set
    Iterator iterator();
    // Returns the number of elements in this set
    int size();
}

 
Now let’s move to the implementation details.

Since we will be adopting chaining to resolve collisions, we need an inner class that we can use to build a linked list of collided objects. We will call the inner class Entry.

public class SimpleHashSet implements SimpleSet {

    private static class Entry {
        Object key;
        Entry next;
    }

    // ...

 
The bucket array is an array of Entry instances. We also need a private property to store the size of the Set.

    private Entry[] buckets;

    private int size;

    // ...

 
Our simple HashSet constructor takes a single int argument which defines the capacity of the bucket array:

    public SimpleHashSet(int capacity) {

        buckets = new Entry[capacity];
        size = 0;
    }
    
    // ...

 
Next we need to define a hash function that computes the index into the bucket array giving the hash value of an object:

     private int hashFunction(int hashCode) {

        int index = hashCode;
        if (index < 0) { index = -index; }
        return index % buckets.length;
    }

    // ...

 
We can now start implementing the interface methods.

The algorithm for the add method first computes the index into the bucket array using our hash function.

If there is already one or more entries in that bucket location they are compared for equality with the object to add. If any of the comparisons is positive then the method returns false as a Set cannot store duplicates.

If no duplicates are found than a new Entry is created and assigned to the computed bucket location and linked to the chain if one exists. The size property is incremented and the method returns true to indicate that the object has been added to the Set.

    @Override
    public boolean add(Object element) {

        int index = hashFunction(element.hashCode());
        Entry current = buckets[index];

        while (current != null) {
            // element is already in set
            if (current.key.equals(element)) { return false; }
            // otherwise visit next entry in the bucket
            current = current.next;
        }
        // no element found so add new entry
        Entry entry = new Entry();
        entry.key = element;
        // current Entry is null if bucket is empty
        // if it is not null it becomes next Entry
        entry.next  = buckets[index];
        buckets[index] = entry;
        size++;
        return true;
    }

    // ...

 
The algorithm for the remove method also first computes the index into the bucket array using our hash function.

If there is already one or more entries in that bucket location they are compared for equality with the object to remove. If a match is found the object is removed and the chain, if present, adjusted. The size property is decremented and the method returns true to indicate that the object has been removed.

If no match is found then the method returns false, as there is nothing to remove.

    @Override
    public boolean remove(Object element) {

        int index = hashFunction(element.hashCode());
        Entry current = buckets[index];
        Entry previous = null;

        while (current != null) {
            // element found so remove it
            if (current.key.equals(element)) {

                if (previous == null) {
                    buckets[index] = current.next;
                } else {
                    previous.next = current.next;
                }
                size--;
                return true;
            }

            previous = current;
            current = current.next;
        }
        // no element found nothing to remove
        return false;
    }

    // ...

 
The algorithm for the contains method again first computes the index into the bucket array using our hash function.

If there is already one or more entries in that bucket location they are compared for equality with the object passed as argument. If a match is found the method returns true, otherwise it returns false.

    @Override
    public boolean contains(Object element) {

        int index = hashFunction(element.hashCode());
        Entry current = buckets[index];

        while (current != null) {
            // check if node contains element
            if (current.key.equals(element)) { return true; }
            // otherwise visit next node in the bucket
            current = current.next;
        }
        // no element found
        return false;
    }

    // ...

 
The iterator method simply returns an instance of an inner class that implements the java.util.Iterator interface.

    @Override
    public Iterator iterator() {
        return new SimpleHashSetIterator();
    }

    // ...

 
The code for the Iterator implementation follows:

    class SimpleHashSetIterator implements Iterator {

        private int currentBucket;
        private int previousBucket;
        private Entry currentEntry;
        private Entry previousEntry;

        public SimpleHashSetIterator() {
            currentEntry = null;
            previousEntry = null;
            currentBucket = -1;
            previousBucket = -1;
        }

        @Override
        public boolean hasNext() {

            // currentEntry node has next
            if (currentEntry != null && currentEntry.next != null) { return true; }

            // there are still nodes
            for (int index = currentBucket+1; index < buckets.length; index++) {
                if (buckets[index] != null) { return true; }
            }

            // nothing left
            return false;
        }

        @Override
        public Object next() {

            previousEntry = currentEntry;
            previousBucket = currentBucket;

            // if either the current or next node are null
            if (currentEntry == null || currentEntry.next == null) {

                // go to next bucket
                currentBucket++;

                // keep going until you find a bucket with a node
                while (currentBucket < buckets.length &&
                        buckets[currentBucket] == null) {
                    // go to next bucket
                    currentBucket++;
                }

                // if bucket array index still in bounds
                // make it the current node
                if (currentBucket < buckets.length) {
                    currentEntry = buckets[currentBucket];
                }
                // otherwise there are no more elements
                else {
                    throw new NoSuchElementException();
                }
            }
            // go to the next element in bucket
            else {

                currentEntry = currentEntry.next;
            }

            // return the element in the current node
            return currentEntry.key;

        }

    }

    // ...

 

The implementation of the size method simply returns the value of the size property:

    public int size() {
        return size;
    }

    // ...

 
Finally we override the java.lang.Object toString method to produce a visual representation of our HashSet.

    @Override
    public String toString() {

        Entry currentEntry = null;
        StringBuffer sb = new StringBuffer();

        // loop through the array
        for (int index=0; index < buckets.length; index++) {
            // we have an entry
            if (buckets[index] != null) {
                currentEntry = buckets[index];
                sb.append("[" + index + "]");
                sb.append(" " + currentEntry.key.toString());
                while (currentEntry.next != null) {
                    currentEntry = currentEntry.next;
                    sb.append(" -> " + currentEntry.key.toString());
                }
                sb.append('\n');
            }
        }

        return sb.toString();
    }

    // ...

 
The full source code can be found on GitHub at the following URLs:

SimpleSet (interface)
SimpleHashSet (implementation)
SimpleHashSetTest (unit tests)

This is of course a very basic implementation of a HashSet in Java. It is useful to study and understand the main principles behind hash-based collections, or to code on interviews to demonstrate your knowledge of a hash-based data structure and its algorithms.

If you want to study a production quality implementation with all the required optimizations, I suggest that you take a look at the source code of java.util.HashSet. To find out how to look at the Java library source code in Eclipse please read this post.

Thank you for reading, feel free to post any questions or comments.

Hash-based collections and the Object class

Nick Bit and Dick Byte are aspiring computer scientists, always up to something. This time they get to grips with hash-based collections.

Nick: “Hey dude, let’s build a data structure in Java that does everything in O(1)”
Dick: “Sounds cool. You can’t get any better than constant time”

[Nick and Dick are of course talking about algorithm time complexity measure and the Big O notation]

Dick: “Where do we start?”
Nick: “We could build it on top of an array. If you know the index of an element in the array you can do everything in O(1)”
Dick: “What do you mean?”
Nick: “Take a look at this code. Knowing the index allows me to insert, access and delete an entry in constant time”

int apple_index = 4;

String[] fruits = new String[24];
String apple = "Apple";
// insert
fruits[apple_index] = apple;
// access
String lunch = fruits[apple_index];
// remove
fruits[apple_index] = null;

 

Dick: “I see… but how do we relate an object with an integer number that we can use as the index into the array without hard-coding it?”
Nick: “ As it happens the java.lang.Object class already provides this mechanism for us with the hashCode() method. It attempts to return distinct integers for distinct objects”
Dick: “Cool. So if we have an array big enough we can store any object in it using the hash code as the index and get constant time performance for all operations!”
Nick: “Let’s do it. Why don’t you start writing the code while I get us a coffee?”
Dick: “OK”

[Dick gets started with the code, but don’t try this at home]

Object[] objArray = new Object[Integer.MAX_VALUE];
String banana = "Banana";
objArray[banana.hashCode()] = banana;

 

Nick: “I am back”
Dick: “Wow! I think I just blew the JVM…”
Nick: “Let me see”

Exception in thread "main" java.lang.OutOfMemoryError: 
Requested array size exceeds VM limit

   at com.nickanddick.App.main(App.java:7)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

 

Nick: “That’s daft. We are asking the JVM to allocate memory for 2^16 object instances. That’s more than 2 billion objects!”
Dick: “Wow! That’s a lot of memory”
Nick: “We have been concentrating on time complexity but forgot to take space complexity into consideration”
Dick: “That’s efficient memory usage, right?”
Nick: “Right. We don’t really need an array this big. We just need an array big enough to contain the number of elements required by a particular application”
Dick: “But if we use a smaller array, what happens if the hash code returned for an object is outside the array bounds?”

[Good question. It is time we give Nick and Dick some help]

Bucket Arrays and Hash Functions

To efficiently implement their data structure they need the two fundamental components of a hash-based collection: a bucket array and a hash function.

A bucket array is an ordinary array of a pre-defined capacity (number of elements).

A hash function is a function that translates the hash code of an object into a bucket array index. A simple hash function could for example determine the index by finding the remainder of the division of the hash code by the bucket array length like this:

index = hashCode % array.length

[The % operator is called the remainder or modulus operator]

There a number of considerations to take into account when implementing a hash-based collection.

Collisions

Hash-based collections do not store duplicate values, as duplicate values yield the same index into the bucket array.

There is however a problem. It is possible that a hash function returns the same index for different objects. Suppose we have a bucket size of 128 and we want to add to it two objects obj1 and obj2 whose hash codes are respectively 138 and 266. Applying the simple hash function defined above yields:

Array index for obj1 = 138 % 128 = 10
Array index for obj2 = 266 % 128 = 10

When the hash function returns the same index location for different objects we have what is called a collision. One way to deal with collisions is to use a simple technique called chaining. It works by storing a linked list of objects allocated to the same bucket array index. We will see how that works in practice in the next post.

Load Factor

The probability of collisions is proportional to the capacity of the bucket array. Larger bucket arrays will produce a smaller number of collisions for a given hash function. The relation between the number of elements in a hash collection and the size of the bucket array is called the load factor of the hash collection.

A hash collection with a smaller load factor will produce a smaller number of collisions and therefore will be more efficient.

Object.hashCode() and Object.equals()

Classes whose instances are meant to be used in a hash-based collection should override both the hashCode() and equals() methods of java.lang.Object and obey their general contracts.

The implementation of hashCode() in java.lang.Object returns distinct integers for distinct objects (as much as it is reasonably practical) even when the objects are logically equal and therefore might not produce the expected result when used in a hash-based collection.

https://docs.oracle.com/javase/8/docs/api/java/lang/Object.html#hashCode–

Likewise, the java.lang.Object implementation of equals() is not ideal as an object is only equals to itself, and should be overriden.

https://docs.oracle.com/javase/8/docs/api/java/lang/Object.html#equals-java.lang.Object-

The general contract and algorithms for overriding hashCode() and equals() are described at length in Effective Java by Joshua Bloch. The key rule is that equal objects must have equal hash codes.

In practice all the major IDEs are capable to generate reasonably correct hashCode() and equals() methods for a class.

Other Considerations

Classes whose instances are meant to be used as keys in a Hash Map should also be immutable. It is recommended that the class also implements the java.lang.Comparable interface.

A Simple Hash Set Implementation

In the second part of this post we will look at a simple HashSet implementation to see how it all hangs together in practice. See you then.

Towards Functional Java: Closures

This is the third post in the Towards Functional Java series. In the previous posts we examined how to dynamically define functions for later execution using method references and lambda expressions. The capability to create and manipulate functions at runtime is at the core of functional programming.

Sometimes a lambda expression may access variables that are defined outside of its scope. These are called free variables. Consider the following code:

public class Printer {
	
  public void printSubstring(String text, int beginIndex, int endIndex) { 
    Runnable r = () -> {
      System.out.println(text.substring(beginIndex, endIndex));
    };	
    new Thread(r).start();
  }

  public static void main(String[] args) {
    Printer printer = new Printer();
    printer.printSubstring("Hello!", 0, 4);
    printer.printSubstring("My name is John", 3, 10);
    printer.printSubstring("How are you today?", 4, 11);
  }
}

 

The Runnable implementation accesses three variables that are not defined in the scope of the lambda expression: text, beginIndex, endIndex. These variables are defined in the scope of the enclosing printSubstring method which may have terminated by the time the thread is executed. Normally when a method completes execution its local variables are removed from memory.

The program however executes correctly and the lambda expression somehow retains the values of the enclosing method arguments. A function or block of code along with the captured values of its associated free variables is called a closure.

Java lambda expressions are therefore closures but with an important restriction: they only close over values but not variables. If we try to modify the value of a free variable inside a lambda expression we get a compiler error. Let's modify the Runnable implementation slightly to demonstrate this:

Runnable r = () -> {
  System.out.println(text.substring(beginIndex, endIndex--));
};	

 

We have tried to decrement the value of the free variable endIndex in the lambda expression. The compiler complains with the following message:

Local variable endIndex defined in an enclosing scope must be final or effectively final

Although it is not required that free variables be declared final, effectively they are not allowed to change. In other functional programming languages there is no such restriction, but Java had to reach a compromise since to implement “real closures” would require a massive effort and possibly compromise backward compatibility and/or support for other JVM languages.

The “closure for immutable variables” implemented by Java 8 lambda expressions is a pragmatic solution that still achieves function portability without causing too much inconvenience to the programmer.

In the next post of this series we will look at more Java 8 functional features. Feel free to leave any comments or suggestions. Thank you for reading.

Array Sorting Algorithms in the Java Library

The class java.util.Arrays provides overloaded static methods for sorting arrays of primitive types (byte, char, int, long, float, double) as well as objects. The entire array or a range within the array can be specified.

For each type the library provides two overloaded methods: a single-threaded sort() method and a multithreaded parallelSort() implementation.

It is interesting to analyse the time complexity of the algorithms used in the library and compare its performance against that of a text book merge-sort implementation.

Single-threaded Sorting of Primitive Type Arrays

The Java SE 8 API documentation provides the following notes regarding the implementation of the sorting algorithms for primitive types:

“The sorting algorithm is a Dual-Pivot Quicksort by Vladimir Yaroslavskiy, Jon Bentley, and Joshua Bloch. This algorithm offers O(n log(n)) performance on many data sets that cause other quicksorts to degrade to quadratic performance, and is typically faster than traditional (one-pivot) Quicksort implementations.”

If you look at the library source code you will find that the methods actually switch to different algorithms based on the size of the input array. For instance for tiny arrays of less than 50 elements it switches to the insertion-sort algorithm which is very efficient when the number of inversions is small (that is the number of pairs of elements that are out of order).

Single-threaded Sorting of Object Arrays

The library provides an overloaded sort() method for arrays of objects that implement the java.lang.Comparable interface and another that takes a java.util.Comparator implementation as argument if you need to specify custom ordering.

As for the algorithm used the Java SE 8 API documentation provides the following implementation notes:

“This implementation is a stable, adaptive, iterative merge-sort that requires far fewer than n log(n) comparisons when the input array is partially sorted, while offering the performance of a traditional merge-sort when the input array is randomly ordered. If the input array is nearly sorted, the implementation requires approximately n comparisons. Temporary storage requirements vary from a small constant for nearly sorted input arrays to n/2 object references for randomly ordered input arrays”

As noted before the implementation actually switches algorithms based on the size of the input array.

Multithreaded Array Sorting

The algorithm used by the overloaded parallelSort() methods for both primitive and object types still relies on the single-threaded implementations but leverages the Fork-Join framework to obtain parallelism as described in the API documentation:

“The sorting algorithm is a parallel sort-merge that breaks the array into sub-arrays that are themselves sorted and then merged. When the sub-array length reaches a minimum granularity, the sub-array is sorted using the appropriate Arrays.sort method. If the length of the specified array is less than the minimum granularity, then it is sorted using the appropriate Arrays.sort method. The algorithm requires a working space no greater than the size of the original array. The ForkJoin common pool is used to execute any parallel tasks”

Performance Comparison

We will now compare the execution time of the library sorting algorithms for object arrays against a textbook implementation of merge-sort . The source code is on GitHub.

The data used for the tests are arrays of random Integer objects generated by a utility class. The size of the arrays is 10^6 or 1,000,000 elements with a bound value of 10^7 or 10,000,000.

The running time is measured by taking the difference between System.currentTimeMillis() calls before and after the execution of each algorithm.

The JUnit tests are executed by the Maven Surefire Plugin from the command line on a Windows 7 Professional machine based on Intel Core i5-3320M CPU running @ 2.60 GHz clock speed.

These are the results in milliseconds:

Run Merge-Sort JDK sort() JDK parallelSort()
1 966 510 239
2 958 533 241
3 971 505 241
4 989 507 237
5 965 493 237

Whilst the absolute running time is not necessarily indicative as it is heavily influenced by environmental factors, the relative running times should give a good indication of performance differences between the different algorithms when executed under the same conditions.

The full source code can be found on this GitHub repository setup as a Maven project. The development environment used is described here.

Big O Notation Demystified

Our job as programmers is to solve real-world problems by writing algorithms. Often however there is more than one possible solution to a given problem.

For example, suppose you need to implement a Java interface defining a method to sort an array of integers:

public interface Sorter {
  public void sort(int array[]);
}

 
You could use any of the well-known sorting algorithms or even create a new one to solve the problem. But which implementation would you actually use in a production system? Provided there are no computer memory constraints you will want to use the most efficient solution i.e. the algorithm with the best average case performance in terms of running time.

If the size of the array is relatively small the difference in performance is probably negligible but it becomes more and more significant as the size of the array grows. The time taken by an algorithm to execute in relation to the size of the input is called time complexity of the algorithm and is commonly expressed using the big O notation. Evaluating algorithms in this way is called asymptotic analysis in computer science.

We can safely assume for the purposes of this discussion that an algorithm is equivalent to a method or function and that the input to an algorithm is the method or function argument. Measuring the size of the input of course will depend on the type of the argument. If the input is an array or higher-level collection its size is determined by the number of elements in the array or collection. If the input is a String object its size is determined by the number of characters in the string.

Now that we have defined the notion of time complexity let’s see how the big O notation is used to express some of the most common cases.

O(1) = Constant Time

As an illustration let’s consider the following Java method that returns the number of elements in an integer array:

public int elements (int[] array) {
  return array.length;
}

 
Whatever the size of the array passed as argument the time taken by this method to execute will be always the same as it performs only a single operation. The method is said to run in constant-time and its time complexity is expressed as O(1).

O(n) = Linear Time

If the running time of a method increases in line with the input size the method is said to run in linear time and its time complexity is expressed as O(n).

An example of linear time complexity is the following Java method that calculates the sum of all the elements in an integer array:

public int total (int[] array) {
  int total = 0;
  for (int index = 0; index < array.length; index++) {
    total += array[index];
  }
  return total;
}

 
For instance given an input size of 256 this method will perform 256 x operations.

O(log n) = Logarithmic Time

Computers use a binary numeral system in which numbers are represented by sequences of 0 (zero) and 1 (one) also known as the base-2 numeral system.

The base 2 is used everywhere in computer science. For instance to work out how many positive integer numbers you can represent with an 8-bit word you raise the base 2 to the power of 8 to obtain 256 i.e. 2^8 = 256.

The logarithm operation does the inverse of exponentiation. It returns the exponent to which the base 2 must be raised to produce a number. For instance log(256) = 8.

Methods that run in logarithmic time are much more efficient than linear ones. Whereas a linear time method with an input size of 256 will perform 256 x operations, a logarithmic time method will only need to perform 8 x operations with the same input size.

Searching a binary tree is an example of time complexity of O(log n).

O(n log n) = Linearithmic Time

Given an input size of 256 a method that runs in linearithmic time will perform 2,048 x operations:

n * log(n) = 256 * log(256) = 256 * 8 = 2,048

The merge sort algorithm is an example of time complexity of O(n log n).

O(n^2) = Quadratic Time

Given an input size of 256 a method that runs in quadratic time will perform 65,536 x operations:

n ^ 2 = 256 ^ 2 = 65,536

The bubble sort algorithm is an example of time complexity of O(n^2).

The above are some of the most common time complexity measures. You can see more examples on the Big O Cheat Sheet website.

We have been assuming so far that the choice of algorithm is based solely on its time complexity without taking into account any other resource constraints.

However the amount of memory required by an algorithm is also an important decisional factor. The amount of working storage an algorithm needs in relation to the size of the input is defined as the algorithm’s space complexity and it is also commonly expressed using the Big O notation.

There is often a trade-off between time and space requirements, and in practice both need to be taken into account when selecting an algorithm.

Towards Functional Java: Method References

In the previous post of the Towards Functional Java series, we saw how to create a reference to a method implementation using lambda expressions and functional interfaces, passing it as an argument to methods in the same way we are used to pass object references.

Note: to try out the examples in this post, you need a Java 8 development environment. If you do not have one, you can follow this tutorial to set one up.

In the following JUnit test, we implement the functional interface Comparator so that it compares two strings by length (instead of alphabetical order) using a lambda expression, and assign the implementation to the variable comp. We then use the variable name to pass the implementation to the Collections.sort method to sort a List of strings. The API doc can be found here:

https://docs.oracle.com/javase/8/docs/api/java/util/Collections.html#sort-java.util.List-java.util.Comparator-

@Test
public void lambdaExpressionTest() {

   Comparator<String> comp = (str1, str2) -> 
       Integer.compare(str1.length(), str2.length());

   List<String> list = new ArrayList<String>();
   list.add("abcd");
   list.add("abcdef");
   list.add("ab");

   Collections.sort(list, comp);

   assertTrue(list.get(0).equals("ab"));
}

 
Now, suppose the above functionality was already provided by a method of an existing class, as in the following example:

public class AlternativeStringComparison {	
   public int compareByLength(String str1, String str2) {
      return Integer.compare( str1.length(), str2.length());
   }
}

 
You can refer to the compareByLength method directly using the following notation:

@Test
public void lengthComparisonTest() {	
   AlternativeStringComparison comp = new AlternativeStringComparison();
      
   List<String> list = new ArrayList<String>();
   list.add("abcd");
   list.add("abcdef");
   list.add("ab");

   Collections.sort(list, comp::compareByLength);

  assertTrue(list.get(0).equals("ab"));	
}

 
The expression comp::compareByLength is called a method reference. In this case we are creating a reference to an instance method, using the notation object::instanceMethodName.

If the instance method is defined by the same class of the receiver object, you can use the notation Class::instanceMethodName. For example, the String class defines an instance method to perform a case-insensitive string comparison. See the API doc:

https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#compareToIgnoreCase-java.lang.String-

You can create a reference to it as follows:

@Test
public void caseInsensitiveComparisonTest() {

   List<String> list = new ArrayList<String>();
   list.add("bobby");
   list.add("Andrew");
   list.add("john");

   Collections.sort(list, String::compareToIgnoreCase);

   assertTrue(list.get(0).equals("Andrew"));
}

 
Likewise, it is possible to create a reference to a static method with the notation Class::staticMethodName. Consider the following class:

public class AlternativeStringComparison {	
   public static int compareByLen(String str1, String str2) {
      return Integer.compare( str1.length(), str2.length());
   }
}

 
You can create a reference to its static method as follows:

@Test
public void staticLengthComparisonTest() {
		
   List<String> list = new ArrayList<String>();
   list.add("abcd");
   list.add("abcdef");
   list.add("ab");

   Collections.sort(list, AlternativeStringComparison::compareByLen);

   assertTrue(list.get(0).equals("ab"));	
}

 
Constructor methods can be also referenced using the notation Class::new. We will look at practical applications of constructor references in a later post when discussing Java 8 streams.

In summary, four types of method references have been introduced in Java 8:

Type Notation Notes
Reference to a bound instance method object::instanceMethodName  
Reference to an unbound instance method Class::instanceMethodName  
Reference to a static method Class::staticMethodName  
Reference to a constructor Class::new  

In the next post we will look at more Java 8 functional features. Feel free to leave any comments or suggestions. Thank you for reading.

Towards Functional Java: an Introduction to Lambda Expressions

A major feature of the Java 8 release is the language-level support for lambda expressions, which brings Java into the realm of functional-style programming. Other object-oriented programming languages such as Scala, JavaScript and Python had lambda expressions built in from the start, and now Java has joined the rank.

Why is this interesting? Functional programming (FP) is based on the recursive evaluation of pure functions, whose output depends solely on the input arguments to the function, avoiding side-effects (i.e. changing the state of objects). This immutability results in code that behaves more predictably and performs better in concurrent and parallel systems. This is the first of a series of posts that will explore functional programming in Java, and the starting point is to understand lambda expressions (also known as closures or first-class functions).

So, what is a lambda expression? Put simply, it is a concise syntax to create a reference to a method implementation for later execution. In Java, lambda expressions are used in conjunction with functional interfaces. A functional interface is a plain old interface that defines just one single abstract method, and is marked with the @FunctionalInterface annotation.

It works like this: you define a variable of a functional interface type, and assign a lambda expression to it. This variable can then be passed around just like any other reference, but instead of containing a reference to an object, it contains a reference to a method implementation.

Confusing? Do not worry; it will all become clear with a few practical examples. By the way, to follow the examples in this post, you need a development environment that supports Java 8. If you don’t have one, you can follow this tutorial to get it setup.

For our first example, we will use the Runnable functional interface, whose API documentation can be found here:

https://docs.oracle.com/javase/8/docs/api/java/lang/Runnable.html

The Runnable interface has been around for a long time, and if you wrote multi-threaded programs in Java you will be familiar with it; it is used to define tasks for later execution in a separate thread.

If you look at the source code of Runnable, you will see that it is annotated with @FunctionalInterface. It is not mandatory to use the annotation, in fact any interface that declares a single abstract method can be used as a functional interface; but it is good practice to annotate it so the compiler will check that it is a valid definition.

@FunctionalInterface
public interface Runnable {
    /**
     * When an object implementing interface <code>Runnable</code> is used
     * to create a thread, starting the thread causes the object's
     * <code>run</code> method to be called in that separately executing
     * thread.
     * <p>
     * The general contract of the method <code>run</code> is that it may
     * take any action whatsoever.
     *
     * @see     java.lang.Thread#run()
     */
    public abstract void run();
}

 
By the way, if you don’t know where to find the source code of the Java library classes, this post will show you:

http://robertovormittag.net/learn-from-the-experts-a-look-at-the-java-source-code/

Enough talk. Let’s write some code. Create a Java console application on your IDE, and place this code in main()

public static void main(String[] args) {
   Runnable r = () -> {
   System.out.println("Hello lambda!");
   };

   Thread t1 = new Thread(r);
   t1.start();
}    

 

Code analysis

First, we have declared a variable r of type Runnable, and assigned a lambda expression to it. This automagically becomes the implementation of the run() method declared by the interface.

Next, we have built a Thread object passing to the constructor our Runnable implementation, and started the thread. If you run the app, the thread will run and the string “Hello from a lambda expression!” will be printed on the console.

Syntax analysis

Let’s now take a closer look at the syntax of the lambda expression. All lambda expressions consist essentially of three parts:

  • a set of parameters (in parenthesis)
  • an arrow (or rocket) operator
  • a body (with the method implementation)

In this case, because the run() method takes no parameters, we have an empty parenthesis at the start of the expression.

Let’s look at another example, this time with parameters, and implement the java.util.Comparator functional interface, defined here:

https://docs.oracle.com/javase/8/docs/api/java/util/Comparator.html

This interface declares the abstract method

int compare(T o1, T o2)

which compares its two arguments for order, and returns a negative integer, zero, or a positive integer as the first argument is less than, equal to, or greater than the second.

Note from the API documentation that Comparator also declares other non-abstract, static and default methods but that is OK; it still is a valid functional interface, as it declares only a single abstract method.

We will now implement this interface so that it compares two String objects based on length. Here is the code:

public static void main(String[] args) {

   Comparator<String> comp = (String str1, String str2) -> {
         return Integer.compare( str1.length(), str2.length());
   };

   int result = comp.compare("ab", "abc");
   System.out.println(result);
}    

 
Here we declare the variable comp of type Comparator<String> and assign it to the lambda expression, starting with the two String parameters in parenthesis, followed by the rocket operator, followed by the function body which returns a value. We use the Integer class utility method compare() to compare the length of the String arguments.

https://docs.oracle.com/javase/8/docs/api/java/lang/Integer.html#compare-int-int-

The variable comp becomes a reference to our method implementation, and can be used in a way similar to the way object references are used. In the next line, we use comp to call our implementation passing two arguments “ab” and “abc”, capturing the return value in the variable result, which is printed to stdout.

If you run the application, you will see the value -1 printed, as expected, since the length of “ab” is less than that of “abc”.

An even more concise syntax

If the compiler can work out the type of the method parameters (in this case String), it is not even necessary to specify them (this feature is called type inference). And, if the method body is limited to a single expression, you can omit the angle brackets {} and the return keyword.

So our implementation can be written even more concisely like this:

public static void main(String[] args) {

   Comparator<String> comp = (str1, str2) -> 
      Integer.compare( str1.length(), str2.length());

   int result = comp.compare("ab", "abc");
   System.out.println(result);
}    

 
If you run the new version of the app, you will get the same result. How much concise you want to be is simply a matter of personal preference and programming style. Personally speaking, I value more much more readability and clarity in the code than conciseness, but the choice is there.

We can make a more interesting use of our Comparator implementation by applying it to the List sorting algorithm in the Collections utilities:

https://docs.oracle.com/javase/8/docs/api/java/util/Collections.html#sort-java.util.List-java.util.Comparator-

This utility sorts the specified list according to the order induced by the specified comparator. Try the following code:

public static void main(String[] args) {

   Comparator<String> comp = (String str1, String str2) -> {
      return Integer.compare( str1.length(), str2.length());
   };

   List<String> list = new ArrayList<String>();
   list.add("abcd");
   list.add("abcdef");
   list.add("ab");

   System.out.println(list);
   Collections.sort(list, comp);
   System.out.println(list);
}    

 
Here, we pass comp (the reference to our implementation of Comparator<String>) as the second parameter to the Collections.sort method.

If you run the app, you can check from the console output that, after the call to Collections.sort, the list has been sorted according to our Comparator implementation.

We covered a lot of ground, it is time to summarize. This is what we have learned so far:

  • What is a lambda expression
  • What is a functional interface
  • How lambda expressions and functional interfaces work together
  • The syntax of lambda expressions
  • A couple of practical examples using threads and collection algorithms

In the next post we will look at method references. Feel free to leave any comments / suggestions. Thank you for reading.

Learn from the Experts: A Look at the Java Source Code

The best way to learn and improve coding skills in any programming language is to look at code written by expert programmers. The Java Development Kit (JDK) comes with the source code of the complete Java Runtime Library; it is located in the src.zip archive under the JDK installation directory.

java-src

The location of the source file archive in the JDK 7 installation

Obviously your main reference when using the Java library classes should be always the API documentation, which can be found online or downloaded separately from the JDK downloads website. Here is the URL for the Java 8 API:

https://docs.oracle.com/javase/8/docs/api/

However, it can be very instructive to take a look at the source code of some of the key classes in the library, such as those in the java.lang and java.util packages.

Using Eclipse to view the Java Runtime Library Source Code

First, make sure you have the JDK selected as the default installed JRE. To do this select from the Eclipse menu: Window → Preferences. The following dialog pops up:

Setting up the JDK as the default JRE in Eclipse Mars

Setting up the JDK as the default JRE in Eclipse Mars

Select Java → Installed JREs

If the JDK is not there, click [Add…] and follow the prompts to select your JDK installation directory. Then make this JRE default by selecting the checkbox next to it. Now click [Edit…]. The following dialog pops up:

Setting up the runtime library source attachment

Setting up the runtime library source attachment

Expand the rt.jar library (this is the runtime binary) and click on [Source Attachment…]

The source attachment configuration dialog

The source attachment configuration dialog

Browse to the external location of your src.zip and click [OK]. Apply and close all parent windows.

Now to view the source file of a library class, e.g. LinkedList, either select from the menu:
Navigate → Open Type or use the shortcut [CTRL]+[SHIFT]+[T]

The Eclipse Open Type dialog

The Eclipse Open Type dialog

Type the name of the class, or select it from the list. The source code will open in the Editor.

the Eclipse editor showing the source code for LinkedList.java

The Eclipse editor showing the source code for LinkedList.java

Happy studying!!