CIS 260 - Data Structures I

Objectives

Overview

Data structures are one of the foundations upon which programming is built. One of the classic texts in the Computer Science field is Algorithms + Data Structures = Programs. Once the basics of computer programming have been mastered, attention can be turned to learning how to solve problems. Having a good data structure and a good algorithm can make the most difficult problems programmable in a straightforward manner.

Arrays, ArrayList, Vector, LinkedList

Arrays are merely numerous memory locations with the same name, but differentiated by an index. In many languages, including Java, all the elements of an array contain the same type of data. Some languages, such as JavaScript, allow different array elements to contain different types of data.

Arrays are usually unordered (unsorted), but may also be ordered (sorted) at additional overhead cost. An array may become ordered by sorting it, or the elements may be placed into the proper order as they are added to the array. The advantage of using arrays in Java is that they are very efficient for unordered storage.

Arrays have several disadvantages. You have to choose a size in advance of adding elements to the array. If you do not know the size of the array before you start adding elements to it, then you have to choose a size that you know will always be large enough. This is generally wasteful, means that you will have to keep count with a variable how many items you have placed in the array, and can no longer rely on the array's length property for cycling through the array. In additon, adding items to the array and deleting items from the array are inefficient operations since all following items have to be shifted by one position.

An ArrayList is a class in Java that tries to combine some of the efficiency of native arrays with the capability of growing dynamically. An ArrayList is backed up by an actual array. It holds object references, but the autoboxing and unboxing features of Java make it easy to act as if native datatypes such as int and float are being stored and retrieved. One potential problem is that ArrayLists are not thread safe. See TestArrayList.java for an example.

A Vector is similar to an ArrayList in that it can be used like a growable (or shrinkable) array. Although Vectors are efficient at adding or storing elements, a Vector is thread-safe. Making an object thread-safe does require significant overhead. This makes Vectors relatively slow compared to an ArrayList or LinkedList. See Coll1.java for an example.

A LinkedList in Java is very similar to a Vector, except that it is not thread-safe, and it has additional convenience methods which make it easier to implement stacks, queues, and deques. Some of these convenience methods were added in version 1.6 of the language. See TestLinkedList.java for an example.

Linked lists are a classic computer science data structure. They are often used to implement other classic data structures such as queues and deques. You must know some basics of how they are implemented. Check out the Wikipedia linked list article for details. You should know that linked lists consist of nodes which contain data and a pointer to the next node (and possibly a pointer to the previous node in a doubly-linked list). You should understand how inserting and deleting a node is done using manipulation of the pointers. You should also know what is meant by a doubly-linked list and a circular linked list.

Associative arrays

Many languages have a feature known as associative arrays. Sometimes they may be referred to as maps. The basic idea is similar to an array, except that the index does not have to be an integer. There are a number of classes which provide similar functionality in Java. We will take a look at two of them and metion a third.

The first variant of an associative array that we will discuss is the Hashtable class. Hashtable is thread-safe. A Hashtable does not allow null keys or values. See TestHashtable.java for an example of a Hashtable being used.

The second variant is the HashMap class. HashMap is similar to Hashtable, except that it is not thread-safe, which improves its performance. A HashMap allows one null key and any number of null values. See Coll2.java for an example of a HashMap being used.

The third variant is LinkedHashMap. It is not only similar to a HashMap - it actually is a HashMap. LinkedHashMap is a subclass of HashMap which gives you a predictable order for iterators, which is the insertion order by default. So, if you want a HashMap with a predictable iteration order, the LinkedHashMap is probably your best choice.

A demonstration program has been created that uses an array to store and retrieve user data. The user can interactively insert, delete, search, and more, but everything must be done through methods provided by the class containing the array. This program was designed to illustrate a number of Java classes and their methods. The program throws and catches its own exceptions. This program has a fairly important use of a string tokenizer to break-up user input into understandable commands. There is even a crude parser to interpret the commands and execute them. A Hashtable is used to store the commands and their associated actions. See TArray0.java.

A variation of the TArray0 program that uses an ordered array is available as TArray1.java.

Stacks

A stack is similar to an array with a few major exceptions. The major difference is that only the top element is available. Think of it as the stack of dishes at the buffet. Dishes are added to the top and the top dish is the one that is removed. This is known as LIFO (last in first out). Common stack operations include:

Stacks are used heavily in programming. Memory for functions/methods is often placed on a stack in the computer. When the function/method is finished, the memory is used is popped off the stack. The computer keeps track of what code was called by placing references on a stack. This is why you see references to the call stack, why references discuss variables being placed on the stack, and why you might want to print the stack trace when you encounter an exception. An example of an interactive stack is available as TStack0.java.

See the Wikipedia article on stacks for additional details and illustrations.

Queues and deques

Queues are similar to stacks except that the first item in is also the first item out (FIFO). It is similar to the cup dispensers at soft drink machines where the cups are added at the top and removed from the bottom. Although we could still use the terms push and pop, they are so closely associated with stacks that we will instead use the terms add and remove. An example of an interactive queue is available as TQueue0.java.

The next queue demonstration program illustrates how a circular queue operates. Circular queues are often allocated a small contiguous spot in memory. Intead of moving all the elements when something is inserted or deleted, a pointer to the front item and rear item are manipulated. The pointers automatically wrap around when they go past the upper boundary.

Queues are used extensively in programming. Simple circular queues have been used for the longest time to handle I/O tasks, such as buffering keyboard input. A modification to the queue is the priority queue. In a priority queue, the items are automatically sorted. Priority queues are used for such tasks as scheduling processes and threads. See the Wikipedia article on queues for additional details and illustrations.

Deques are another modification of queues in which elements can be added or removed from either end. You should also keep in mind that Java's LinkedList class has convenience methods which make it very easy to implement your own custom stacks, queues, and deques. The interactive demonstration programs presented on this page use less convenient classes.

Demonstration generic stacks, queues, and linked lists

Note: There are two linked list implementations that use generics. They offer slightly different features.