Understanding Generics with Collections

In Java (prior to 5.0), you often had to downcast an object to a more specific type. For example, after adding a String to a List, you had to cast the element back to String when retrieving it.

List myList = new ArrayList();
myList.add("Tiger");
String str = (String) myList.get(0);

The downcast is unavoidable. Moreover, objects of any type can be added to the list, and the developer is responsible for remembering the type of each object and performing the appropriate downcast when retrieving it. This undermines type safety, because every downcast in the code is a potential ClassCastException. Generics were introduced to rescue us from these situations: they let you declare that a collection contains elements of one particular type only, say a List of Strings. The syntax for a List of Strings looks like this. (Note: ‘type’ refers to any class or interface definition in Java, e.g. String, Integer, Collection, MyClass etc. are all types.)

List<String> myList = new ArrayList<String>();
myList.add("Tiger");
String str = myList.get(0);

The syntax is fairly simple: you specify the element ‘type’ in angle brackets following the Collection type. Such types are known as generic types or parameterized types. We’ll get to know more about defining collection types and substitution in the following sections.

With the above syntax, adding an object of any type other than String to the List, or retrieving an element as an unrelated type, results in a compile-time error. This is much better than a ClassCastException at runtime, and it saves a lot of development time. More importantly, the downcast is no longer needed, because the element type is explicitly communicated to the compiler through the generic syntax.
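
To make the contrast concrete, here is a minimal sketch that puts the two styles side by side. The class and method names (RawVersusGeneric, rawListFailsAtRuntime, genericListNeedsNoCast) are made up for illustration and are not part of any library.

```java
import java.util.ArrayList;
import java.util.List;

public class RawVersusGeneric {

    // Pre-Java 5 style: the raw List accepts anything, so a wrong
    // downcast is only discovered at runtime.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rawListFailsAtRuntime() {
        List rawList = new ArrayList();
        rawList.add("Tiger");
        rawList.add(Integer.valueOf(42)); // compiles, nothing stops us
        try {
            String s = (String) rawList.get(1); // downcast of an Integer
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Generic style: the compiler knows the element type, so no cast is needed.
    static String genericListNeedsNoCast() {
        List<String> names = new ArrayList<String>();
        names.add("Tiger");
        // names.add(Integer.valueOf(42)); // would be rejected by the compiler
        return names.get(0);
    }

    public static void main(String[] args) {
        System.out.println("raw list failed at runtime: " + rawListFailsAtRuntime());
        System.out.println("generic list returned: " + genericListNeedsNoCast());
    }
}
```

Uncommenting the second add() call in genericListNeedsNoCast should turn the mistake into a compilation error instead of a runtime failure.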

Purpose of Generics
Generics make your program well formed: the compiler can perform type checks based on the static type information you provide, which avoids unexpected type errors at runtime. Let us get into more practical matters.

Collections and Substitution rules
Collections are the primary motivation for Generics in Java.

Let us take a look at some substitution rules. The following are some legal assignments:

List<Integer> li = new ArrayList<Integer>();
Collection<Integer> ci = new LinkedList<Integer>();
Collection<Integer> cs = new HashSet<Integer>();
List lst = new ArrayList<Number>();
List<Integer> li2 = new ArrayList();//warning

Here is the substitution principle for collections. (Rule 1) The RHS of the assignment should contain a Collection implementation compatible with the LHS type and parameterized with the same type argument as the LHS. The last two assignments are also valid; they are allowed so that non-generic (pre-Java 5) code remains compatible with generic code and vice versa. However, if you compile with the -Xlint:unchecked option, the last assignment produces an unchecked conversion warning. (Rule 2) Do not ignore such warnings, as they indicate that your code is unsafe (it could break at runtime with a ClassCastException).
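
Why Rule 2 matters can be sketched as follows. The example assumes a raw list that already holds a String before it is assigned to a List<Integer> reference; the class and method names are hypothetical. The warning is suppressed with an annotation here only so the sketch compiles quietly.

```java
import java.util.ArrayList;
import java.util.List;

public class UncheckedWarningDemo {

    // Without the annotation, compiling with -Xlint:unchecked reports
    // an unchecked conversion on the assignment to 'numbers'.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static String readPollutedList() {
        List raw = new ArrayList();
        raw.add("not a number");      // legal on a raw list
        List<Integer> numbers = raw;  // unchecked conversion
        try {
            Integer first = numbers.get(0); // compiler-inserted cast fails here
            return "no error: " + first;
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(readPollutedList());
    }
}
```

The failure shows up where the element is read, far from the assignment that caused it, which is why the warning should be treated as the real signal.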

List<String> ls = new ArrayList<String>();//1
Iterator its = ls.listIterator();//4
while(its.hasNext()) {//5
    String s = its.next();//6
    System.out.println("Element: "+s);
}

Does this compile? No. The raw Iterator carries no element type, so its next() method returns Object, and the assignment to the String ‘s’ (line 6) fails with a compilation error. Basically, in line 4 we lost the type information while obtaining the iterator, so an explicit cast would be needed here. To avoid the cast, parameterize the Iterator with String.

Iterator<String> its = ls.listIterator();//4  

(Rule 3) When you get an iterator, keySet, entrySet or values from a collection or map, assign the result to an appropriately parameterized type as shown above. These methods were changed in Java 5 to return the corresponding generic types so that no casts are needed. Most Java 5 aware IDEs can do this for you automatically, so rely on them.
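
A short sketch of Rule 3 applied to a Map follows. The map contents and the names ParameterizedViewsDemo, sample, describe and total are assumptions made up for this example.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class ParameterizedViewsDemo {

    // Hypothetical sample data; LinkedHashMap keeps insertion order.
    static Map<String, Integer> sample() {
        Map<String, Integer> stock = new LinkedHashMap<String, Integer>();
        stock.put("Tiger", 2);
        stock.put("Mustang", 5);
        return stock;
    }

    // entrySet() and its iterator are assigned to parameterized types,
    // so keys and values come back without any cast.
    static String describe(Map<String, Integer> stock) {
        StringBuilder sb = new StringBuilder();
        Iterator<Map.Entry<String, Integer>> it = stock.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Integer> entry = it.next();
            String name = entry.getKey();
            int count = entry.getValue();
            sb.append(name).append('=').append(count).append(' ');
        }
        return sb.toString().trim();
    }

    // values() returns Collection<Integer>, so the loop variable is an Integer.
    static int total(Map<String, Integer> stock) {
        int sum = 0;
        for (Integer count : stock.values()) {
            sum += count;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Integer> stock = sample();
        Set<String> names = stock.keySet(); // Set<String>, not a raw Set
        System.out.println(names);
        System.out.println(describe(stock));
        System.out.println(total(stock));
    }
}
```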

The following assignments are invalid:

Set<String> ss = new HashSet<Integer>();//Incompatible Types 
List<Object> lo = new ArrayList<String>();//compile-time error  

Though String is a subtype of Object, the second assignment is not allowed. A Collection of Objects is a bigger set that may comprise elements of various types (Strings, Integers, Cats, Dogs etc.), whereas a Collection of Strings strictly contains Strings, so the two cannot be equated (Rule 4). In programming terms, if this were allowed, we could add objects of any type to a List of Strings through the List<Object> reference, defeating the purpose of generics.
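
One way to see what Rule 4 protects against is to compare with arrays, which do allow the equivalent assignment and pay for it at runtime. The sketch below is only an illustration; the class and method names are invented for it.

```java
import java.util.ArrayList;
import java.util.List;

public class InvarianceDemo {

    // Arrays permit the "subtype" assignment, and the mistake
    // surfaces only at runtime as an ArrayStoreException.
    static boolean arrayStoreFails() {
        Object[] objects = new String[1]; // legal for arrays
        try {
            objects[0] = Integer.valueOf(1); // not a String
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    // Generic collections reject the assignment at compile time.
    // Copying into a separate List<Object> is the safe alternative.
    static int copyLeavesOriginalIntact() {
        List<String> strings = new ArrayList<String>();
        strings.add("Tiger");
        // List<Object> objects = strings; // would not compile (Rule 4)
        List<Object> objects = new ArrayList<Object>(strings);
        objects.add(Integer.valueOf(1)); // the copy may hold anything
        return strings.size();           // the original still holds one String
    }

    public static void main(String[] args) {
        System.out.println(arrayStoreFails());
        System.out.println(copyLeavesOriginalIntact());
    }
}
```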

Well, with the above restriction, how would you implement a method that accepts a collection of any type, iterates over it and prints the elements? For such purposes, generic types support wildcards, which represent collections of an unknown element type.

We know that Object[] is the supertype of all arrays of reference types. Similarly, Collection<?>, pronounced “Collection of unknown”, is the supertype of all generic collections. (Note: a Collection<?> reference can refer to a List<?>, ArrayList<?>, HashSet<?> etc. A wildcard type is only a reference type and cannot be instantiated, i.e. new ArrayList<?>() or new HashSet<?>() is not allowed.) (Rule 5) Collections parameterized with wildcards cannot be instantiated.

Using Collection<?> we can implement the iterate and print method as shown below.

public void printElements(Collection<?> c) {
    for(Object o : c) {
        System.out.println(o);
    }
}
List<String> ls = new ArrayList<String>();
List<Cat> lc = new ArrayList<Cat>();
printElements(ls); printElements(lc); //both calls compile
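
For readers who want to run this, here is a self-contained variant. It assumes a trivial Cat class (the article does not define one) and builds a String instead of printing directly, purely so the result is easy to inspect. The name formatElements is made up.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class PrintElementsDemo {

    // Hypothetical element type, present only so the example compiles.
    static class Cat {
        private final String name;

        Cat(String name) {
            this.name = name;
        }

        @Override
        public String toString() {
            return "Cat(" + name + ")";
        }
    }

    // Accepts a collection of any element type; elements are read as Object.
    static String formatElements(Collection<?> c) {
        StringBuilder sb = new StringBuilder();
        for (Object o : c) {
            sb.append("Element: ").append(o).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> ls = new ArrayList<String>();
        ls.add("Tiger");
        List<Cat> lc = new ArrayList<Cat>();
        lc.add(new Cat("Tom"));

        System.out.print(formatElements(ls));
        System.out.print(formatElements(lc));
    }
}
```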

Is Collection<?> the same as plain old Collection? No, there are a lot of differences between the plain old Collection, Collection<?> and Collection<Object>.

The following are the differences between them:

  • Collection<?> is a homogeneous collection that represents a family of generic instantiations of Collection (i.e. Collection<String>, Collection<Integer> etc.)
  • Collection<Object> is a heterogeneous collection or a mixed bag that contains elements of all types, close to the plain old Collection but not the same
  • Collection<?> ensures that you don’t add arbitrary objects, as we do not know the element type of the collection (Rule 6)
  • Collection<?> cannot be treated as a read-only collection, as it allows remove() and clear() operations
  • You can assign Collection<String> or Collection<Number> to a Collection<?> reference type, but not to a Collection<Object> (Refer to Rule 4)

List<String> list = new ArrayList<String>();
list.add("Tiger");
list.add("Mustang");
Collection<?> c = list;
Object first = c.iterator().next(); //returns "Tiger" as an Object (Collection has no get())
c.contains("Tiger"); //returns true
Iterator<?> itr = c.iterator();
while(itr.hasNext()) {
    Object o = itr.next(); //elements can only be read as Object
}
c.remove("Mustang"); //removes "Mustang" from the List
c.add("Dolphin"); //compile-time error (as per Rule 6)
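
The operations that Collection<?> does and does not permit can be exercised in a small runnable sketch. The class and method names are invented, and the element values simply reuse the ones from the snippet above.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class UnknownCollectionDemo {

    static String exercise() {
        List<String> list = new ArrayList<String>();
        list.add("Tiger");
        list.add("Mustang");

        Collection<?> c = list;
        boolean hasTiger = c.contains("Tiger"); // contains() takes Object
        boolean removed = c.remove("Mustang");  // remove() takes Object too
        // c.add("Dolphin");                    // would not compile (Rule 6)

        int sizeAfterRemove = list.size();
        c.clear();                              // also allowed, so not read-only
        return hasTiger + " " + removed + " " + sizeAfterRemove + " " + list.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(exercise());
    }
}
```

Since c and list refer to the same object, the removal and the clear() are visible through the original List<String> reference as well.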

Collection<?> appears very restrictive because the element type is unknown. When you obtain elements from such a collection you have to work with them as Objects, and you sometimes end up with an explicit cast. So use Collection<?> only when you need no type-specific operations (Rule 7). In practice such use cases are few; more often you need to perform operations declared on a base type without caring about the concrete element type. In such cases you can benefit from ‘bounded wildcards’.

Bounded wildcards
List<? extends Number> is an example of a bounded wildcard. It represents a homogeneous List whose elements are of some unknown type that is Number or a subtype of Number.

public void addInteger(List<? extends Number> lnum) {
    Number num = lnum.get(0);
    byte b = num.byteValue();
    lnum.add(new Integer(10));//not allowed, compile-time error
}

So, we can read elements from the collection, treating each one as a Number. But we are not allowed to add anything (other than null) to the collection, as we do not know which subtype of Number the collection contains.
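
A typical use of the upper bound is a method that only reads from the list. The sketch below assumes a hypothetical sum helper; it is not taken from the article or from the JDK.

```java
import java.util.Arrays;
import java.util.List;

public class UpperBoundDemo {

    // Works for List<Integer>, List<Double>, List<Number> and so on,
    // because every element can be read as a Number.
    static double sum(List<? extends Number> numbers) {
        double total = 0;
        for (Number n : numbers) {
            total += n.doubleValue();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> ints = Arrays.asList(1, 2, 3);
        List<Double> doubles = Arrays.asList(1.5, 2.5);
        System.out.println(sum(ints));
        System.out.println(sum(doubles));
    }
}
```

Had the parameter been declared as List<Number>, neither call in main would compile (Rule 4).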

Differences between List<Number> and List<? extends Number>:

  • List<Number> is a heterogeneous collection of Number objects (i.e. it can contain instances of Integer, Float, Long, etc.)
  • List<? extends Number> represents a homogeneous collection of Number or one of its subtypes. A reference of this type can point to a List<Number>, List<Integer>, List<Float> etc.
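
The two bullets above can be checked with a few lines of code. This is a minimal sketch with made-up names (NumberListsDemo, mixedSize, firstAsInt).

```java
import java.util.ArrayList;
import java.util.List;

public class NumberListsDemo {

    // List<Number> is a mixed bag: any Number subtype may be added.
    static int mixedSize() {
        List<Number> mixed = new ArrayList<Number>();
        mixed.add(Integer.valueOf(1));
        mixed.add(Float.valueOf(2.5f));
        mixed.add(Long.valueOf(3L));
        return mixed.size();
    }

    // List<? extends Number> is a view onto a list of one specific subtype.
    static int firstAsInt() {
        List<Integer> ints = new ArrayList<Integer>();
        ints.add(7);
        List<? extends Number> view = ints; // allowed
        // List<Number> wrong = ints;       // would not compile (Rule 4)
        // view.add(Integer.valueOf(8));    // would not compile either
        return view.get(0).intValue();
    }

    public static void main(String[] args) {
        System.out.println(mixedSize());
        System.out.println(firstAsInt());
    }
}
```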

“? extends Type” is known as the upper bound, and we also have “? super Type”, the lower bound, where the unknown type denotes a supertype of the specified Type. It appears less often in everyday collection code, but it is useful for methods that only add elements to a collection and can come in handy when we define our own generic types.
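
As a rough illustration of the lower bound, the sketch below uses a hypothetical addIntegers method that writes into any list able to hold an Integer. The names are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class LowerBoundDemo {

    // Accepts List<Integer>, List<Number> or List<Object>:
    // each of them can safely store an Integer.
    static void addIntegers(List<? super Integer> target, int count) {
        for (int i = 1; i <= count; i++) {
            target.add(i); // autoboxed to Integer
        }
        // Reading is the weak side here: elements come back only as Object.
    }

    public static void main(String[] args) {
        List<Number> numbers = new ArrayList<Number>();
        addIntegers(numbers, 3);
        System.out.println(numbers);

        List<Object> objects = new ArrayList<Object>();
        addIntegers(objects, 2);
        System.out.println(objects);
    }
}
```

This is the mirror image of the upper bound: “? extends” suits parameters you read from, “? super” suits parameters you write to.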

We’ll see more about defining generic types, generic methods and type erasure semantics in my next posts.


About Deepak Anupalli
Deepak Anupalli is a lead developer and performance expert at Pramati Server Engineering Group.

7 Responses to Understanding Generics with Collections

  1. Deepthi says:

    Nice Article… waiting for ur next posts 🙂

  2. Tommy says:

    This is absolutely superb

  3. San says:

    good work! I will also wait for your next article and possibly ask you to write on a particular thing if you dont mind. I appreciate your work. one more thing how do i keep track of your articles, somebody reply to this in your comments. thanks anyway.

  4. Rajendra says:

    Good one.. Nice explaination abt the wildcards in Collections. Thanks allot

  5. Ori K Muthuswami says:

    Very nice presentation for the easy understanding of Generics.

  6. Deepika says:

    hey nice article…

  7. Kiran says:

    Very Good Article.
