Java

Typing

One of the most fiercely debated topics in programming languages is the benefit of strong, static typing strategies. Java's strategy opts for as much compile-time checking as possible. Let's take a quick overview of programming language design, in layman's terms. Then, you can put Java into context. When building a language, a designer needs to answer two typing questions relatively early in the design process.

Strong Versus Weak Typing

Strong versus weak typing decides how a type is enforced, or interpreted. In a weakly typed language (like C), variables can be coerced easily, or interpreted as something else. A strongly typed language strictly enforces compatible types across operations. It probably doesn't surprise you that Java is a strongly typed language.

Ruby, Smalltalk, and Python also enforce strong typing, which might surprise you. Many developers believe Smalltalk, Python, and Ruby are so productive because they are weakly typed. They are misinformed. Consider this brief Ruby example:

    irb(main):003:0> i=1
    => 1
    irb(main):004:0> puts "Value of i:" + i
    TypeError: cannot convert Fixnum into String
            from (irb):4:in '+'
            from (irb):4

In the first line, the undeclared variable i takes on the value of 1. At this time, Ruby decides that i is a Fixnum. When Ruby interprets the third line, it sees the + operator after the string, and tries to concatenate i. Of course, Ruby doesn't know how to concatenate an integer to a string, so it throws an error. That's clearly an example of strong typing. (Actually, I've oversimplified things a little. You can dynamically change the definition of Ruby classes and objects at runtime, and this weakens the typing somewhat. Still, on a continuum from strong to weak typing, Ruby would lean slightly to the strong side.)

In a similar situation, a language with weaker typing may instead coerce types to a compatible form, as in C. Consider this example:

    int a = 5;
    float b = a;

In the second line, C coerces the value of the integer to float. Other examples are even worse. In C++, the ( ) cast operator does not yield type safety, so you could say, for example:

    Cat *cat;
    Dog *dog = (Dog *)cat;

These are legal C++ statements. Instead of reporting an error, C++ will happily go on stomping through memory. Languages with very weak typing simply do not capture typing errors, so the behavior of certain operations is undefined. Weaker typing is sometimes convenient, but it is also less predictable. As you've seen, typing is not always black and white, and it's a highly contentious issue among language experts. Strong versus weak typing lies on a continuum. Some strongly typed languages, like Java, allow loopholes by letting the user cast objects to another type. Languages with the strongest possible typing allow no loopholes. Weaker typing allows, and may even require, coercions. The weakest possible typing, as in Assembly language, does no type checking at all, either at compile time or at runtime.
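Java's cast loophole is easy to demonstrate. In this minimal sketch (CastLoophole and castFailsAtRuntime are illustrative names, not part of any library), the compiler accepts a cast from Object to Integer, and the mistake surfaces only at runtime as a ClassCastException:

```java
class CastLoophole {
    // Returns true when the cast below fails at runtime.
    // The cast always compiles, because an Object reference may legally be cast to Integer.
    static boolean castFailsAtRuntime(Object value) {
        try {
            Integer number = (Integer) value;   // checked by the JVM, not the compiler
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(castFailsAtRuntime(Integer.valueOf(7)));  // prints false
        System.out.println(castFailsAtRuntime("elephant"));          // prints true
    }
}
```

Unlike the C++ example, the bad cast is still caught, just later than you might like: the typing stays strong, but the check has moved from compile time to runtime.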

Static Versus Dynamic Typing

The more interesting question by far is when typing is enforced. Static typing binds a type not only to an object but also to language constructs like variables and parameters, so types can be checked at compile time. Dynamic typing binds a type to an object at runtime. Dynamic typing doesn't say anything about the variable that contains an object, or anything that the object passes through. The type is bound to the object. Therefore, the type of thing that a container holds can change. An imperfect rule of thumb is that static languages force you to declare variables, but dynamic languages don't.
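Even in Java, you can glimpse the difference between the type of a container and the type of the object inside it. The following is only a sketch; the class name StaticVsRuntimeType and the helper describe are made up for illustration. A variable declared as Object keeps that static type for its whole life, while the objects it holds carry their own types at runtime:

```java
class StaticVsRuntimeType {
    // Reports the runtime type bound to the object, not the declared type of the parameter.
    static String describe(Object value) {
        return value.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        Object box = "elephant";             // static type: Object; runtime type: String
        System.out.println(describe(box));   // prints String

        box = Integer.valueOf(42);           // same variable, different runtime type
        System.out.println(describe(box));   // prints Integer

        // box.length();  // would not compile: the static type Object has no length()
    }
}
```

A dynamically typed language behaves roughly as if every variable were declared this way, except that the commented-out call would be attempted at runtime instead of being rejected by the compiler.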

Ironically, most dynamic languages also tend to be strongly typed. Most weakly typed languages tend to be static. Said another way, strong typing can be dynamic or static, but weak typing is usually also static. You don't find many weakly and dynamically typed languages, beyond Assembly language. Figure 4-1 places programming languages on two axes. Java has strong, static typing. You know this, because you get type mismatch errors when you make certain kinds of mistakes. Compiling this:

    class TypeTest {
      public static void main(String args[]) {
        i = 4;                // Nope!!! Static typing
        int j;
        j = 4.2;              // Nyet!!! Strong typing
      }
    }

...gives you this result:

    TypeTest.java:3: cannot resolve symbol
    symbol  : variable i
    location: class TypeTest
        i = 4;

    TypeTest.java:5: possible loss of precision
    found   : double
    required: int
        j = 4.2;

Figure 4-1. Java is a strongly and statically typed language

Sometimes, that's good. After all, a bug that gets caught at compile time costs much less time than a bug that gets found much later. In general, though, the dynamic programmers that I interviewed said static typing simply mauls productivity.

Syntax

You can see immediately that Java's syntax forces you to do more work. You have to declare and type all of your variables and parameters. You also need to cast objects that are compatible but different, and convert objects that aren't. The extra syntax provides value: the compiler has more information to catch bugs earlier. There's a cost, too. Not only does static typing make you work harder to enter the equivalent of code written in a dynamically typed language, but you also have more lines of code to understand, maintain, or enhance. It's very difficult to prove or disprove the notion that static typing makes you more or less productive in terms of hours at the keyboard, but you can show that static typing leads to more characters, and more code to read and maintain.

Raw code count is not definitive; if it were, Perl, with all of the two- or three-character shortcuts, would be the most productive language of all time. Still, it's suggestive. Java's syntax wouldn't be such a problem if you could limit the extra code to a few lines of code at the top or bottom of a program, but you can't. You need to declare types for parameters. You need a cast every time you take something out of a collection. This syntax only gets more invasive with generics.

Thought Process

Some of the costs related to typing are hidden costs. I believe that one such cost is related to high-level, conceptual work versus finishing work. It's usually preferable to do conceptual work first and finishing work later, because much of your code will be thrown away, especially at early stages. As your program takes shape, you can do more and more detailed work. You make the expected case work, and then you work through noncritical path issues.

Java forces you to work the opposite way: to make things compile, you must deal comprehensively with typing, which predominantly involves finishing issues. Also, many of the compiler errors in Java might not even be problems at all in a dynamically typed language.

Code/Compile Cycle

Dynamic typing comes into play especially when you need to experiment. Remember, you must declare variables in statically typed languages. In Java, that means you need to start each application with a class definition, and it snowballs. You can't just jump in and evaluate a single line of code; the compiler just doesn't have enough information. Instead of simply evaluating statements, you need to blow out a class, type everything, compile, and execute. In Smalltalk, Lisp, Basic, and Ruby, you can just start typing. For simple experimentation on an initial cut at a Fibonacci sequence, here's the Java version:

    class Fib {
      public static void main (String args[  ]) {
        int x1 = 0;
        int x2 = 1;
        int total = 1;
        for (int i=0; i<10; i++) {
          System.out.println(total);
          total = x1+x2;
          x1 = x2;
          x2 = total;
        }
      }
    }

It's 13 lines; 41 words; 226 characters. Keep in mind that Java forces you to declare the class to explore, and that's what we're measuring here. On the command line, you need to save, compile, and run. The Ruby counterpart looks like this:

    x1 = 0
    x2 = 1
    100.times do
      puts x2
      x1, x2 = x2, x1+x2
    end

It's 6 lines; 16 words; 57 characters. Notice how the code just flows better. Read it in English. But the biggest impact is on experimentation. You just type and go. You can cut and paste right on the console screen. You'll use command retrieval to repeat the lines that you need. And these advantages come into play in IDEs as well. Further, if you need 100 iterations, the Java version breaks because an int is not big enough. Ruby still works fine.
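To see where the int version gives out, here is a rough sketch rather than a recommended implementation. FibOverflow and its method names are hypothetical; the sketch relies on Math.addExact (Java 8 or later) to detect overflow and on java.math.BigInteger to carry the terms past the range of an int:

```java
import java.math.BigInteger;

class FibOverflow {
    // 1-based position of the first Fibonacci term (1, 1, 2, 3, ...) that does not fit in an int.
    static int firstTermThatOverflowsInt() {
        int x1 = 0;
        int x2 = 1;                                  // x2 holds term number `term`
        for (int term = 1; ; term++) {
            try {
                int next = Math.addExact(x1, x2);    // throws ArithmeticException on overflow
                x1 = x2;
                x2 = next;
            } catch (ArithmeticException overflow) {
                return term + 1;                     // the term we failed to compute
            }
        }
    }

    // The nth term of the same sequence, computed without overflow.
    static BigInteger fib(int n) {
        BigInteger x1 = BigInteger.ZERO;
        BigInteger x2 = BigInteger.ONE;
        for (int i = 1; i < n; i++) {
            BigInteger next = x1.add(x2);
            x1 = x2;
            x2 = next;
        }
        return x2;
    }

    public static void main(String[] args) {
        System.out.println(firstTermThatOverflowsInt());  // prints 47
        System.out.println(fib(100));                     // prints 354224848179261915075
    }
}
```

Under these assumptions, the 47th term is the first one that no longer fits in an int, so a 100-iteration Java loop needs something like BigInteger, with method calls in place of the + operator.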

This is a trivial example, and probably not completely fair. After all, the Java version packages up a full class and the Ruby version doesn't need to. But you'll find that as we go on, the examples get more and more compelling, especially for the dynamic, reflective style of programming that leading Java developers seek today.

As you add the Web and other deployment steps, the case for dynamic languages gets more compelling, because you can make changes and immediately see the results, instead of having to compile and deploy, and maybe even bounce your servlet engine. Web-based programming gets very easy. Just make a change, and hit Reload.

From my small forays into Basic (where I made my spending money in high school by writing games) and Smalltalk (where I did marketing demos), I miss the rapid feedback cycle afforded by dynamic typing and an interpreter the most.

Adaptability

If you've been coding in Java for most of your career, you probably don't know that you have to jump through so many hoops just to support static typing, but you do. One of the greatest typing costs comes into play when you refactor. Think of the impact of a simple type change for a member variable. You've got to change the property, the getters and setters, every type cast, and every parameter usage. Dynamically typed languages delay the binding of a type to a variable or a parameter, so you often don't need to make any change at all to support a simple type change. For Smalltalk, for example, you can change a type easily. If the new type supports all of the messages of the old type, you will likely limit the changes to one place in your code.

Generics

The Java architects have traditionally gone to great lengths to ensure type safety, but there's been one particular case that's troubled them. When you take an object from a collection, you need to cast the object:

    ArrayList animals = new ArrayList();
    animals.add("elephant");
    String cat = (String) animals.get(0);

The compiler has just lost the ability to provide compile-time type safety. You could call the array element anything you want. To fix this, Java introduces an ugly implementation of a feature called generics. Here's what the usage looks like:

    ArrayList<String> animals = new ArrayList<String>();
    animals.add("elephant");
    String elephant = animals.get(0);

Comparing the preceding code with its nongeneric equivalent, you may think that you avoided casting, but you really did not. Java implements generics through a technique called type erasure. Under the hood, in the modified version, the ArrayList still maintains a collection of Objects and not a collection of Strings. Of course, any library that you need to strongly type with user-defined types must enable the code for generics. Enabling generics gets a little ugly. Here's the List declaration from within the Java collections package:

    public interface List<E> {
      void add(E x);
      Iterator<E> iterator();
    }

    public interface Iterator<E> {
      E next();
      boolean hasNext();
    }
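One way to watch type erasure at work is the following sketch. The class name ErasureDemo and its two helper methods are hypothetical, written only for illustration. The first helper shows that a list of Strings and a list of Integers share a single runtime class; the second shows that a raw reference can slip an Integer into a list of Strings, with the failure showing up only when an element is read back:

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // True if both parameterized lists have the same class at runtime.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        return strings.getClass() == numbers.getClass();
    }

    // True if an Integer can be added through a raw reference and the error
    // appears only later, at the cast the compiler inserts around get().
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean rawTypeBypassesCheck() {
        List<String> animals = new ArrayList<String>();
        List raw = animals;               // raw view of the same list
        raw.add(Integer.valueOf(42));     // compiles, with only an unchecked warning
        try {
            String first = animals.get(0);   // hidden cast to String fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());      // prints true
        System.out.println(rawTypeBypassesCheck());  // prints true
    }
}
```

The cast you thought you avoided is still there; the compiler simply writes it for you.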

If you're not a fan of statically typed languages, you don't like the extra type checks that place yet another burden on you. Even if you like the idea of generics, you probably don't like the implementation. Generics offer only syntactic sugar and not real runtime protection, because the JVM has no concept of generics. In an article series on agiledeveloper.com, Venkat Subramaniam lays out the problems in gory detail.

Ted Neward: Generics

Author of Effective Enterprise Java

Ted Neward is an independent consultant specializing in high-scale enterprise systems. He is an author, teacher, and consultant, focusing on Java/.NET interoperability. He has written several widely recognized books in both the Java and .NET space, including the recently released Effective Enterprise Java (Addison Wesley). He lives in the Seattle area with his wife, two sons, two cats, and eight PCs.

What's wrong with Java, in general?

TN: Hordes of developers are writing code that doesn't fit well with the tools and technologies they're using to build applications, pronouncing the tools and technologies "ugly and unusable" and going off to reinvent the wheel.

What's wrong with Java 1.5?

TN: Java 1.5 demonstrates a general attitude against progress, and Sun adamantly refuses to advance the JVM whatsoever, preferring instead to maintain the fiction that the Java language and the JVM are one tightly coupled entity.

Do you like the implementation of generics?

TN: No. The fact that they're implemented at a language level, rather than at the JVM level, means that under the hood, it's all still just Object references, so:

  • Other languages have no concept of generics.

  • We get no performance boost from generics.

  • We have to have some sneaky backward compatibility that still permits use as Object references (which you might argue would be necessary anyway, and I'll suggest that the Object-reference versions should be deprecated in 1.5 and removed in 1.6).

Overloading

In some ways, Java's typing problems are exacerbated by another limitation described as a feature: method overloading. Taken alone, overloading is not a huge problem, but Java developers use overloading to enable an API that supports multiple types, and that's a surefire recipe for API bloat.

Need an example? Take the java.util.Arrays class. Please. For convenience, you get more than 70 methods. Peel back the onion, and you see they cover only 10 or so pieces of actual, distinct functionality. With a smarter method declaration, you'd be able to specify parameters with keywords, and default unused parameters to an intelligent value, like 0 or null.
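You can get a rough feel for the ratio by asking the class itself. This sketch assumes the class in question is java.util.Arrays and uses reflection to compare the number of public methods with the number of distinct method names. OverloadCount and its helpers are invented for illustration, and the counts you see will vary with your JDK version:

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class OverloadCount {
    // Number of public methods declared directly on java.util.Arrays.
    static int publicMethodCount() {
        int count = 0;
        for (Method method : Arrays.class.getDeclaredMethods()) {
            if (Modifier.isPublic(method.getModifiers())) {
                count++;
            }
        }
        return count;
    }

    // Distinct names among those public methods; each overload set collapses to one name.
    static Set<String> distinctNames() {
        Set<String> names = new TreeSet<String>();
        for (Method method : Arrays.class.getDeclaredMethods()) {
            if (Modifier.isPublic(method.getModifiers())) {
                names.add(method.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("public methods: " + publicMethodCount());
        System.out.println("distinct names: " + distinctNames().size());
        System.out.println(distinctNames());
    }
}
```

Most of the gap between the two numbers comes from methods such as sort, fill, and binarySearch, which are repeated once for each primitive array type.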

Other Costs

When you decide to type everything, it's a slippery slope. When you need to pull back from Java's typing system, you can't always do so. You're starting to see many examples of Java libraries working around the typing in unusual ways. Study the JMX interface for an excellent example. Does it use strong typing? It appears that way, at first. Then you dig in a little and find what only can be conceptually described as an embedded type system: a mini-language, embedded in a String parameter called ObjectID, with a complete language description in the JavaDoc and syntax completely opaque to compilers and interface generators and processors. Java's type system failed here. JMX architects bypassed the type system, building metadata into strings and other objects. If you look around, you'll find other examples of this as well. Most often, Java hides weaker types, or dynamic types, as strings.
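Here is a sketch of what a string-embedded mini-language looks like in practice. It assumes that javax.management.ObjectName, which parses names of the form domain:key=value, is a fair stand-in for the parameter described above; the class StringTypedName, the helper isValidName, and the sample name com.example:type=Cache,name=users are all made up for illustration. The compiler sees nothing but a String, so a malformed name is discovered only at runtime:

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

class StringTypedName {
    // True if the text parses as a JMX object name. The compiler cannot answer this question.
    static boolean isValidName(String text) {
        try {
            new ObjectName(text);
            return true;
        } catch (MalformedObjectNameException e) {
            return false;
        }
    }

    public static void main(String[] args) throws MalformedObjectNameException {
        ObjectName name = new ObjectName("com.example:type=Cache,name=users");
        System.out.println(name.getDomain());               // prints com.example
        System.out.println(name.getKeyProperty("type"));    // prints Cache
        System.out.println(isValidName("no colon here"));   // prints false
    }
}
```

Both strings compile equally well; only the parser inside the library knows which one is legal.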

The Benefits of Static Typing

After reading about all of the negatives, you're probably wondering why anyone would ever opt for strong, static typing. There are at least two compelling reasons to do so. Static typing reduces certain types of errors (like misspelled variable names), and provides more information for your IDE and other tools. (Most security-related typing arguments refer to weak typing, not dynamic typing.)

Take the following application. Java will catch this error at compile time:

    int consumer;
    if (conusmer == 0) return consumer;  // spelling error

It's hard to imagine a dynamic language, with rigorous unit testing, letting an error like this through, though. The IDE problem is a little bit more obscure. Many of the features that Java developers have come to depend on, like method completion, rely on information in a variable's type. You can't always get the same contextual information out of a Ruby or Smalltalk IDE.

A Safety Net with Holes

The Java founders most often cite the ability to catch type mismatch errors at compile time rather than runtime. That's interesting to me, because of all the Smalltalk and Ruby developers I interviewed, few have ever had significant problems with type mismatch errors. Of course, most of them lean pretty heavily on automated unit testing, as we all should. You need to unit test code regardless of whether you use dynamic typing. No compiler can guess your intent perfectly. Even if you like the generics implementation, you've got to be concerned with an implementation that's little more than syntactic sugar, with no JVM implementation behind it.

With the heavy use of test-driven development, the argument for reduced bugs is much less compelling. In fact, Java's type safety is not as encompassing as the founders would lead you to believe. At any given time, most of the objects in a typical Java application reside in collections. Any time you remove one of these objects from its collection, you need to cast down from Object. You're effectively retyping an object. If you cast it incorrectly, glass will break in the form of a class cast exception, at runtime. At the same time, improved tools and emphasis on automated unit testing make it much easier to catch type problems in dynamic languages long before they ever reach production. My experience tells me that Java's type safety is not as important and comprehensive as most programmers think it is, and the typing in more dynamic languages, with unit testing, is not as limiting.

The IDE code completion problems presented by dynamic typing will probably get solved by a combination of better browsers and smarter context. Unit testing will make type safety less useful from a program correctness standpoint. In the end, for application programming, more dynamic typing will prevail. The productivity gains due to dynamic typing are too compelling to ignore.