Interesting posts 2010-02-06

Published on 2010-02-06 at 17:46

Waterfall vs. Agile

I’ve been a fan of Agile methodologies for quite some time now. As an Agilist, I would scoff at the Waterfall process I was taught during my studies. I did read a couple of times that the original paper introducing the Waterfall model wasn’t really supportive of it at all, but I’d never read that paper myself. Until now.

Here’s what its author, Dr. Winston Royce, has to say.

Attitude
The heart of software development is analysis and coding, “since both steps involve genuinely creative work which directly contributes to the usefulness of the final product.” But for larger software systems, they are not enough. “Additional development steps are required [...] The prime function of management is to sell these concepts to both groups and then enforce compliance on the part of development personnel.” The groups mentioned are customers and developers. Wow, not really people over processes, huh?

There are big problems with Waterfall
Royce then goes on to introduce the other steps, ending up with what we now call Waterfall. Right after that he adds feedback loops between each step and its predecessor. The caption to this figure says “Hopefully, the iterative interaction between the various phases is confined to successive steps.” Immediately following that, he points out a problem with this process: “Unfortunately, for the process illustrated, the design iterations are never confined to the successive steps”.

But there is a much worse problem. “The testing phase which occurs at the end of the development cycle is the first event for which timing, storage, input/output transfers, etc., are experienced as distinguished from analyzed. [...] If these phenomena fail to satisfy the various external constraints, then invariably a major redesign is required. [...] In effect the development process has returned to the origin and one can expect up to a 100-percent overrun in schedule and/or costs.” Yes. Been there, done that.

But these can be fixed
Stunningly, though, Royce goes on to claim “However, I believe the illustrated approach to be fundamentally sound.” We just need to tweak it a bit more: add a preliminary design phase before analysis, document the design, do it twice (he means simulate first, but others refer to this as “plan one to throw away”), look closely at testing, and involve the customer.

And these fixes look a lot like Agile
The first trick, doing some design before analysis, is also common in Agile methodologies. However, we usually don’t single out analysis and design, but apply the trick to all the phases. That’s how we end up with Acceptance Test Driven Development/Behavior Driven Development.

Royce turns out to be a big fan of documentation: “In order to procure a 5 million dollar hardware device, I would expect that a 30 page specification would provide adequate detail to control the procurement. In order to procure 5 million dollars of software I would estimate a 1000 page specification is about right in order to achieve comparable control”. Why is documentation so important? One of the reasons is that “during the early phase of software development the documentation is the specification and is the design.” Agilists would rather argue that automated tests are both the documentation and the specification and drive the design. Royce could never have thought of that, since testing in his mind occurred at the end and was to be performed manually.

The do it twice trick is also used a lot in Agile. We call it a spike.

For testing, Royce notices that a lot of errors can be caught before the test phase: “every bit of an analysis and every bit of code should be subjected to a simple visual scan by a second party who did not do the original analysis or code”. Agilists would agree that pair programming is very useful. Also, Royce advises us to “test every logic path in the computer program at least once”. He acknowledges that this is difficult, but says it should be done anyway. I agree that we should have (nearly) 100% test coverage, and TDD gives us just that.

For customer involvement, Royce notes that “for some reason what a software design is going to do is subject to wide interpretation even after previous agreement. [...] To give the contractor free rein between requirement definition and operation is inviting trouble.” I don’t see how he can maintain this and still be a fan of written documentation. But I am with him in seeing the value of close collaboration with the customer.

So there is no dichotomy
In summary, the author of the Waterfall process clearly saw some problems with that approach. He even identified some solutions that look remarkably like what we do in Agile methodologies today. So why don’t we end this Waterfall vs. Agile false dichotomy and from now on talk just about software development best practices? Make progress, not war.

By the way, what I find amazing is that somehow people managed to get the Waterfall process out of this paper, but not the problems and solutions Royce presented. And it’s almost criminal that the Waterfall process is still taught in universities as a good way to do software development. Without the above fixes, it’s clearly not.

Published on 2010-01-16 at 21:14

Top predictions for 2010

I predict that at the end of the year people will:

  • create posts looking back on the year
  • create posts looking forward to the next year

Published on 2010-01-10 at 19:23

Using factory classes in Ant tasks

So you have this nice factory class that prevents your client code from knowing the implementation class of the instances it needs to create and that lets it program to an API only.

Of course, at some point somebody needs to know the implementation class. Since the factory is the one creating instances, it either needs to know itself or be told. And since the factory is probably in the same package as the API, it shouldn’t know the implementation class itself, since that would tie the API package to the implementation package. So the factory needs to be told:

import java.lang.reflect.Constructor;

public class MyFactory {

  private static Class implementationClass = null;

  private MyFactory() {
    // Utility class
  }

  /**
   * Create a new instance.
   * @param data Data needed to initialize the instance
   * @return The newly created instance, or null if no suitable
   *         constructor could be invoked
   */
  public static MyInterface newInstance(final Object data) {
    assertImplementationClass();
    final Class clazz = implementationClass;
    MyInterface result = null;
    if (data == null) {
      // No data: use the public no-argument constructor
      try {
        final Constructor constructor = clazz.getConstructor();
        result = (MyInterface) constructor.newInstance(
            new Object[0]);
      } catch (final Exception e) {
        result = null;
      }
    } else {
      // Use the first public one-argument constructor that accepts the data
      final Constructor[] constructors = clazz.getConstructors();
      for (int i = 0; result == null && i < constructors.length;
          i++) {
        final Constructor constructor = constructors[i];
        if (constructor.getParameterTypes().length == 1
            && constructor.getParameterTypes()[0].isInstance(data)) {
          try {
            result = (MyInterface) constructor.newInstance(
                new Object[]{data});
          } catch (final Exception e) {
            result = null;
          }
        }
      }
    }

    return result;
  }

  /**
   * Register a class that implements the interface.
   */
  public static void registerImplementation(
      final Class implementation) {
    implementationClass = implementation;
  }

  /**
   * Unregister the implementation class.
   */
  public static void unregisterImplementation() {
    implementationClass = null;
  }

  private static void assertImplementationClass() {
    if (implementationClass == null) {
      throw new IllegalStateException(
          "Implementation class not set");
    }
  }

}

Now, who’s going to tell the factory what class to instantiate? There must be some entry point in the application where this happens. In your tests (you do write tests, right?), you can do that in the set up method. In a web application, you can do that in the ServletContextListener.
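
As a minimal sketch of such an entry point, the set-up and tear-down of a test could look roughly like this. It assumes a trimmed-down copy of the factory that only handles the no-argument constructor. Greeter and SimpleGreeter are hypothetical stand-ins for MyInterface and its implementation, and the plain setUp/tearDown methods stand in for whatever hooks your test framework provides.

```java
import java.lang.reflect.Constructor;

public class FactorySetUpSketch {

  /** Hypothetical API interface, standing in for MyInterface. */
  public interface Greeter {
    String greet();
  }

  /** Hypothetical implementation that only the entry point knows about. */
  public static class SimpleGreeter implements Greeter {
    public SimpleGreeter() {
    }

    @Override
    public String greet() {
      return "hello";
    }
  }

  /** Trimmed-down factory: handles the no-argument constructor only. */
  public static final class GreeterFactory {
    private static Class<? extends Greeter> implementationClass = null;

    private GreeterFactory() {
      // Utility class
    }

    public static void registerImplementation(
        final Class<? extends Greeter> implementation) {
      implementationClass = implementation;
    }

    public static void unregisterImplementation() {
      implementationClass = null;
    }

    public static Greeter newInstance() {
      if (implementationClass == null) {
        throw new IllegalStateException("Implementation class not set");
      }
      try {
        final Constructor<? extends Greeter> constructor =
            implementationClass.getConstructor();
        return constructor.newInstance();
      } catch (final ReflectiveOperationException e) {
        throw new IllegalStateException(
            "Cannot instantiate " + implementationClass.getName(), e);
      }
    }
  }

  /** Entry point for a test: tell the factory what to instantiate. */
  public static void setUp() {
    GreeterFactory.registerImplementation(SimpleGreeter.class);
  }

  /** Clean up so that other tests start without a registration. */
  public static void tearDown() {
    GreeterFactory.unregisterImplementation();
  }

  public static void main(final String[] args) {
    setUp();
    try {
      // The code under test only ever sees the Greeter interface
      System.out.println(GreeterFactory.newInstance().greet());
    } finally {
      tearDown();
    }
  }
}
```

Unregistering in the tear-down keeps the static registration from leaking from one test into the next.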

Ant

But what about in Ant tasks? You could create an Ant task that does just that and call it from a dependent target:

  <target name="--init-factory" unless="factory.inited">
    <property name="impl.class"
        value="com.mycompany.myapp.MyImplementation"/>
    <taskdef name="register-impl"
        classname="com.mycompany.myapp.ant.RegisterTask"
        classpath="..."/>
    <register-impl classname="${impl.class}"/>
    <property name="factory.inited" value="true"/>
  </target>

However, that doesn’t work. So what’s up?

Debugging Ant tasks

Our Ant task seems so simple that it is hard to see what could be wrong with it. So we want to debug it and find out.

You can debug Ant tasks by setting the environment variable ANT_OPTS:

SET ANT_OPTS=-Xdebug -Xrunjdwp:transport=dt_socket,address=6000,server=y,suspend=n

Now when you run your Ant script, you can attach your debugger on port 6000. You may want to use the input task to have the build wait while you attach your debugger.

Debugging reveals something interesting: The registerImplementation method does get called with the right parameter, but when newInstance is called, implementationClass is still null. Apparently Ant is doing some fancy classloader stuff that gets in our way.

The solution is to have the Ant task set a system property that the factory uses:

  // IMPLEMENTATION_CLASS_PROPERTY is a String constant holding the name of
  // the system property; StringUtils comes from Apache Commons Lang.
  private static void assertImplementationClass() {
    if (implementationClass == null) {
      final String className =
          System.getProperty(IMPLEMENTATION_CLASS_PROPERTY);
      if (StringUtils.isBlank(className)) {
        throw new IllegalStateException("Implementation class not set");
      }
      try {
        registerImplementation(Class.forName(className));
      } catch (final ClassNotFoundException e) {
        throw new IllegalStateException("Invalid implementation class: "
            + className + "\n" + e.getLocalizedMessage());
      }
    }
  }
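
To show both halves of this approach together, here is a self-contained sketch. It is not the actual Ant task: RegisterTaskSketch only mirrors the shape of one (an attribute setter plus execute()) so that the example runs without Ant on the classpath. The property name myapp.implementation.class is an assumption, as are the Greeter and SimpleGreeter types, which stand in for MyInterface and its implementation.

```java
public class SystemPropertyFactorySketch {

  /** Assumed property name; the real constant's value is not shown above. */
  public static final String IMPLEMENTATION_CLASS_PROPERTY =
      "myapp.implementation.class";

  /** Hypothetical API interface, standing in for MyInterface. */
  public interface Greeter {
    String greet();
  }

  /** Hypothetical implementation class. */
  public static class SimpleGreeter implements Greeter {
    public SimpleGreeter() {
    }

    @Override
    public String greet() {
      return "hello";
    }
  }

  /** Stand-in for the Ant task: a setter for the attribute plus execute(). */
  public static class RegisterTaskSketch {
    private String classname;

    public void setClassname(final String classname) {
      this.classname = classname;
    }

    public void execute() {
      // System properties belong to the JVM, not to one copy of the
      // factory class, so every class loader sees the same value.
      System.setProperty(IMPLEMENTATION_CLASS_PROPERTY, classname);
    }
  }

  /** Factory that falls back to the system property when nothing is registered. */
  public static final class GreeterFactory {
    private static Class<?> implementationClass = null;

    private GreeterFactory() {
      // Utility class
    }

    public static Greeter newInstance() {
      assertImplementationClass();
      try {
        return (Greeter) implementationClass.getConstructor().newInstance();
      } catch (final ReflectiveOperationException e) {
        throw new IllegalStateException(
            "Cannot instantiate " + implementationClass.getName(), e);
      }
    }

    private static void assertImplementationClass() {
      if (implementationClass == null) {
        final String className =
            System.getProperty(IMPLEMENTATION_CLASS_PROPERTY);
        if (className == null || className.trim().isEmpty()) {
          throw new IllegalStateException("Implementation class not set");
        }
        try {
          implementationClass = Class.forName(className);
        } catch (final ClassNotFoundException e) {
          throw new IllegalStateException(
              "Invalid implementation class: " + className, e);
        }
      }
    }
  }

  public static void main(final String[] args) {
    // What the build script would trigger through the task
    final RegisterTaskSketch task = new RegisterTaskSketch();
    task.setClassname(SimpleGreeter.class.getName());
    task.execute();

    // What the client code does later, possibly via another class loader
    System.out.println(GreeterFactory.newInstance().greet());
  }
}
```

The factory still caches the class after the first lookup, so the system property is only consulted once per loaded copy of the factory.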

Published on 2010-01-08 at 20:20

OSGi & Maven & Eclipse

If you’re involved in a large software development effort in Java, then OSGi seems like a natural fit to keep things modular and thus maintainable. But every advantage can also be seen as a disadvantage: using OSGi you will end up with lots of small projects. Handling these and their interrelationships can be challenging.

Enter Maven. This build tool makes it a lot easier to build all these little (or not so little) projects. Which is a necessity, since a command line driven build tool is essential for doing Continuous Integration. And we all practice that, right?

However, as a developer it’s a pain to keep switching between your favorite IDE and the command line. Not to worry, Eclipse has plug-ins that handle just about any situation. Using M2Eclipse, you can maintain your POM from within the IDE.

But an Eclipse Maven project is not an Eclipse OSGi project. For handling OSGi bundles, one would want to use the Eclipse Plug-in Development Environment (PDE) with all the goodies that brings to OSGi development. There is, however, a way to get the best of both worlds, although it still isn’t perfect, as we will see shortly.

The trick is to start with a PDE project:

Make sure to follow the Maven convention for sources and classes and to use plain OSGi (so you’re not tied to Eclipse/Equinox):

Once you’ve created the project, you can add Maven support:

Make sure to use the same identification for Maven as for PDE:

Now you have an Eclipse project that plays nice with both PDE (and thus OSGi) and Maven. The only downside to this solution is that some information, like the bundle ID, is duplicated.

Published on 2009-11-21 at 11:30

Ubuntu 9.10 & Eclipse 3.5

I recently upgraded Ubuntu to its latest version (9.10, Karmic Koala) and it works great so far. Except for Eclipse.

I ran Eclipse 3.5 (Galileo), and apparently SWT in that version does something wrong in communicating with GTK. The end result is that buttons don’t react to mouse clicks anymore. Rather annoying. Luckily, there is a solution available. Alternatively, you can use the latest Eclipse 3.6 (Helios) milestone.

But that wasn’t the end of it. Eclipse would now perform extremely slowly on a variety of tasks. It turns out that this is caused by Eclipse now running on the GCJ Virtual Machine. I simply uninstalled everything with “gcj” in its name using Synaptic and all was well again.

Published on 2009-11-05 at 19:37

JavaFX for GNU/Linux has arrived

Finally, the time has come: JavaFX is now supported on both GNU/Linux and Solaris.

It’s not really advertised, though, so here’s how to get it:

  • Go to the JavaFX website.
  • Click the Download now button. Yes, the one that reads JavaFX 1.1 SDK.
  • Click the JavaFX 1.2 SDK option, and click Download.
  • You’ll be prompted to download javafx_sdk-1_2-linux-i586.sh. Save it somewhere convenient.
  • Make the downloaded file executable with chmod +x javafx_sdk-1_2-linux-i586.sh
  • Run the shell script with ./javafx_sdk-1_2-linux-i586.sh
  • Page through the annoying legal stuff by pressing Space repeatedly. At the end, type yes.
  • You now have a javafx-sdk1.2 directory that you can play with.

Enjoy!

Oh, and in case you have some JavaFX code from pre-1.2 versions, here’s how to migrate it.

Update: There is also a new Eclipse plugin. Binaries only, the source will have to wait until it gets transferred to eclipse.org.

Published on 2009-05-31 at 21:36

Supporting multiple versions of a data model

As an application evolves, its data model often does too. If you control both, this usually isn’t a problem. However, sometimes your power to change the data model is restricted. This happens, for instance, when the data model is published, and others may depend on it. An extreme case of this is when the data model is defined by another organization as, for example, with S1000D.

Having no absolute control over the data model isn’t much of a problem if you can leave one version behind completely, and move on to the next. But often you won’t be so lucky. I know I’m not: we need to support both S1000D 3.0 and 4.0.

There are different ways in which you can support multiple data model versions. The one I’m concerned with here is when your application needs to support multiple data models at the same time with the same code. That leaves out alternatives like having multiple branches of your code for the different data model versions.

One trick that can come to the rescue here is the Once And Only Once rule (also called the DRY principle). When applied to creating instances, this leads to the Factory pattern. If you have all your instances created by a factory, then there’s only one place where you need to decide which class (e.g. the 3.0 or 4.0 version) to instantiate. If those decisions are similar for all the classes in your model, then you could even extract them into a common base class for your factories.
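
A rough sketch of such a common base class for factories might look like this. All names here (Issue, IssueAwareFactory, the describe() method) are hypothetical and only illustrate the idea that the decision between 3.0 and 4.0 is made in one place.

```java
public class VersionedFactorySketch {

  /** The data model issues (versions) supported at the same time. */
  public enum Issue { ISSUE_30, ISSUE_40 }

  /** Hypothetical common type for both versions of a model class. */
  public interface DescriptiveDataModule {
    String describe();
  }

  static final class DescriptiveDataModule30 implements DescriptiveDataModule {
    @Override
    public String describe() {
      return "descriptive data module, issue 3.0";
    }
  }

  static final class DescriptiveDataModule40 implements DescriptiveDataModule {
    @Override
    public String describe() {
      return "descriptive data module, issue 4.0";
    }
  }

  /** Common base class: the only place that decides on the issue. */
  public abstract static class IssueAwareFactory<T> {
    private final Issue issue;

    protected IssueAwareFactory(final Issue issue) {
      this.issue = issue;
    }

    public final T newInstance() {
      switch (issue) {
        case ISSUE_30:
          return newInstance30();
        case ISSUE_40:
          return newInstance40();
        default:
          throw new IllegalStateException("Unsupported issue: " + issue);
      }
    }

    protected abstract T newInstance30();

    protected abstract T newInstance40();
  }

  /** Concrete factory: only says how to build each version. */
  public static final class DescriptiveDataModuleFactory
      extends IssueAwareFactory<DescriptiveDataModule> {

    public DescriptiveDataModuleFactory(final Issue issue) {
      super(issue);
    }

    @Override
    protected DescriptiveDataModule newInstance30() {
      return new DescriptiveDataModule30();
    }

    @Override
    protected DescriptiveDataModule newInstance40() {
      return new DescriptiveDataModule40();
    }
  }

  public static void main(final String[] args) {
    final DescriptiveDataModuleFactory factory =
        new DescriptiveDataModuleFactory(Issue.ISSUE_40);
    System.out.println(factory.newInstance().describe());
  }
}
```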

Most of the time, the different versions of the data model will share a lot of similarities. It is tempting to extract those into a common base class. For example, in S1000D there is a type called descriptive data module, and you could derive DescriptiveDataModule30 and DescriptiveDataModule40 from DescriptiveDataModule.

But when the objects in your data model have inheritance relationships themselves, that can get ugly very fast. For instance, a descriptive data module is one of many kinds of data modules, and these data modules share a lot of characteristics. So in code, DescriptiveDataModule would descend from DataModule, and both would have aspects that differ in the 3.0 and 4.0 versions. This spells trouble.

Therefore, it is usually better to use composition instead. So DataModule would have a reference to a DataModuleIssue (where “issue” is used in the sense of the various issues of the S1000D specification, i.e. what I’ve been calling “versions” so far), which the DescriptiveDataModule would inherit. The factory would inject either a DescriptiveDataModuleIssue30 or a DescriptiveDataModuleIssue40 into the DescriptiveDataModule, where DescriptiveDataModuleIssue30 would descend from DataModuleIssue30, and DescriptiveDataModuleIssue40 from DataModuleIssue40.

The idea is to make the Issue classes very bare, dealing only with the stuff that differs between issues, so there is no need for a common base class (although both do implement the same interface). The things that are the same in all issues go into the core model objects (DescriptiveDataModule and DataModule in our example).
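
The composition described above could be sketched as follows. The class names follow the text, but the methods are assumptions: issueNumber() and contentLabel() are hypothetical examples of things that differ per issue, the returned labels are placeholders rather than real S1000D values, and the DescriptiveDataModuleIssue interface is added here so that the descriptive data module can reach its own issue-specific behaviour.

```java
public class IssueCompositionSketch {

  /** What differs between issues for every data module (hypothetical method). */
  public interface DataModuleIssue {
    String issueNumber();
  }

  /** What additionally differs for descriptive data modules (hypothetical method). */
  public interface DescriptiveDataModuleIssue extends DataModuleIssue {
    String contentLabel();
  }

  public static class DataModuleIssue30 implements DataModuleIssue {
    @Override
    public String issueNumber() {
      return "3.0";
    }
  }

  public static class DataModuleIssue40 implements DataModuleIssue {
    @Override
    public String issueNumber() {
      return "4.0";
    }
  }

  public static class DescriptiveDataModuleIssue30 extends DataModuleIssue30
      implements DescriptiveDataModuleIssue {
    @Override
    public String contentLabel() {
      return "placeholder label for 3.0";
    }
  }

  public static class DescriptiveDataModuleIssue40 extends DataModuleIssue40
      implements DescriptiveDataModuleIssue {
    @Override
    public String contentLabel() {
      return "placeholder label for 4.0";
    }
  }

  /** Core model object: everything that is the same in all issues. */
  public static class DataModule {
    private final String title;
    private final DataModuleIssue issue;

    protected DataModule(final String title, final DataModuleIssue issue) {
      this.title = title;
      this.issue = issue;
    }

    public String getTitle() {
      return title;
    }

    public String getIssueNumber() {
      return issue.issueNumber(); // delegated to the injected issue object
    }
  }

  public static class DescriptiveDataModule extends DataModule {
    private final DescriptiveDataModuleIssue descriptiveIssue;

    public DescriptiveDataModule(final String title,
        final DescriptiveDataModuleIssue issue) {
      super(title, issue);
      this.descriptiveIssue = issue;
    }

    public String summary() {
      return getTitle() + " (issue " + getIssueNumber() + ", "
          + descriptiveIssue.contentLabel() + ")";
    }
  }

  /** The factory is the only place that picks an issue class. */
  public static final class DescriptiveDataModuleFactory {
    private DescriptiveDataModuleFactory() {
    }

    public static DescriptiveDataModule newInstance(final String title,
        final String issueNumber) {
      if ("3.0".equals(issueNumber)) {
        return new DescriptiveDataModule(title, new DescriptiveDataModuleIssue30());
      }
      if ("4.0".equals(issueNumber)) {
        return new DescriptiveDataModule(title, new DescriptiveDataModuleIssue40());
      }
      throw new IllegalArgumentException("Unsupported issue: " + issueNumber);
    }
  }

  public static void main(final String[] args) {
    System.out.println(
        DescriptiveDataModuleFactory.newInstance("Engine", "4.0").summary());
  }
}
```

Note that only the Issue classes form parallel 3.0 and 4.0 hierarchies; DataModule and DescriptiveDataModule exist once.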

Kanban

Lately, I’ve seen a lot of discussions on Kanban. For those of you who, like me, want to know what all that fuss is about, I collected a couple of links that I will try to merge into a coherent whole below.

So what exactly is Kanban? Literally, it means “visual card”, but that’s not very helpful. This introduction explains that Kanban revolves around a board that visualizes the software development flow.

In fact, flow is a very important concept here. Kanban is a pull system, in which Minimal Marketable Features (MMFs) flow through the development stages when there is capacity available. This contrasts with most Agile methods that push work items into iterations. Also, note that for most Agile methods, those work items (e.g. User Stories) would be smaller than MMFs.

The other big point is that Kanban limits Work In Progress (i.e. the number of MMFs per development stage). This naturally exposes the bottleneck(s) in the flow.

[Figure: Kanban limits WIP]

This leads us nicely to the main reason to use Kanban: to improve your software development process. Other Agile methods deal with process improvement as well, but Kanban is different from e.g. Scrum.

So, if all this sounds cool and you want to give Kanban a shot, then apparently this is how you should get started. If you do, then you may see these effects. Also, make sure to get into a Kanban state of mind.

Update: here is a great compilation of Kanban resources.

Published on 2009-05-27 at 22:30

Replacing the word “test”

Elisabeth Hendrickson wants to get rid of the word “test”, as it can mean two different things, which she labels “Check” and “Explore”.

I very much agree that there are two entirely different aspects to testing. “Checking” is when you get a warm fuzzy feeling when the bar turns green. You perform an experiment and if you get a positive result then you know that all is well.

“Exploring” is different in that you don’t get a warm fuzzy feeling on “green”. In other words, if the experiment produces a positive result, you’re not done yet. You need to look further, until you find a negative result. Only then will you have learned something. And if you spend some time exploring, and find no problems, then there’s always that nagging feeling: is there really nothing to find, or did I just not look hard enough?

So it seems I agree with Elisabeth. Then why this blog post?

Like Elisabeth, I think that words matter. She’s right to want to replace the word “test”. I just disagree with the replacements. “Check” has way too many meanings, and the definitions of “explore” don’t seem to catch what is meant well enough for my taste.

So I’d like to propose an alternative from the world of science: “verify” and “falsify”. Automated tests verify that the software behaves as expected, while exploratory testing falsifies both the expectations and the completeness of the test suite.

What do you think?

Published on 2009-05-20 at 08:17