Stupid Software Engineering Question of the Week…

November 3, 2008

My basic question is this: should my objects have a default constructor, and if so, what should it do?

Say you have a class GeoLoc, for example, that represents a point on the surface of the earth. You’ll want a constructor GeoLoc(Longitude, Latitude) so that you can construct GeoLoc objects to represent specific places, e.g. GeoLoc(53, 27), GeoLoc(-12, -33) and so on.

But what about when you want a GeoLoc object, but don’t (yet) know where you want it? What values should a new GeoLoc() have?

For the GeoLoc (and many other) objects, it doesn’t make sense to have a default constructor that sets the object to a valid state. GeoLoc(0,0) is a real place, and in general you want GeoLoc(0,0) != GeoLoc() to yield true. In which case, having GeoLoc() construct an object at (0,0) is the wrong thing to do.

That said, is it better to provide default constructors that construct “null” objects, or is it better to just remove default constructors altogether and force the programmer to construct a well-formed object in the first place?

In some cases, a null object is definitely useful. Bounding boxes for example – it’s useful to have the concept of an empty bounding box that you can then e.g. union with another bounding box and yield sensible results. (Having a special Null Bounding Box value is similar to special numbers like Integer.MIN_VALUE and Double.NaN and so on, although I believe I’m correct in saying that both Double and Integer default-construct with zero as their value (in Java, at least). Maybe Double d = new Double; should yield Double.NaN? But that’s a whole other discussion…)
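
To make the empty-bounding-box idea concrete, here is a minimal Java sketch. The BoundingBox class, its EMPTY constant and the union and isEmpty methods are all names assumed for illustration, not code from any real project. The trick is that the empty box stores its minimums as +infinity and its maximums as -infinity, so a union with any real box simply yields that box.

```java
// Hypothetical sketch: every name here is an assumption made for illustration.
final class BoundingBox {
    // The empty box: each minimum is greater than its maximum, so it contains no points.
    static final BoundingBox EMPTY = new BoundingBox(
            Double.POSITIVE_INFINITY, Double.POSITIVE_INFINITY,
            Double.NEGATIVE_INFINITY, Double.NEGATIVE_INFINITY);

    final double minX;
    final double minY;
    final double maxX;
    final double maxY;

    BoundingBox(double minX, double minY, double maxX, double maxY) {
        this.minX = minX;
        this.minY = minY;
        this.maxX = maxX;
        this.maxY = maxY;
    }

    // A box is empty when it cannot contain any point on at least one axis.
    boolean isEmpty() {
        return minX > maxX || minY > maxY;
    }

    // Smallest box covering both operands; EMPTY acts as the identity element.
    BoundingBox union(BoundingBox other) {
        return new BoundingBox(
                Math.min(minX, other.minX), Math.min(minY, other.minY),
                Math.max(maxX, other.maxX), Math.max(maxY, other.maxY));
    }
}
```

With this representation, client code can start from BoundingBox.EMPTY and union boxes in one at a time without ever special-casing the first one.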

So, to GeoLoc() or not to GeoLoc()? One advantage of having the default constructor available is that objects that contain a GeoLoc can then themselves be default-constructed. I can’t think of a particularly compelling example off the top of my head, but just because I can’t think of one doesn’t mean that someone cleverer than me won’t come along and want to use my GeoLoc object in the default-construction of their CleverObject. Isn’t there some sort of design principle regarding not guessing how people are going to use your objects?

On the other hand, Rob used to say that construction implies that an object is ready-for-use, and the more I think about it, the more I think that an unplaced GeoLoc object is of no use at all.

Maybe it’s a case of Immutable objects vs Mutable objects? Mutable objects like Lists and BoundingBox objects that expand and contract should have some sort of concept of ‘empty’, and be default-constructed as such. Immutable objects such as GeoLoc and Time objects that are fixed throughout all eternity should always have a valid value, and therefore must be constructed with such.
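
As a rough sketch of the "always valid, never changes" option, an immutable GeoLoc with no default constructor might look something like this in Java. The field names, accessor names and the range checks in degrees are my assumptions for the sake of the example, not the real class.

```java
// Hypothetical sketch of an immutable GeoLoc; field names and range checks are assumptions.
final class GeoLoc {
    private final double longitude; // degrees, assumed range -180..180
    private final double latitude;  // degrees, assumed range -90..90

    // The only constructor: there is deliberately no GeoLoc().
    GeoLoc(double longitude, double latitude) {
        // Written as !(a && b) so that NaN is rejected as well.
        if (!(longitude >= -180.0 && longitude <= 180.0)) {
            throw new IllegalArgumentException("longitude out of range: " + longitude);
        }
        if (!(latitude >= -90.0 && latitude <= 90.0)) {
            throw new IllegalArgumentException("latitude out of range: " + latitude);
        }
        this.longitude = longitude;
        this.latitude = latitude;
    }

    double longitude() { return longitude; }

    double latitude() { return latitude; }

    // Value semantics: two GeoLocs with the same coordinates are equal.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof GeoLoc)) return false;
        GeoLoc other = (GeoLoc) o;
        return Double.compare(longitude, other.longitude) == 0
                && Double.compare(latitude, other.latitude) == 0;
    }

    @Override
    public int hashCode() {
        return 31 * Double.hashCode(longitude) + Double.hashCode(latitude);
    }
}
```

Because there are no setters and no no-argument constructor, every GeoLoc that exists is a real, placed location, and new GeoLoc() simply does not compile.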

Anyway, I’ve completely lost my train of thought. Someone leave a comment and tell me what it is that I’m s’posed to be doing…

9 Comments
  1. Rob Allen
    November 3, 2008 9:57 pm

    I think you’re right that all constructors should have some sensible meaning, and there’s no reason to make GeoLoc() == GeoLoc(0,0) when a named constant like GeoLoc.ORIGIN would communicate that fact more clearly. Having no default constructor is less convenient for clients, but it means that they really have to consider what ‘default’ means for them, too.
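
    A named constant of this kind could be sketched as below. This is only an illustration: the stripped-down GeoLoc class and its public fields are assumptions, and the point is simply that ORIGIN is spelled out by name rather than hidden behind a no-argument constructor.

    ```java
    // Hypothetical, stripped-down GeoLoc used only to show the named constant.
    final class GeoLoc {
        // (0, 0) is available by name, so nobody needs GeoLoc() to get it.
        static final GeoLoc ORIGIN = new GeoLoc(0.0, 0.0);

        final double longitude;
        final double latitude;

        GeoLoc(double longitude, double latitude) {
            this.longitude = longitude;
            this.latitude = latitude;
        }
    }
    ```

    A client that really does want (0, 0) then writes GeoLoc start = GeoLoc.ORIGIN; and says so explicitly.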

    I don’t necessarily think that the immutability of GeoLoc objects matters. For instance, Double.NaN isn’t a valid value for use in calculations and yet it is immutable.

    The mention of null objects is probably the key point: What would a null GeoLoc object do? I’m guessing that it has very little behaviour of its own and is just a nice readable wrapper for two numbers. So, if there were a null GeoLoc object, client code might have to detect the null value and handle it. Poor clients, with conditional exception raises all over the place. (Consider the ‘null’ keyword in Java. Don’t you sometimes wish you could say, “Never pass me a null object!” and make the compiler enforce it?) So, maybe it’s more convenient if clients can assume that all GeoLoc objects are valid.

    A question for you. 🙂 What does GeoLoc mean? Is it “Geodetic Location”, “Geographic Location” or something else?

    By the way, Double and Integer default construct to zero to maintain the similarity with the basic types. In turn, I assume they do it for convenience, since zero is a nice safe value (e.g. zero iteration loops are harmless, etc. just like empty Lists). In the case of Double, some hardware traps values like NaN or infinity, so initialising to those special values could cause an actual performance impact.

  2. November 4, 2008 11:38 am

    I was going for “Geodetic Location”, but only because that’s what it’s called in my current project, and not because it’s necessarily correct to call it that…

    As per usual, I’ll think about everything you’ve said, then come back with more stupid questions…

  3. November 4, 2008 6:07 pm

    I’m a fan of long object/variable/method names. It seems some people (c++ programmers 😉 ) have a tendency to name things as if characters are a finite resource that must be rationed. Anyway, I digress since I don’t care about that really… it would be like wasting my life arguing about where to put a curly brace – like I give a hoot. Life’s too short.

    Your GeoLoc object, as I understand it, is a “value object”. I believe you want a nice immutable object that will always be legal and can never change.
    Why would you want other people to jump through the “defensive programming” hoops just to make stable code that uses a value object? BigInteger is a good example. Another textbook example would be if I was to make an object to represent Money. I don’t want to “setPence” or any of that guff, I don’t even want to increment it, it doesn’t make sense. Sure you could have an add method to sum two Money objects but that should return a new Money object for the result.
    Similarly, for your GeoLoc object, you may offer transformation methods to translate the location or whatever, but they should just return new GeoLoc objects.
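
    For illustration, the Money example might be sketched like this. The class is hypothetical, and storing the amount as a whole number of pence in a long is an assumption of the sketch.

    ```java
    // Hypothetical Money value object; storing pence in a long is an assumption.
    final class Money {
        private final long pence;

        Money(long pence) {
            this.pence = pence;
        }

        long pence() {
            return pence;
        }

        // No setPence() and no increment: adding returns a brand new Money.
        Money add(Money other) {
            // Math.addExact throws ArithmeticException on overflow instead of wrapping.
            return new Money(Math.addExact(pence, other.pence));
        }
    }
    ```

    Neither operand is changed by add, so a Money handed to other code can never be altered behind your back.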

    I’ve kind of forgotten what I wanted to ramble about. My point is really that a default constructor should be for objects that have a sensible default. Date for example, defaults to now, pretty useful I’d say. Now that I think about it, a location of 0,0 is probably going to be the most commonly constructed location so why not make that the default. Yeah, actually, have a default constructor that constructs a location to 0,0. As long as your object is immutable, any consumers will realise they need to set it at construction if they want any other values. Having just re-scanned your post, I’m basically contradicting you and saying I would want GeoLoc(0,0) == GeoLoc()

  4. November 4, 2008 10:41 pm

    Matt; I think I’ll save replying to your character-rationing comment for another time… If there’s one thing I’m sure of, it’s that (0,0) is *not* a sensible value for a default-constructed GeodeticLocation – I’ve seen too many (stupid) people do stupid things off the coast of Africa for it to be a good idea. It’s bad enough that people write the constructors for GeoLoc objects with (lat, lon) arguments because it “sounds right”, and then you’re stuck with idiots getting their east/wests and north/souths the wrong way round again and again and again… But that aside, both you and Rob are right about the whole defensive programming/not having null objects thing. Oh, and your Money example is oddly connected to the Stupid Question that I’m saving for next week…

  5. November 5, 2008 10:38 am

    I think you’re right that all constructors should have some sensible meaning, and there’s no reason to make GeoLoc() == GeoLoc(0,0) when a named constant like GeoLoc.ORIGIN would communicate that fact more clearly.

    Damn. I wish I’d thought of that.

    Having no default constructor is less convenient for clients, but it means that they really have to consider what ‘default’ means for them, too.

    In work, I’ve “removed” (I’ll explain the quotes in a later comment, or post, or conversation, or something…) the default constructor, and changed existing code to use the (lon, lat) constructor instead. In all cases, the code looked like:

    GeoLoc myGeoLoc = new GeoLoc();
    myGeoLoc.setLat(blahblahblah);
    myGeoLoc.setLon(yadayadayada);

    which, for the time being, I’ve changed to:

    double myLat = blahblahblah;
    double myLon = yadayadayada;
    GeoLoc myGeoLoc = new GeoLoc(myLon, myLat);

    which I actually prefer anyway, so I’d argue that not providing a default constructor would’ve been no less convenient, and would’ve improved their code.

    In fact, in 99% of places, blahblahblah and yadayadayada will be the addition or subtraction of two GeoLoc objects, or the scaling of a GeoLoc by some factor. What I really want the code to look like is something like this:

    GeoLoc myGeoLoc = GeoLoc.Subtract(geoL1, geoL2);

    and what I really really want is for the programmer who first found him-or-herself subtracting GeoLoc()s to go ahead and write the method required, rather than me having to mop up behind them…

    and what I really really really want is for the GeoLoc code to be stored in a reusable library that’s readily available to all projects requiring the use of geodetic locations.

    I don’t necessarily think that the immutability of GeoLoc objects matters. For instance, Double.NaN isn’t a valid value for use in calculations and yet it is immutable.

    [I get the feeling that I’m about to say something really stupid here, but some of the most important things you’ve taught me have come from me saying something stupid, so] I find the Double/double thing a bit confusing. Technically, Doubles are immutable, right, and if you assign a Double a different value, then the internal state doesn’t get changed, but a new Double object is referenced. I mean:

    Double myDouble = 5d; // myDouble references an object on the heap, with an internal state set to 5

    myDouble = 7d; // myDouble now references a different object on the heap, with an internal state set to 7. The original object with state 5 is now unreferenced, and liable for garbage collection(?)

    However, with doubles, they work the old-fashioned way, right?

    double mydbl = 5d; // mydbl (on the heap?) has its (sixty-four) bits set to represent 5

    mydbl = 7d; // mydbl still occupies the same sixty-four bits, but they’ve been toggled to represent 7

    I’m not sure what my point is, other than the fact that I really ought to get out Head First Java and RTFM. I guess my point is that even though Doubles are technically immutable, you can just assign values to them like an old-fashioned double, and the immutability is something of an implementation issue. However, they’re a special case. Maybe. This is something I need to think about some more, and probably not at 11pm.
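
    A small experiment along these lines can be sketched as follows. The DoubleDemo class and its method names are made up for the example. For reference, a Java double is 64 bits wide, and a local primitive variable typically lives in the method's stack frame rather than on the heap.

    ```java
    // Hypothetical demo class; the names are made up for illustration.
    final class DoubleDemo {
        // Boxed version: assignment rebinds the reference, it does not edit the object.
        static double boxedAliasAfterReassignment() {
            Double first = 5d;    // boxes 5.0 into a Double object
            Double alias = first; // second reference to the same object
            first = 7d;           // 'first' now refers to a different Double
            return alias;         // the original object still holds 5.0
        }

        // Primitive version: assignment copies the value into the variable itself.
        static double primitiveCopyAfterReassignment() {
            double first = 5d;
            double copy = first;  // copies the value, not a reference
            first = 7d;           // overwrites the bits of 'first' only
            return copy;          // still 5.0
        }
    }
    ```

    Both methods return 5.0, which fits the suspicion above: from the calling code's point of view the two behave the same, and the immutability of Double only shows up in how the value is stored.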

    The mention of null objects is probably the key point: What would a null GeoLoc object do? I’m guessing that it has very little behaviour of its own and is just a nice readable wrapper for two numbers. So, if there were a null GeoLoc object, client code might have to detect the null value and handle it. Poor clients, with conditional exception raises all over the place. (Consider the ‘null’ keyword in Java. Don’t you sometimes wish you could say, “Never pass me a null object!” and make the compiler enforce it?) So, maybe it’s more convenient if clients can assume that all GeoLoc objects are valid.

    I agree.


    By the way, Double and Integer default construct to zero to maintain the similarity with the basic types. In turn, I assume they do it for convenience, since zero is a nice safe value (e.g. zero iteration loops are harmless, etc. just like empty Lists). In the case of Double, some hardware traps values like NaN or infinity, so initialising to those special values could cause an actual performance impact.

    Hmm, I never thought of that. I’m happy with Integer defaulting to zero, as it represents “none” or “nothing”, and so is akin to an empty state, just like an empty list. Now that I’ve started thinking about it, I’m not so sure about Double defaulting to zero. And now that I think about it some more, Bartoz says that you should explicitly initialise variables anyway, even if you’re using the default value. So if it were up to me, you’d always have to initialise ints and doubles, and default construction would be the reserve of things that actually have sensible “nothing” states, like lists and bounding boxes.

    So it’s probably for the best that it’s not up to me…
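
    For reference, here is a quick sketch of what Java actually does with fields that are never assigned. The Defaults class is hypothetical. Note that Integer and Double have no no-argument constructors at all, so the "default to zero" behaviour belongs to primitive fields, while uninitialised wrapper fields are null. Local variables get no default and must be assigned before use, or the code will not compile.

    ```java
    // Hypothetical class showing the default values of uninitialised fields.
    final class Defaults {
        int count;          // primitive field: defaults to 0
        double ratio;       // primitive field: defaults to 0.0
        Integer boxedCount; // reference field: defaults to null
        Double boxedRatio;  // reference field: defaults to null
    }
    ```

    So a freshly constructed Defaults has zeros in its primitive fields and nulls in its wrapper fields, which is one more reason to initialise everything explicitly.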

  6. dan
    December 11, 2008 12:12 pm

    This seems to be a much more concise list of what I think I’m getting at…

    http://docs.codehaus.org/display/PICO/Good+Citizen

  7. June 8, 2009 10:37 pm

    double myLat = blahblahblah;
    double myLon = yadayadayada;
    GeoLoc myGeoLoc = new GeoLoc(myLon, myLat);

    Make Longitude and Latitude types, then there’s less chance of getting them the wrong way round. Perhaps derive from Double.
    public GeoLoc(Longitude longitude, Latitude latitude);

    GeoLoc myGeoLoc = GeoLoc.Subtract(geoL1, geoL2);

    How about:

    GeoLoc offset = geoL1.offsetFrom(geoL2);
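
    As a sketch only, an instance method like that could be written as below. The stripped-down GeoLoc class is hypothetical, and the component-wise subtraction is an assumption: it ignores wrap-around at the ±180° meridian and returns the difference as another GeoLoc, as the snippet above does.

    ```java
    // Hypothetical, stripped-down GeoLoc; the arithmetic is a naive assumption.
    final class GeoLoc {
        final double longitude;
        final double latitude;

        GeoLoc(double longitude, double latitude) {
            this.longitude = longitude;
            this.latitude = latitude;
        }

        // Component-wise difference (this minus other), returned as a new object.
        GeoLoc offsetFrom(GeoLoc other) {
            return new GeoLoc(longitude - other.longitude, latitude - other.latitude);
        }
    }
    ```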

  8. June 9, 2009 9:37 am

    I agree that the subtract method should be offered by the GeoLoc instance, rather than a static method on the GeoLoc class since that’s what Java does so it must be right. It also reads in a more fluent style.

    Making Latitude and Longitude types is also a good idea. Java isn’t used statically enough imho. If anyone thinks doing that is a waste of time, then I would argue they haven’t written enough code talking to other people’s APIs that have lost their javadoc or been obfuscated in some way.
    Primitives Suck

  9. June 9, 2009 10:25 am

    Hi Pete,

    You’re right about offsetFrom(), and I agree that Lat and Lon should be classes, but I don’t think that they should derive from Double; I think that ties you too closely to the implementation of Double. There’s more detail in Double Trouble, but the bottom line is that I don’t think that “Lat is-a Double” is right (but “Lat has-a Double” might well be a first implementation).
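
    A first "has-a" implementation might look roughly like this. All three classes are hypothetical sketches, and the accessor names and range checks are assumptions. (As it happens, java.lang.Double is declared final, so deriving from it isn't possible in Java anyway.)

    ```java
    // Hypothetical wrapper types: each one has-a double rather than is-a Double.
    final class Longitude {
        private final double degrees;

        Longitude(double degrees) {
            if (!(degrees >= -180.0 && degrees <= 180.0)) {
                throw new IllegalArgumentException("longitude out of range: " + degrees);
            }
            this.degrees = degrees;
        }

        double degrees() { return degrees; }
    }

    final class Latitude {
        private final double degrees;

        Latitude(double degrees) {
            if (!(degrees >= -90.0 && degrees <= 90.0)) {
                throw new IllegalArgumentException("latitude out of range: " + degrees);
            }
            this.degrees = degrees;
        }

        double degrees() { return degrees; }
    }

    final class GeoLoc {
        private final Longitude longitude;
        private final Latitude latitude;

        // Passing the arguments the wrong way round is now a compile-time error.
        GeoLoc(Longitude longitude, Latitude latitude) {
            this.longitude = longitude;
            this.latitude = latitude;
        }

        Longitude longitude() { return longitude; }

        Latitude latitude() { return latitude; }
    }
    ```

    With these types, new GeoLoc(new Latitude(27), new Longitude(53)) is rejected by the compiler, which is exactly the east/west versus north/south mix-up we want to rule out.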

    Again, it’s one of those examples where there are several overlapping issues with the code, and I’ve tried to just write about one of them (without pulling my finger out and writing all the accompanying articles). I’ve a draft in google docs entitled “Primitives Are Code Smell”, which obviously links in here. Although maybe Matt’s written that post for me…
