Monday, June 22, 2009

Avoid having your Scala code turning into APL

With the rise of languages that support operator overloading, like Ruby, Groovy, Scala and C#, one is justified in wondering whether libraries will become loaded with unreadable, APL-like code.
I have already seen signs of the potential chaos that could come from the abuse of operator overloading. In the Scala actor library tutorial for example I have seen the following:

producer ! Next

I was initially unable to guess the meaning of this.

The following example is from a Ruby library:

(aobject / 'a string')

In this case, because I watched the full presentation, I know that the slash is actually an alias for a search method.
The speaker in the presentation was calling this cool. I call it stupid. It is the archetypal example of a bad use of operator overloading. The presenter himself said he was puzzled and could not understand what the code was doing at first. Definitely not cool.
Since I'm looking at switching to Scala as my main language, I thought I needed to think about what kind of rule I would put into our code convention document under the section Operator Overloading. I thought I would share this with others to get input and, hopefully, constructive comments.

For me, the best applications of operator overloading make the code easier to understand.
Here are some examples of this:

Mathematical operations (+, -, *, /)

This is the best application for operator overloading and as long as you don't
start doing stupid things like using operators in a way that conflicts with established
conventions you should be OK.
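
As a rough illustration of this category in Scala, here is a minimal sketch of a two-dimensional vector. The type Vec2 and its fields are hypothetical names made up for this example; the point is only that each operator keeps its established mathematical meaning.

```scala
// Hypothetical 2D vector type; the name Vec2 and its fields are
// assumptions made for illustration only.
final case class Vec2(x: Double, y: Double) {
  // Component-wise addition, as in ordinary vector arithmetic.
  def +(other: Vec2): Vec2 = Vec2(x + other.x, y + other.y)

  // Component-wise subtraction.
  def -(other: Vec2): Vec2 = Vec2(x - other.x, y - other.y)

  // Scaling by a scalar; `*` still means "multiply".
  def *(factor: Double): Vec2 = Vec2(x * factor, y * factor)
}
```

With these definitions, an expression such as `(a + b) * 2` reads the way it would on paper, which is what makes this kind of overloading safe.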

Logical operations (&, |, ||, ..)

These should be used on boolean values. One acceptable extension of this is to use them in cases where we have an implicit conversion to boolean. They don't really look like their textbook equivalents, but they have been in use long enough, in enough languages, to be used safely.

Comparisons (>, <, ..)

Use those on any set of ordered elements. The meaning should be obvious. For example:
myWeight > aWeight
Beware of things like:
myDog > otherDog
where we don't exactly know how things are being compared.
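
One possible way to express this distinction in Scala is sketched below. Weight and Dog are hypothetical types invented for the example. Weight has a single natural ordering, so it mixes in the standard Ordered trait and gets the comparison operators from it. Dog has no obvious ordering, so the sketch uses a named method that says what is being compared.

```scala
// Hypothetical type with one natural ordering: comparing by kilograms.
final case class Weight(kilograms: Double) extends Ordered[Weight] {
  // Ordered derives <, >, <= and >= from this single method.
  def compare(that: Weight): Int =
    java.lang.Double.compare(kilograms, that.kilograms)
}

// Hypothetical type with no obvious ordering, so no `>` is defined on it.
final case class Dog(name: String, weight: Weight) {
  // The method name states the comparison criterion explicitly.
  def isHeavierThan(other: Dog): Boolean = weight > other.weight
}
```

Here `myWeight > aWeight` compiles and means what it appears to mean, while `myDog > otherDog` does not compile at all; the reader has to write `myDog isHeavierThan otherDog` instead.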

Operations that are metaphorically related to a mathematical or logical equivalent

Using the + when you want to add a string to another for example.

Operators used as part of method names

Of course this is not operator overloading, but since it is another use of operators that can lead to abuse and unreadable code, I include it here. In those languages that allow this, using ? as a suffix for queries, for example, is OK. The meaning is clear and makes the query stand out. In this category I think that % and $ could be used if allowed by the language. The only other case that I can think of is the exclamation point as a warning that a method call might have side effects. This last one is not as obvious and is at the limit of what is acceptable for me.

Operators that are already used in the core libraries of a language

If those operators have been around long enough and it is too late to remove them from the standard library then we have no choice but to use them.

All other uses of operator overloading are suspicious. The worst offenders, of course, are operators used as meaningless abbreviations for method names.
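
To make the point about abbreviations concrete, here is a minimal sketch. Mailbox is a hypothetical stand-in written for this example and is not the Scala actor library's API. It offers a descriptive method name and, if a symbol is wanted at all, a symbolic alias that only delegates to it.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical stand-in for something that receives messages.
// This is NOT the scala.actors API, only an illustration.
final class Mailbox {
  private val received = ListBuffer.empty[Any]

  // The descriptive name carries the meaning.
  def sendMessage(message: Any): Unit = {
    received += message
    ()
  }

  // Symbolic alias that simply delegates to the named method.
  def !(message: Any): Unit = sendMessage(message)

  // Messages received so far, in arrival order.
  def messages: List[Any] = received.toList
}

// Hypothetical message, named after the one in the tutorial snippet.
case object Next
```

In Scala an operator is just a method name, so `producer ! Next` is parsed as `producer.!(Next)`. With the sketch above, a reader who does not know the symbol can write or search for `producer sendMessage Next` and get the same behavior.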

In some cases you will have to watch for compiler quirks and language peculiarities. With C#, for example, when you define the ++ operator on a class, it has the same semantics whether it is used as a prefix (pre-increment) or as a suffix (post-increment). In both cases it works like a pre-increment operation. This is a bug factory.
In this case I think the compiler should give an error when ++ is used as a post-increment operation, because it will not have the expected result. You get the same thing with the -- operator.


  1. Yeah, I agree with your general reservations about going crazy with operator "overloading" in Scala, but I'm slightly less conservative. I think your suggested use cases are the 95% sane ones, and all else should be thought about extremely carefully.

    ! is used as the "message send" operator in Erlang, so I believe it's a case of reusing an existing notation. Though perhaps "producer sendMessage next" would be better?

    The really interesting example for me, however, is the Scala library for parser combinators. This is a zoo of symbols like ^^ ~ ~> <~ ? + etc that you can use to write grammars. The code does end up looking complex and hard to read, but I actually think it would be even harder to understand without operator overloading. That is, it's a trade-off between requiring someone to learn the symbols for a DSL, after which the code is fairly readable, versus code for which anyone can understand the syntax, but it's more difficult to work out what's actually going on in the grammar.

  2. Matt, I agree that there might be other special cases. I was tempted initially to have an "Other" category. However, it was not obvious how to define rules for this category, so I abandoned the idea. I agree that in the case of combinator parsing the choice of using symbols was probably the right one. This is a case where learning a special notation is probably the best approach. It is not obvious, though, how to define a general rule for this in a code convention document.
    However, in the case of using ! as an abbreviation for send message, I think they should have opted for something like sendMessage. Erlang is not a mainstream language, and many people will get their first exposure to Actors in the context of learning Scala. Using a method name would have made things more readable.
    Thanks for the comments.