True Scala complexity

Update 1: See also the discussion over at Hacker News.

Update 2: Sorry for the downtime. Leave it to the distributed systems guy to make his blog unavailable. Nginx saves the day.

It’s always frustrating reading rants about Scala because they never articulate the actual complexities in the core language.

Understandable—this post is intended fill that gap, and it wasn’t exactly easy to put together. But there’s been so much resistance to the very thought that the complexity exists at all, even from on up high, that I thought it would be constructive to provide a clearer illustration of why it’s real and how it manifests itself.

So, here goes yet another Scala complexity rant, from someone who labels himself as a Scala advocate. And since this tends to be an insanely touchy subject for some lonely people, please read every sentence as implicitly beginning with “In my opinion…” before you fire off those death threats.

I’ve been hacking in Scala for a while now, since 2006. This was a time when exceptions from the compiler were routine, the standard library was in poor shape, the website was clearly designed by busy academic PL researchers, and the community felt quite a bit lonelier. (I was a much bigger masochist back then.)

Before that I had come from Haskell, the Lisps, etc., so I wasn’t a stranger to functional programming or category theory or what have you. Yet of all the languages I’ve picked up, it’s Scala’s growth that has been the most fascinating to spectate. I watched as this little research language—which I’ve never had any illusions of seeing “go mainstream”—slowly clawed its way out of obscurity, facing seemingly unusual amounts of bashing at every turn, all while believers cried “FUD!” and Martin patiently pushed Scala forward.

I often see the following issues raised in those Scala bashings, and this list includes Typesafe’s top priorities. My two cents are thrown in:

  • Performance: Yes, I’ve been bitten by for loops and whatnot. Fine, I can’t have my cake and eat it too. For those to-the-metal hotspots, I have to drop down to ~Java-level verbosity. I can live with that. Perhaps soon, compiler and VM optimizations will let us eat our cake as well.
  • IDE support: There’s no shortage of glitches, but it’s shaping up at a steady pace and is already plenty helpful.
  • Learning curve: This can be steep depending on where you’re coming from, but the resources and community support are all fairly solid and improving. That said, this is the closest issue to the subject of this post.
  • Build times
  • Binary compatibility: As of 2.9, sanity and predictability has been introduced into the versioning scheme.
  • Frameworks
  • Community
  • SBT: “SBT is confusing as shit.” –Questioner at the recent Scala SF meetup (and plenty others, in different words). Amen, in so many ways. Thank heavens Typesafe has acknowledged this and is working on it.
  • Immature library ecosystem: Unsurprising for a relatively new language, but you do have the entire universe of Java libraries available.
  • WTF is /:? Why do I need to consult a periodic table to make HTTP requests? Those are unfortunate names/designs. Fortunately, it seems library authors, including the Scala team, are walking away with a healthy lesson learned, so we’ll probably see more accessible names in libraries in general.

Important issues to be sure, but these aren’t the ones keeping me up. They’ll take time and perspiration to fix, but hopefully with enough of both they’ll come.

What really bothers me is in the core language itself, something much harder to fix. The thing that Typesafe’s agenda does not touch on is the complexity of the language. If anything, Typesafe’s goals of driving Scala adoption, maintaining compatibility, etc. will likely slow down language evolution, and the less likely we are to see any fundamental course-corrections as a result.

What complexities am I talking about, exactly? This is one of those things that needs a “show, don’t tell” treatment. Complexity is a tricky thing to articulate, so no matter how many different ways I cut it, it’s far more effective to begin by just rubbing your face in it a bit.

So let’s take a little stroll through Scala programming land. And while we’re there, keep in mind that the problems encountered aren’t contrived cases I cooked up to make this argument; these are all (simplified versions of) situations we actually encounter while building real products.

Operating On Collections

(I’ve tried to make this as accessible as I could to non-Scala developers who perhaps have enough experience across languages to guess their way through the syntax, but you’re probably only going to really appreciate the frustrations if you have a passing familiarity with Scala basics. If you are a Scala developer, you should try to solve the problems we run into on your own first!)

Let’s make a super-simplified model of some basic collections we have in Scala, ignoring generics as a first step. We’ve got the primitive Array type from Java, and we have our nice Seq type from Scala. head is one of the many Scala collection methods, in Seq but not in Array. We’d like to pimp enrich Arrays to Seqs so that we can call methods like head on Arrays.

scala> class Array
defined class Array

scala> class Seq { def head = 0 } // dummy implementation
defined class Seq

scala> class WrappedArray(xs: Array) extends Seq // a Seq adapter for Arrays
defined class WrappedArray

This next function is an implicit conversion, so called because it’s automagically invoked whenever we have an Array and treat it like a WrappedArray (and thus a Seq). This auto-adaptation is called “enrichment.”

scala> implicit def array2seq(xs: Array): Seq = new WrappedArray(xs)
array2seq: (xs: Array)WrappedArray

Now we can do stuff like:

scala> new Array().head
res0: Int = 0

That’s basic stuff. Now we want to extend Seq with our own methods, such as ident (a dummy that’s just the identity function). We implicitly convert from Seq to an anonymous class containing our extension method:

scala> implicit def seq2richseq(xs: Seq) = new { def ident = xs }
seq2richseq: (xs: Seq)java.lang.Object{def ident: Seq}

So we can now do this:

scala> new Seq().ident
res1: Seq = Seq@de81d48

The following doesn’t work, since Array is not a subtype of Seq:

scala> new Array().ident
<console>:13: error: value ident is not a member of Array
              new Array().ident

The problem is that views don’t chain. Instead, use view bounds. In this [A <% Seq] notation, we accept any type A that can be implicitly converted to a Seq.

scala> implicit def seq2richseq[A <% Seq](xs: A) = new { def ident = xs }
seq2richseq: [A](xs: A)(implicit evidence$1: A => Seq)java.lang.Object{def ident: A}

scala> new Array().ident
res3: Array = Array@3e55a58f

But the collections hierarchy is, of course, generic.

scala> class Array[A]
defined class Array

scala> class Seq[A] { def head = 0 }
defined class Seq

scala> class WrappedArray[A](xs: Array[A]) extends Seq[A]
defined class WrappedArray

scala> implicit def array2seq[A](xs: Array[A]) = new WrappedArray[A](xs)
array2seq: [A](xs: Array[A])WrappedArray[A]

Again, we try to define our helper using view bounds as before:

scala> implicit def seq2richseq[A, I <% Seq[A]](xs: I) = new { def ident = xs }
seq2richseq: [A, I](xs: I)(implicit evidence$1: I => Seq[A])java.lang.Object{def ident: I}

Try it out:

scala> new Seq[Int]().head
res1: Int = 0

scala> new Array[Int]().head
res2: Int = 0

scala> new Seq[Int]().ident
<console>:13: error: No implicit view available from Seq[Int] => Seq[A].
              new Seq[Int]().ident

Nope. To move forward, we can use higher-kinded types (Function[I[_], ...]), and instead of view bounds (which would give us the unhelpful error: type I takes type parameters), we have to take the converter in as an “explicit” implicit parameter in a curried argument list such that the type inference can flow from the first argument list to the type parameter list to the second argument list:

scala> implicit def seq2richseq[A, I[_]](xs: I[A])(implicit f: I[A] => Seq[A]) = new { def ident = xs }
seq2richseq: [A, I[_]](xs: I[A])(implicit f: I[A] => Seq[A])java.lang.Object{def ident: I[A]}

scala> new Seq[Int]().ident
res3: Seq[Int] = Seq@417d26fc

scala> new Array[Int]().ident
res4: Array[Int] = Array@280bca

This fails as soon as we have a different shape, such as a binary type constructor:

scala> class DbResultSet[A, DbType]
defined class DbResultSet

scala> implicit def db2seq[A](db: DbResultSet[A,_]) = new Seq[A]
db2seq: [A](db: DbResultSet[A, _])Seq[A]

scala> new DbResultSet[Int, Nothing]().ident
<console>:17: error: value ident is not a member of DbResultSet[Int,Nothing]
              new DbResultSet[Int, Nothing]().ident

We need to introduce multiple overloads for every possible shape:

scala> implicit def seq2richseq2[A, I[_,_]](xs: I[A,_])(implicit f: I[A,_] => Seq[A]) = new { def ident = xs }
seq2richseq2: [A, I[_,_]](xs: I[A, _])(implicit f: I[A, _] => Seq[A])java.lang.Object{def ident: I[A, _]}

scala> new DbResultSet[Int, Nothing]().ident
res8: DbResultSet[Int, _] = DbResultSet@770fba26

And for each shape, every possible parameter position:

scala> class DbResultSet[DbType, A]
defined class DbResultSet

scala> implicit def db2seq[A](db: DbResultSet[_,A]) = new Seq[A]
db2seq: [A](db: DbResultSet[_, A])Seq[A]

scala> new DbResultSet[Nothing, Int]().ident
<console>:21: error: value ident is not a member of DbResultSet[Nothing,Int]
              new DbResultSet[Nothing, Int]().ident

scala> implicit def seq2richseq22[A, I[_,_]](xs: I[_,A])(implicit f: I[_,A] => Seq[A]) = new { def ident = xs }
seq2richseq22: [A, I[_,_]](xs: I[_, A])(implicit f: I[_, A] => Seq[A])java.lang.Object{def ident: I[_, A]}

scala> new DbResultSet[Nothing, Int]().ident
res10: DbResultSet[_, Int] = DbResultSet@51274873

Alternatively, you can go through the trouble of first establishing a unary type constructor for every concrete type that you want to use this implicit on:

scala> class DbResultSet[DbType, PoolType, A]
defined class DbResultSet

scala> class MakeDbResultSet[A] extends DbResultSet[Nothing, Nothing, A]
defined class MakeDbResultSet

scala> implicit def db2seq[A](db: DbResultSet[_,_,A]) = new Seq[A]
db2seq: [A](db: DbResultSet[_, _, A])Seq[A]

scala> new MakeDbResultSet[Int]().ident
res11: MakeDbResultSet[Int] = MakeDbResultSet@5e0e4436

ident was just a placeholder, of course—otherwise we wouldn’t need to constrain it to be a Seq. We actually want a real method here—say, runs, which returns the number of runs of contiguous groups of elements, where groups are defined by the given predicate for adjacent elements. E.g., Seq(1,1,1,2,3,3) runs (_==_) should return 3. Here, for simplicity, we’ll just always return 0.

scala> implicit def seq2richseq[A, I[_]](xs: I[A])(implicit f: I[A] => Seq[A]) = new {
     |   def runs(f: (A,A) => Boolean) = 0
     | }
seq2richseq: [A, I[_]](xs: I[A])(implicit f: I[A] => Seq[A])java.lang.Object{def runs(f: (A, A) => Boolean): Int}

scala> new Seq[Int]().runs(_ == _)
res12: Int = 0

scala> new Array[Int]().runs(_ == _)
res13: Int = 0

Now add isMajority(x), returning true if x is the majority element: in the Seq and false otherwise. (We’ll just always return false for simplicity.)

scala> implicit def seq2richseq[A, I[_]](xs: I[A])(implicit f: I[A] => Seq[A]) = new {
     |   def runs(f: (A,A) => Boolean) = 0
     |   def isMajority(x: A) = false
     | }
<console>:27: error: Parameter type in structural refinement may not refer to an abstract type defined outside that refinement
         def isMajority(x: A) = false

Oops. By using/returning new { ... }, we were actually using refinement types, which won’t work because on the JVM, where we have type erasure: Scala generates a single generic implementation of the isMajority method, and it calls methods on refinement types via reflection, so it has to fill in something for the ? in seq2richseq(xs).getClass.getMethod("isMajority", Array(?)), and we don’t know what that is. Whereas for runs, we actually do know what it is—this time, because of type erasure, we know it’s just a (_,_) => _ or Function2[_,_,_]—that it’s a Function2[A,A,Boolean] doesn’t matter. Did you get that? Here’s a more long-winded explanation.

Furthermore, we’ve all along been imposing a significant performance penalty by using reflection.

The solution is simple: introduce some boilerplate by hoisting the code out into a named type.

scala> class RichSeq[A](xs: Seq[A]) {
     |   def runs(f: (A,A) => Boolean) = 0
     |   def isMajority(x: A) = false
     | }
defined class RichSeq

scala> implicit def seq2richseq[A, I[_]](xs: I[A])(implicit f: I[A] => Seq[A]) = new RichSeq(xs)
seq2richseq: [A, I[_]](xs: I[A])(implicit f: I[A] => Seq[A])RichSeq[A]


scala> new Seq[Int]().isMajority(3)
res16: Boolean = false

scala> new Array[Int]().isMajority(3)
res17: Boolean = false

Let’s now fill out the collections hierarchy with a few more types:

scala> class CharSequence extends Seq[Char]
defined class CharSequence

scala> class String extends CharSequence
defined class String

This works fine:

scala> new CharSequence().isMajority('3')
res19: Boolean = false

scala> new CharSequence().isMajority(3) // Scala implicitly converts ints to chars
res20: Boolean = false

But this doesn’t!

scala> new String().isMajority('3')
<console>:33: error: value isMajority is not a member of String
              new String().isMajority('3')

What’s the issue?

Turns out that Scala will search up to one level up the type hierarchy for a matching shape for the implicit.

Let’s try explicitly allowing subtypes:

scala> implicit def subseq2richseq[A, I <: Seq[A]](xs: I) = new RichSeq(xs)
subseq2richseq: [A, I <: Seq[A]](xs: I)RichSeq[A]

scala> new CharSequence().isMajority('3')
res22: Boolean = false

scala> new String().isMajority('3')
<console>:32: error: value isMajority is not a member of String
              new String().isMajority('3')

No luck! Can you figure this out?

Generalized type constraints to the rescue! We can use the nearly undocumented (and impossible to Google for) <:< as an implicit evidence parameter:

scala> implicit def subseq2richseq[A, I](xs: I)(implicit evidence: I <:< Seq[A]) = new RichSeq(xs)
subseq2richseq: [A, I](xs: I)(implicit evidence: <:<[I,Seq[A]])RichSeq[A]

scala> new CharSequence().isMajority('3')
res24: Boolean = false

scala> new CharSequence().isMajority(3)
res25: Boolean = false

scala> new String().isMajority('3')
res26: Boolean = false

scala> new String().isMajority(3)
<console>:36: error: Cannot prove that String <:< Seq[Int].
              new String().isMajority(3)

The fourth case still fails! But let’s say we’re willing to live with this. Moving on.

Reality is more complex, since CharSequence and String are not subtypes of Seq[Char]. They come with implicit conversions to Seq[Char] types, though:

scala> class String
defined class String

scala> implicit def str2seq(xs: String) = new Seq[Char]
str2seq: (xs: String)Seq[Char]

Armed with this, along with our original view-bound implicit conversion, will enrichment work on Strings?

scala> new String().isMajority('3')
<console>:36: error: Cannot prove that String <:< Seq[Char].
              new String().isMajority('3')

Not our day. Scala’s giving errors about the wrong implicit conversion.

Yet everything’s dandy if we explicitly help Scala make it over the first conversion:

scala> str2seq(new String()).isMajority('3')
res30: Boolean = false

What’s the problem here? String is neither a subtype of Seq, nor is it a unary type constructor. Back to the drawing board. We can add a different implicit from proper (first-order) types:

scala> implicit def seq2richseq0[A, I](xs: I)(implicit f: I => Seq[A]) = new RichSeq(xs)
seq2richseq0: [A, I](xs: I)(implicit f: I => Seq[A])RichSeq[A]

scala> new String().isMajority('3')
<console>:37: error: type mismatch;
 found   : String
 required: ?{val isMajority(x$1: ?>: Char(\'3\') <: Any): ?}
Note that implicit conversions are not applicable because they are ambiguous:
 both method subseq2richseq in object $iw of type [A, I](xs: I)(implicit evidence: <:<[I,Seq[A]])RichSeq[A]
 and method seq2richseq0 in object $iw of type [A, I](xs: I)(implicit f: I => Seq[A])RichSeq[A]
 are possible conversion functions from String to ?{val isMajority(x$1: ?>: Char(\'3\') <: Any): ?}
              new String().isMajority('3')

Dah, subseq2richseq is now further hindering us. For now let’s remove it, knowing that its removal prevents us from converting subtypes of Seq of arbitrary shape. Resetting our REPL to re-define everything except for subseq2richseq, we can make things work:

scala> implicit def seq2richseq0[A, I](xs: I)(implicit f: I => Seq[A]) = new RichSeq(xs)
seq2richseq0: [A, I](xs: I)(implicit f: I => Seq[A])RichSeq[A]

scala> new String().isMajority('3')
res1: Boolean = false

Alas, we broke this:

scala> new String().runs(_ == _)
<console>:17: error: No implicit view available from String => Seq[A].
              new String().runs(_ == _)

What was happening earlier was that we were giving away the type of A based on the rest of the expression after the implicit conversion—isMajority(Char) constrains A to be a Char. runs has no such luxury. So while the inferencer is unable to peek into the second argument list or the type bounds, it does peer beyond the entire implicit conversion at the subsequently invoked method. Anyway, you’ll just have to explicitly convert.

Now things get interesting.

So far we’ve been working in our tiny parallel universe, working on a cartoon of the collections library. We want to bring things into real Scala now, rather than reinvent the entire collections library. Let’s say we want to add a filterMap method: given a bunch of A and a function A => Option[B], return a collection of B.

To get a taste of how a proper collection method behaves, consider map, which is similar to filterMap. It’s defined in the base trait TraversableLike and has the signature:

trait TraversableLike[+A, +Repr] extends ... {
  def map[B, That](f: A => B)(implicit bf: generic.CanBuildFrom[Repr,B,That]): That

Watch this in action:

scala> List(1,2,3) map (_+1)
res0: List[Int] = List(2, 3, 4)

scala> (Vector(1,2,3): Iterable[Int]) map (_+1) // get the right dynamic type too
res1: scala.collection.immutable.Iterable[Int] = Vector(2, 3, 4)

scala> BitSet(1,2,3) map (_+1)
res2: scala.collection.immutable.BitSet = BitSet(2, 3, 4)

scala> BitSet(1,2,3) map (_.toFloat)
res3: scala.collection.immutable.Set[Float] = Set(1.0, 2.0, 3.0)

scala> Array(1,2,3) map (_+1)
res4: Array[Int] = Array(2, 3, 4)

scala> "abc" map (_.toUpper)
res5: String = ABC

We always return the most specific type we can. This works because of the functional dependencies in collections, CanBuildFrom[From, Elem, To]. For those new to Scala, you can learn more about how collections work from the collections architecture docs or from the innumerable Stack Overflow questions on the topic, but in short: CanBuildFrom[From, Elem, To] are these global helper objects that are passed as implicit parameters into collections method such as map and filter to allow them to construct a collection of the same (or most similar) type.

Let’s try to enrich collections with filterMap.

Which of the following is (are) the right signature(s) for the implicit conversion and for the filterMap method, if any? If you’re looking for easy eliminations, sorry: I have seen the equivalents to nearly all of the following lines on Stack Overflow, the mailing list, IRC/pastebins, or IRL. Many of these can actually get you somewhere, while not fully solving the problem as we defined it; hence, unfortunately, we see CanBuildFrom misused in many cases. So if you’re confused, you’re most certainly not alone.

implicit def conv[A, C <: GenTraversableOnce[A]](xs: C): Rich
implicit def conv[A, C <% GenTraversableLike[A,C]](xs: C): Rich
implicit def conv[A, C](xs: C)(implicit ev: C <:< GenTraversableLike[A,C]): Rich
implicit def conv[A, C](xs: C)(implicit ev: C <%< GenTraversableOnce[A]): Rich
implicit def conv[A, C](xs: C)(implicit ev: C => GenTraversableOnce[A]): Rich
implicit def conv[A, C[A] <: GenTraversableLike[A,C[A]]](xs: C): Rich
implicit def conv[A, C[A] <% GenTraversableLike[A,C[A]]](xs: C): Rich
implicit def conv[A, C[A]](xs: C)(xs: C[A] => GenTraversableOnce[A]): Rich
implicit def conv[A, C[_]](xs: C)(xs: C[A] => GenTraversableLike[A,C[A]]): Rich
implicit def conv[A, C](xs: C with GenTraversableLike[A,C]): Rich
implicit def conv[A, C](xs: C with GenTraversableOnce[A])(implicit ev: C => GenTraversableOnce[A]): Rich

In the following, ? is whatever is used above, either C or C[A].

def filterMap[B,D](f: A => Option[B])(implicit b: CanBuildFrom[?,B,D]): D
def filterMap[B,D <: GenTraversableOnce[B]](f: A => Option[B])(implicit b: CanBuildFrom[?,B,D]): D
def filterMap[B,D <% GenTraversableOnce[B]](f: A => Option[B])(implicit b: CanBuildFrom[?,B,D]): D
def filterMap[B,D[B]](f: A => Option[B])(implicit b: CanBuildFrom[?,B,D[B]]): D[B]
def filterMap[B,D[B] <: GenTraversableOnce[B]](f: A => Option[B])(implicit b: CanBuildFrom[?,B,D[B]]): D[B]
def filterMap[B,D[B] <% GenTraversableOnce[B]](f: A => Option[B])(implicit b: CanBuildFrom[?,B,D[B]]): D[B]
def filterMap[B,D[_]](f: A => Option[B])(implicit b: CanBuildFrom[?,B,D[B]]): D[B]
def filterMap[B,D[_]](f: A => Option[B])(implicit b: CanBuildFrom[?,B,D[B]], ev: D[B] <:< GenTraversableOnce[B]): D[B]
def filterMap[B,D[_]](f: A => Option[B])(implicit b: CanBuildFrom[?,B,D[B]], ev: D[B] => GenTraversableOnce[B]): D[B]

The only way for you to have any hope of navigating these waters is to really understand how Scala’s type solver and implicit resolution algorithms work. And not just a little bit. You have to know these well enough to know what their limitations are. Otherwise, you’re going to be tossing things at the wall and seeing what sticks first, which can be infuriating and still wrong.

This might be somewhat alleviated if it weren’t for the fact that Scala’s type inference is intentionally not spec’d, or at least woefully underdocumented save what little you can glean from section 6.26.4 of the language spec. You can turn to all the books or web resources in search of definitive sources of information, but you won’t find it; the best you can do is to scavenge nuggets of insight here and there on Stack Overflow or the mailing lists and try to piece things together into a more coherent picture.

The answer to our original question? It turns out none of these are correct. In fact, it is impossible to insert a new method that behaves like a normal collection method. This, despite the heavy advertising of enrich my library. (Update: after others took various stabs at this problem, Miles Sabin got a solution that I believe comes the closest; it still requires per-source-type/shape boilerplate, but is able to factor out the filterMap definition. Why does the per-source-type/shape boilerplate make a difference? Because whenever you introduce a new type that can be enriched into a Seq or introduce a new enrichment for Seqs (say, across different libraries), you end up having to manually augment every such possible pair with the proper boilerplate. Anyway, this isn’t the point of this post, but I thought it was worth highlighting that snippet for the Scala users reading this.)

OK, what have we accomplished? We’ve kinda-sorta added a few extension methods to some collections types. We’ve also managed to stroll right through level L3, officially the most advanced tier of Scala language mastery. We’ve been wrangling implicit parameters, type bounds, view bounds, implicit conversions, higher-kinded types, functional dependencies, refinement types, and generalized type constraints, to name a few concepts. This is all just a taste; there are plenty more concepts—and more importantly, interactions among concepts—in Scala for us to bump into. Yet at the end of the day, our method still fails to behave like a normal collection method. But that isn’t the point.

How does this play out in the real world? Most people probably simply don’t bother with much of the above, just going with something like:

def filterMap[A,B](xs: Seq[A], f: A => Option[B]) =
  for (x <- xs map f; y <- x) yield y

filterMap(List(1,2,3), (Some(_:Int))).asInstanceOf[List[Int]]
filterMap("hello", (Some(_:Char))).mkString

and being done with it. I’m betting this is closer to how Scala is predominantly used today, beyond the community of type-level metaprogramming junkies: most developers manage to avoid these questions altogether, giving in to clumsy code, opting for simplicity and getting things done, while remaining fully aware of the fact that their code is not as Scala-ish or as generic or as implicit or as type-safe as they ideally would like, all due to the very real complexity barrier.


Pop quiz time! Here’s a little sampler of “real-life” Scala encounters that are, unfortunately, brain teasers.

One of the following lines of code fails to compile with the following build errors:

error: missing parameter type for expanded function ((x$1) => x$1.unary_$minus)
error: diverging implicit expansion for type scala.math.Ordering[B] starting with method Tuple9 in object Ordering

Which one? Why?

Set(1,2,3).toIndexedSeq sortBy (-_)
Set(1,2,3).toSeq sortBy (-_)
val xs = Set(1,2,3).toIndexedSeq; xs sortBy (-_)
val xs = Set(1,2,3).toSeq; xs sortBy (-_)

Which one of the following lines doesn’t work? Why?

def add(x: Int, y: Int) = x + y
val f = add(1,_)
val g = add(1,_:Int)
val h = add(_,_)
val i = add(_:Int,_:Int)

Which two lines of code below do not compile? Why?

def sample[A](pos: => A, neg: => A) = if (util.Random.nextBoolean) pos else neg
sample(1, 42)
sample("one", 42)

def sample[A](pos: => A)(neg: => A) = if (util.Random.nextBoolean) pos else neg

def sample[A](pos: => A) = new {
  def apply[B >: A](neg: => B) =
    if (util.Random.nextBoolean) pos else neg

def sample[A, B >: A](pos: => A)(neg: => B) = if (util.Random.nextBoolean) pos else neg

Which of the following doesn’t work? Why?

def getSize(xs: Set[Any]) = xs.size
val xs = Set(1,2,3); getSize(xs)

def getLen(xs: Seq[Any]) = xs.length
val xs = Seq(1,2,3); getLen(xs)
val xs = Array(1,2,3); getLen(xs)

def getLength(xs: Array[Any]) = xs.length
val xs = Array(1,2,3); getLength(xs)

Why are types explicitly required in the following?

abstract class Widget
class Button extends Widget

abstract class Container[A <: Widget]
class ButtonContainer extends Container[Button]

class Sink[A <: Container[B], B <: Widget](x: A)

// this doesn't work:
new Sink(new ButtonContainer)
// have to do it like this:
new Sink[ButtonContainer, Button](new ButtonContainer)

On Acknowledging Problems

This isn’t even venturing into the more “fringe” areas of Scala; there are many darker corners of features like specialized types (see specialized types vs. subtyping), path-dependent types (see Bakery of Doom for cake-patterned dependency injection), Java wildcards-Scala existentials interop, extractor object and partial function asymptotic complexity, delimited continuations vs. type inference, traits vs. eager vals Shamanism, and much, much more. The ones I highlighted are just a sample of the everyday encounters if you’re neck-deep in Scala. And the thing to realize is that everything else is built upon—and more importantly, expressed using—these very foundations that comprise the core language.

Martin acknowledges many of the problems with Scala, and those are the ones Typesafe is actively addressing. But the one word he doesn’t seem to like using to describe Scala is “complex.” On the contrary, one might take him to sound downright dismissive:

Scala is used in a large and growing number of enterprises, some of them with more than 100 devs working with it. These people just get on with the job (and love it for the most part); they don’t find Scala’s “complexity” too daunting.

Martin’s right. Plenty of devs—my team included—are using Scala and getting on with it. We do like many things about it. We certainly don’t find it too daunting to be productive with. But that’s not to say it’s without complexity or frustration. There are even larger teams being productive with C++ as well, doing amazing things with it on which Scala only wishes it were brought to bear…but enough of the world has already made my point for me on C++.

There also appears to be a variety of other ways of deflecting this criticism:

  • Chalking the complexity up to the excessive use of symbolic operators or type-level programming in certain community libraries.
  • Establishing multiple tiers of Scala usage, under the pretense that one can operate in isolation within any particular tier without being exposed to concepts from the upper tiers.
  • Reminding us that Scala at its essence consists of just a few powerful orthogonal features; in this view it’s simpler than more ad-hoc designs such as Java’s. Typical response when confronted with the complexity question:

    By a lot of objective measures, Scala does not have a lot of features. For instance, it has fewer key words than many other languages including C#, C++, and even Java. It has its grammar size, so if you look at the size of the context free grammar, that also is about the same size as Java 5, slightly smaller and certainly much, much smaller than a grammar like C#, C++, or things like that.

  • Assuming that what developers object to is foremost the non-trivial set of foreign concepts:

    It’s true that there’s a lot in the language, in particular a lot of very fundamental concepts, which take a lot of time for some people to learn and to absorb, things like traits, something that is new in Scala, pattern matching, the whole functional aspects that you have of closures and functions as parameters is new for many people and so on.

To be fair, the criticism isn’t exactly focused; I’ve heard complaints as far-ranging as, “Oh my god the constructor arguments are next to the class name. Shit is way too complex.” And our BDFL is right on many counts; he doesn’t lie when he asserts that at its core Scala is a language with a smaller number of orthogonal features than found in many other languages. As presented in the books and tutorials on Scala, everything seems manageable. However, the problem is that each feature has substantial depth, intersecting in numerous ways that are riddled with nuances, limitations, exceptions and rules to remember. It doesn’t take long to bump into these edges, of which there are many. These are not issues with Scalaz, and these are not things you can shield yourself from by constraining yourself to the L1/A1 expertise tier. I, along with every other developer I watched pick up Scala, became intimately aware of the issues right off the bat. There’s a reason why a whole cottage industry of “better Java” languages has emerged—languages that deliberately forego the sophisticated levels of expressiveness captured by Scala.

I also don’t buy that the individual concepts in Scala are what’s stumping smart programmers. Traits, pattern matching, closures, higher kinds, typeclasses, functors, monads, etc. might induce a paradigm shift and require some digestion time if you’ve only ever learned Java, but these aren’t inherently over-complex. I’m saying Scala—that particular recipe for mixing all these concepts—is complex. Haskell, for instance, has a rich ML-style type system and shares a great deal of those concepts in common with Scala, but once you’re in it, Haskell is significantly simpler to use, whereas Scala has many more concerns to juggle: object orientation/subtyping, implicits, Java interop, and much more.

And what do I mean by “complex,” anyway? Complexity is just a measure; isn’t a threshold always relative? Look: at the end of the day, features like an expressive type system are there to make us more productive, whether it’s aiding us in the construction of correct programs, or allowing us to concisely express our abstractions, or keeping our programs maintainable over time and across developers. But we have only so much patience for dicking around with types, since the returns diminish beyond a certain point. So, by Scala complexity I mean the degree to which the features in Scala are intricately intertwined, laden with surprising rules and exceptions. And when I say Scala is complex, that’s short for saying that I believe idiomatic Scala approaches “too much complexity to bother with” status for the majority of (competent and capable) developers out there; it has certainly already crossed that line for others, who have chosen to not use the language.

During a lunch I had with Guido van Rossum a while back, he explained to me something very important—that “language design is not just about solving puzzles.” Think what you will of Python (I can already sense all the statically typed FP die-hards turning their noses up at the mere mention), but the assertion is undoubtedly true, and it’s what makes language design hard. It’s not just about figuring out ways to fit in as much functionality as possible or finding the smallest number of primitives that maximize expressive power; language design is just as much a usability problem. The feeling I get from Scala is that such concerns take a back seat to squeezing as much as possible out of the feature set and its interactions. These two objectives can be made to align, but that’s hardly a given.

I’m not exactly sure where to go from here. Some of the problems I highlighted are tractable and will hopefully be improved in future versions of Scala, with either compiler improvements or updated language design choices. But other problems are simply hard problems, the solutions to which are unclear, at least not without giving up the very same powerful features that make Scala so malleable and attractive. For instance, one can’t simply introduce a combinatorial explosion into the search space for the type solver/implicit resolution algorithms; Scala builds take long enough already!

What is certain is that Scala complexity is quite real. While I wouldn’t doubt that the smart people comprising Typesafe are fully aware of that, they certainly don’t show it or at least any concern over it. Acknowledging problems matters: it assures the world that you understand the problems, and opens the path (and the communication) toward solutions.

Beyond Typesafe, the denial pervades the Scala community. There’s no shortage of broken analogies and shallow arguments dismissing the complexity concerns altogether or arguing that it doesn’t matter. What’s worse, something I see far too often from the community is the insistence that anyone proclaiming Scala complexity is just incompetent or spreading FUD. All this behavior does the community a disservice; it’s a sure way to alienate talent and to prevent problems from being fixed. We need to be able to open our eyes wide enough to critique the status quo in order for progress to be made.

Does all this mean I’ve given up on Scala? Hardly. No tool is perfect, and any engineer worth her salt should have a clear understanding of both the pros and the cons of the tools at her disposal. If you’ve been using a language for a while and can’t rant about it, that’s probably a bad sign.

Notice how several of the problems I pinpoint above are of the form, “How do I accomplish this expression that isn’t even possible in many other languages?” And these are mostly static build issues, which are far from the worst fates imaginable. Certainly, if you don’t care at all about clumsy code or resorting to escape hatches or “writing Java in Scala,” it’s frequently possible to bang out mostly-type-safe Scala while side-stepping battles against the compiler straitjacket. Plus, let’s not lose sight of all the things Scala brings to the table, none of which I’ve described. If there’s one thing more time-consuming than illustrating the complexities of the Scala language, it’s illustrating all the advantages, and I’ve already spent more time on this post than I’d care to admit.

Bring on the flames.

Special thanks to Daniel Spiewak, Alex Payne, and David Pollak for reviewing drafts.

Follow me on Twitter for stuff far more interesting than what I blog.