(I'm drafting this at the moment. Feedback appreciated - tom@tommorris.org )

Thank you Hacker News and Reddit.

A lot of guides to Scala are written with an intended audience of academics, computer scientists and the sort of people who really enjoy reading about programming language theory and spend their time reading Lambda the Ultimate etc., partly because Scala's development by Martin Odersky at EPFL means it's quite an interesting language on a purely intellectual level. (I don't mean to slight those people - I enjoy reading PLT stuff too.)

Scala isn't simply a research project though: it's designed as a practical language that you can use in the same way you can use Java or Ruby. Here it fits a rather unique niche: it is supposed to be useful both for programmers who usually work in dynamic or 'scripting' languages (however one defines them) like Python, Ruby or Perl and for people more familiar with languages like Java, C# and C++. It really has the benefits of both. As a language, it has a strong type system, but you don't have to make explicit type declarations - instead, the compiler or interpreter will be happy to work out that in `var foo = 9` a different type is being instantiated than in `var foo = "9"`. The language also imports a lot of the constructs of functional programming languages like Haskell into a language that's strongly OO.

But is it a language for hackers? By hackers, I don't mean kernel hackers. I mean people who build small hacks - who use whatever technology is at hand to achieve some interesting end-goal. Hackers need to be able to do the really easy things quickly. There, familiarity and speed beat correctness - worse is better because better can't be done quickly enough. BarCamps and HackDays require one to be able to turn around an idea overnight: a valuable skill in the commercial world also. Being able to rapidly prototype is a useful skill: it means you can "throw one away" quicker, you can drop ideas that are impractical and move on, and it means you can actually show something the next day and make people say 'Wow'. Odersky certainly thinks so: it's called Scala because it is claimed to be a "scalable language" - as good for building large concurrent systems as small hacks and scripts.

I think that Scala could be a perfect language for a lot of people who do lots of small-scale hacks at things like HackDays. I'm not totally sure of it, and a lot of people I speak to are sceptical. Unfortunately, most of the current Scala introductions spend their time talking about the intricacies of functional programming, and show you how to construct quicksorts, combinatorial parsing routines and other interesting but theory-driven tasks. It would be far more useful to show how one would parse an RSS feed, get data from Twitter, build a web service and so on.

Scala guys may respond: "ah, but you just use the underlying Java". To which the answer - from the PHP, Python, Ruby and Perl crowd who try hard to avoid Java - is "okay, what's that?" Scala for Hackers is intended to solve this by pointing to the least-disempowering way of doing common practical tasks in Scala. Not necessarily the most powerful. Sometimes I will favour Java libraries rather than native Scala libraries because the Scala libraries seem to be designed for people with four-digit IQs. Rather a Java library you can understand in ten minutes than a Scala library that is still puzzling after reading the source code repeatedly...

As I said, I'm not sure whether Scala will be a good language for programming in the small. Writing this document is a way for me to try and document the various things I'm doing with Scala. Hopefully Scala nerds might be able to use it to write libraries to make this stuff easier.

Theory

Obviously, you need to learn the basics of Scala before hacking with it. If you learn from dead-tree media, I suggest Odersky, Spoon and Venners' "Programming in Scala" (Artima) or Wampler and Payne's "Programming Scala" (O'Reilly). Odersky et al. is slightly more theoretical while Wampler and Payne is a bit more practical.

The other book available on Scala is David Pollak's "Beginning Scala" (Apress). This is a much gentler introduction than either Odersky et al. or Wampler and Payne.

Variables

When declaring variables in Scala, you can either declare them as var or val. vars are reassignable, vals aren't.

What does that mean in practice?

val foo = 5

foo = 6

<console>:2: error: reassignment to val

In the REPL, you can redefine the name though (in compiled code, declaring foo twice in the same scope is an error):

val foo = 5

val foo = 6

The difference between the two is important, but not directly relevant here. The Odersky book and other documentation expounds on the relevance of the difference, and the benefits you can get if you use val instead of var.

If you are coming from Java, val is basically final.

The easiest way to think about it is that var is a variable like you may be used to from PHP or Python or Ruby or wherever. val is slightly different - it's a name you are giving as a shortcut for a value. It's not really a variable. But that's okay, because you often don't actually need your variables to be variable.

If you are new to the sort of style of programming Scala lets you do, the difference between var and val may seem very strange. You may have to trust me when I say it does have a purpose, but the importance of the var/val distinction isn't immediately relevant to you getting going with Scala.
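Here is a minimal sketch of the difference. The names (ValVarSketch, countUp, buildGreeting) are made up for illustration. It also shows something that catches people out: a val fixes the name, not the contents of the object it names.

```scala
object ValVarSketch {
  // var: the name can be pointed at a new value later.
  def countUp(times: Int): Int = {
    var counter = 0
    (1 to times).foreach(_ => counter = counter + 1)
    counter
  }

  // val: the name is fixed, but the object it names may still be mutable.
  def buildGreeting(name: String): String = {
    val buffer = new StringBuilder("hello")
    buffer.append(", ").append(name) // allowed: mutates the object, not the name
    // buffer = new StringBuilder("bye") // would not compile: reassignment to val
    buffer.toString
  }
}
```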

Companion Objects

If you have used Ruby, you probably know the difference between an instance method and a class method - you define the latter by using 'def self.whatever' while the former uses just 'def whatever'. The class method is simply a method that can be used without instantiating an object. It is useful as you can use it for alternative constructors and the like. In Java, this is called a static method and you create it by prefixing the method definition with the keyword "static". In Python you use the @classmethod decorator.

In Scala, you can create static methods (and static properties) with companion objects. The companion object holds the statics while the class holds the instantiable methods/properties.

A companion object should be written in the same file (technically, the same compilation unit) as the class. Basically, put the companion object directly after the class in the file.

Like this:

class Book (val name: String)

object Book { /* whatever */ }

Inside the object, just add your static methods like you would to any other object. Unlike Java, you don't need to prefix static methods with "static".

Compile the class and companion object together.

When reading Scaladocs (like Databinder's), you need to keep the companion object distinction in mind. The reason you don't see statics in Scaladocs is that statics aren't anything special - they just live on companion objects.
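To make the Book example a bit more concrete, here is a sketch of a companion object holding a 'static' property and a couple of 'static' methods. The members defaultName, untitled and fromCitation are hypothetical, invented for this example.

```scala
class Book(val name: String)

object Book {
  // A "static" property shared by every Book.
  val defaultName: String = "Untitled"

  // A "static" method used as an alternative constructor.
  def untitled(): Book = new Book(defaultName)

  // Hypothetical helper: build a Book from a "title by author" string.
  def fromCitation(citation: String): Book =
    new Book(citation.split(" by ")(0).trim)
}
```

You call these on the object, without an instance: Book.untitled() or Book.fromCitation("Programming in Scala by Odersky").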

Introspection

If you use Ruby, you often want to see what methods exist on a class or object, so you do something like `"foo".methods`

In Scala, you can call 'getMethods' on a class, meaning you can do something like this:

"foo".getClass.getMethods.foreach(println _)

Here, the string "foo" is getting instantiated - Scala will figure out it is a string, then ask the instance to return the class. getMethods returns an array of the methods from the class. foreach loops across the array and (println _) prints 'em out.

But what if you don't want to instantiate the class to introspect on its methods?

classOf[String].getMethods.foreach(println _)

getMethods returns an array of java.lang.reflect.Method objects (as opposed to Ruby which just returns an array of strings). The Method class is useful as you can use it to see what methods return what types, or take particular types as arguments. getReturnType() returns a Class object for the method's return type.

Here is an example of reflection I used when writing the XML parsing section:

(doc\\"h1")(0).getClass.getMethods.filter(_.getReturnType() == Class.forName("java.lang.String")).foreach(println _)

This takes the first h1 element in the document 'doc', gets all the methods one can call on that class, filters them to only those which return strings and prints those out. This way, I can figure out what the 'get back a text node as a string' method is - it's just called 'text'.
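The same trick can be wrapped up as a small helper that works on any class. This is only a sketch: ReflectionSketch and stringReturningMethods are names I made up, and it uses nothing beyond java.lang.reflect.

```scala
object ReflectionSketch {
  // Names of the public methods on `cls` whose declared return type is String.
  def stringReturningMethods(cls: Class[_]): List[String] =
    cls.getMethods.toList
      .filter(_.getReturnType == classOf[String])
      .map(_.getName)
      .distinct
      .sorted

  def main(args: Array[String]): Unit =
    stringReturningMethods(classOf[String]).foreach(println)
}
```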

2.8.0 note: the REPL in the forthcoming 2.8.0 release has tab-complete for existing variables. You can't do "foo".[tab], but if you do var foo = "foo", you can then do foo.[tab] for bash-style tab completion for methods.

See the StackOverflow question Scala: How do I dynamically instantiate an object and invoke a method using reflection?

Duck typing is now structural typing

If you are coming from Ruby, Python or other dynamic languages, you may be used to duck typing - this is basically using the methods defined on the object as your type system rather than types being declared Java/C style. This is done by following existing method naming practice. So, in Ruby, if the object can be serialised as a string, you use "to_s". If it can be turned into XML, "to_xml"; if you can iterate over it, you include the enumerable methods like "each". If you want to know if something is enumerable, you check it by doing "foo.respond_to?(:each)". This lets you write more flexible code - your code simply needs to check whether the object has whatever method returns the result you are interested in, rather than you being too worried about what class it is.

In Scala, you may think, it's an old-fashioned statically typed language like Java so you've gotta declare classes or - worse - interfaces. How boring. No sexy duck typing. Or, worse, you've got to specify Any (Scala's equivalent of Java's Object) as your parameter type - because you want to let it take anything - then just put in your documentation exactly what sort of object you need to pass it. This sucks.

Fortunately, you can do the equivalent of duck typing in Scala using structural typing. A structural type is a type defined by a set of what the specification calls 'refinements'. The refinements can be either methods or properties. A structural type can be declared using the type keyword or inline. An example:

def postToBlog(content: { def toHtml(): HtmlFragment }) = ...

This method takes as its first argument any object which matches the structural type: the object has to have a method called 'toHtml' which takes no arguments and returns an (imaginary) 'HtmlFragment' object instance. You can give the structural type a name using the type keyword:

type Bloggable = { def toHtml(): HtmlFragment }

You can now use this as a type in the same way you can use classes as types (and ought to treat them syntactically in the same way you would classes IMHO - no horrible Hungarian notation like TBloggable or TypeBloggable or whatever - if you have a class and a type with the same name, that would seem to be a design problem):

def postToBlog(content: Bloggable)
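Putting the pieces together, here is a sketch you can actually compile. HtmlFragment is still imaginary, so I've stood in a tiny case class for it, and the Note class is made up too. The import is an assumption about your compiler: Scala 2.10 and later want it to silence a feature warning about reflective calls, while on older compilers you leave it out.

```scala
import scala.language.reflectiveCalls

object StructuralTypingSketch {
  // Stand-in for the imaginary HtmlFragment type from the text.
  case class HtmlFragment(markup: String)

  // Anything with a matching toHtml() method counts as Bloggable.
  type Bloggable = { def toHtml(): HtmlFragment }

  // Note never mentions Bloggable, but it has the right method.
  class Note(text: String) {
    def toHtml(): HtmlFragment = HtmlFragment("<p>" + text + "</p>")
  }

  // Returns the markup that would be posted.
  def postToBlog(content: Bloggable): String = content.toHtml().markup
}
```

Passing an object without a toHtml() method is a compile error rather than a runtime surprise, which is the main thing you gain over Ruby-style duck typing.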

Concurrency

There are a bunch of different ways you can do concurrency in Scala. Because Scala is Better Java, you can use Java threads just like you would in Java. You just write your thread classes in Scala instead. Scala doesn't have a synchronized keyword though - but that doesn't matter. synchronized is a method instead. If you are happy doing concurrency in the old-school way, go ahead.
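A sketch of the old-school way, using plain java.lang.Thread and the synchronized method. The Counter class and countWithThreads are invented for the example.

```scala
object SynchronizedSketch {
  class Counter {
    private var count = 0
    // synchronized is a method call on the lock object, not a keyword.
    def increment(): Unit = this.synchronized { count += 1 }
    def value: Int = this.synchronized { count }
  }

  // Start `threads` threads which each increment a shared counter `perThread` times.
  def countWithThreads(threads: Int, perThread: Int): Int = {
    val counter = new Counter
    val workers = (1 to threads).map { _ =>
      new Thread(new Runnable {
        def run(): Unit = (1 to perThread).foreach(_ => counter.increment())
      })
    }
    workers.foreach(_.start())
    workers.foreach(_.join()) // wait for every worker before reading the total
    counter.value
  }
}
```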

Alternatively, you can use actors. For those coming from Ruby or Python, think EventMachine, Twisted, node.js etc. Scala's actors are similar to Erlang's actors. Message passing, no shared state. It's pretty lightweight and easy to do. There's about a gazillion tutorials out there. Type some combination of 'scala', 'actors' and 'tutorial' into Google and you'll bump into them. Also, the big Scala book by Odersky et al. has a tutorial.

There's more though. Akka extends actors to add software transactional memory (like Clojure!), events and a whole bunch of other coolness. Apparently. I haven't tried it, but I hear it is cool.

I'm a big fan of HawtDispatch. It isn't only for Scala - if you can tolerate verbosity, you can use it in Java. It implements Grand Central Dispatch on the JVM. Grand Central Dispatch is Apple's way of doing concurrency in C, C++ and Objective-C. It adds blocks to those languages which execute asynchronously on a system-wide thread pool. HawtDispatch does similarly for Java and Scala. The Scala API is 2.8 only, but you can use the Java API from 2.7.

I've found that a mixture of HawtDispatch and actors gives you most of the concurrency you need when hacking something together.

Testing

I use Specs - it is BDD-style testing, very much like RSpec.

I need to look into Ostrich for performance testing.

There are alternatives to Specs, but I have found Specs to be the best thing to use so far.

Concurrency testing

Twitter has created xrayspecs which adds extra magic to Specs - specifically, concurrency testing and time testing. I've tried to download xrayspecs and compile it - Ant just gets to a certain point and fails. It is due to a bug - the build script depends on Java 5 (i.e. the one which comes installed on Mac OS X Leopard; Snow Leopard uses Java 6).

But, you don't need to worry because the primary thing you need xrayspecs for is concurrency testing. If you are using an up-to-date version of Specs, it has limited concurrency testing with 'eventually'. It works like this:

someObject.addThisToMutableAsyncCounter(2)

This call doesn't block - it modifies some mutable state of the object, but does it using something like a dispatcher to do background processing on a queue. Therefore, the method call is going to return as soon as it has added the thing to the queue.

To test this, you use this kind of matcher:

someObject.counterValue() must eventually(be(2))

This continues testing the call 'counterValue' repeatedly until it returns the value that it needs to (2). You need to make sure that counterValue() doesn't mutate state because that'll be called repeatedly.
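If you want a feel for what a matcher like this is doing, here is a rough sketch of the retry idea in plain Scala. This is not Specs's code or API: the eventually helper below and its retries and sleepMillis parameters are hypothetical, written only to illustrate the polling.

```scala
object EventuallySketch {
  // Hypothetical helper, not part of Specs: re-evaluates `check` until it
  // holds, making at most `retries` further attempts after the first one
  // and sleeping `sleepMillis` between attempts.
  def eventually(retries: Int, sleepMillis: Long)(check: => Boolean): Boolean = {
    var attempts = 0
    var passed = check
    while (!passed && attempts < retries) {
      Thread.sleep(sleepMillis)
      attempts += 1
      passed = check
    }
    passed
  }
}
```

Because check is evaluated over and over, it has the same restriction as counterValue() above: it must not change the state it is looking at.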

Practice

Prerequisites

Using Scala, there are two ways to go about getting libraries. First, there's sbaz - Scala Bazaar. But not everything is in sbaz. In fact, most libraries you'll want to use aren't. So you need to get them as JAR files.

sbaz (output of `sbaz installed`):

JARs - this is from my ~/code/classes/:

You'll no doubt find that Commons and Log4j will help you about as much as they get on your nerves.

Build, config and deployment

Some Scala users use Maven. I think it is too heavyweight.

Consider sbt or Apache Buildr instead - build scripts for the former are written in Scala, while those for the latter are written in Ruby (with JRuby). sbt works great. Use that if you can - it supports Maven repositories and is pretty easy to extend.

If you use sbt, I've written some little scripts for sbt (in Ruby and Python) called sbt-growltest and sbt-notifytest which give you Growl and libnotify (the GNOME equivalent to Growl) notifications upon build and test success - they work with sbt and Specs, but could be adapted to work with other test frameworks.

Configgy is a library for configuration files - it is an alternative to the Java Properties library or using XML or YAML or whatever. I've heard great things about it, but haven't yet had a chance to test it.

Talking to the Web

Obviously, the first thing you need to be able to do is talk to the Web. That means using HTTP.

There are three ways of doing this at the moment:

Also see the IBM developerWorks article Scala and XML.

Using the Dispatch library we installed using sbaz, let's get something off the web:

import dispatch._

import Http._

var http = new Http()

var helloscala = http("http://tommorris.org/files/helloscala.txt" as_str)
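If you don't have Dispatch to hand, the standard library's scala.io.Source can do a plain GET on its own. This is a different technique from the Dispatch code above, and only a sketch: FetchSketch and fetch are made-up names, and there is no handling of redirects, status codes or timeouts here.

```scala
import scala.io.Source

object FetchSketch {
  // Read the whole body behind `url` into a String, closing the stream afterwards.
  def fetch(url: String): String = {
    val source = Source.fromURL(url, "UTF-8")
    try source.mkString finally source.close()
  }
}
```

Usage would look like FetchSketch.fetch("http://tommorris.org/files/helloscala.txt"). It works on any URL scheme the JVM understands, including file: URLs.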

Talking to OAuth

Databinder Dispatch has an OAuth module. It works well, but isn't particularly well-documented.

I put up a copy of a REPL session to show how to authenticate and use the FireEagle API using Dispatch's OAuth code.

Parsing XML

You can parse XML natively in the language using the scala.xml functionality. Scala is one of the few languages where it is actually easier to use XML than it is to use JSON or YAML or other lightweight serialization formats.

import scala.xml._

val string = /* whatever */

val doc = XML.loadString(string)

(doc\\"h1")(0).text

If you need to write XML, note that XML is a language literal. Try it. It's really cool. Open up a Scala REPL, type in "val foo = " then paste in a fragment of XML. Most of the Scala books have a chapter on XML literals and the scala.xml libraries.
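A small sketch of both directions, reading and writing. It assumes scala.xml is on your classpath: it ships with the Scala versions this guide was written against, but from 2.11 onwards it is a separate scala-xml module you have to add. XmlSketch, headings and page are names I made up.

```scala
import scala.xml._

object XmlSketch {
  // Parse a string and pull the text out of every <h1>, at any depth.
  def headings(source: String): List[String] =
    (XML.loadString(source) \\ "h1").map(_.text).toList

  // XML literals can embed Scala expressions inside braces.
  def page(title: String): Elem =
    <html><body><h1>{title}</h1></body></html>
}
```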

Parsing JSON

If you want a native Scala library, try scala-json. Alternatively, use json.jar from Java.

My preferred way of parsing JSON is using json-lib. Here is how you parse json-time using Scala.

First, classpath. My sbaz is as above, and I add these jars to the classpath:

Here's the code:

import dispatch._

import Http._

import net.sf.json._

var http = new Http()

var jsontime = JSONObject.fromObject(http("http://json-time.appspot.com/time.json" as_str))

Now we can interrogate the object a bit:

jsontime.get("tz").asInstanceOf[String]

jsontime.get("hour").asInstanceOf[Int]

(Obviously, if we are going to use date/time objects, we should probably import JodaTime and scala-time and cast the objects returned from the JSONObject into the relevant date-time objects.)

Twitter and other APIs

Of course, what you really need is some more Twitter in your life. Using the HTTP and XML/JSON libraries listed above, it should be fairly easy to poll Twitter's APIs. But polling is old school. The new kids on the block use the firehose-style realtime APIs. For that you need acrosa's Scala-TwitterStreamer. It is based on Apache Commons HTTP and the code is very Javaish rather than FPish, but it works. Sadly, it doesn't actually come with a thing to turn the InputStreams of JSON into objects, but it isn't too difficult to write.

Other web service API wrappers (mostly Java):

Date Time

Rubyists have Chronic. There's Parsedatetime for Python. These are great because they allow natural language date parsing. Java and Scala users get JChronic.

Java also has JodaTime. Use that. Don't use the date-time libraries in Java - JodaTime is better.

BigInt

One thing people coming from Python/Ruby etc. need to remember is that you have to deal with the type system's handling of numbers. For integers and longs, you can use the literals provided by the language. Ints can be created by just using the integer literal - "var a = 5" will create an Int object and assign it to var a. To create a long, you use "var a = 5L". Suffixing the number with 'l' or 'L' (upper or lower case) will create a long. Floats can be created by suffixing with a lower or upper case 'F'.

For Big Ints - integers too big to fit into Int or Long - you should use BigInt. BigInt is like Java's BigInteger class, but easier.

In Java, you might do this: "BigInteger whatever = new BigInteger("82175986439779345802385989");"

In Scala, you can use BigInt, which is a class and has a helper object that lets you instantiate just like this: var a = BigInt("82175986439779345802385989")

You needn't instantiate using a string - you can use an int or a long.

Once you've instantiated, the key benefit of using Scala's BigInt class rather than Java's BigInteger class is simple: you can use the standard mathematical operators - +, -, * and /

So use that.
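A quick sketch pulling the literals and BigInt together. The object and the factorial function are just for illustration.

```scala
object BigIntSketch {
  val anInt: Int = 5       // plain integer literal
  val aLong: Long = 5L     // 'L' suffix gives a Long
  val aFloat: Float = 5F   // 'F' suffix gives a Float

  // BigInt can be built from a string, an Int or a Long.
  val fromString: BigInt = BigInt("82175986439779345802385989")
  val fromInt: BigInt = BigInt(5)
  val fromLong: BigInt = BigInt(5L)

  // The ordinary operators work, so big arithmetic reads like small arithmetic.
  def factorial(n: Int): BigInt =
    (1 to n).foldLeft(BigInt(1))((acc, i) => acc * BigInt(i))
}
```

BigIntSketch.factorial(25) overflows a Long but is no trouble for BigInt.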

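For number type conversion, you can use asInstanceOf. So if you have an Int and you want to turn it into a Long, just do something like: 5.asInstanceOf[Long] (Of course, Int also defines a toLong method, which is the safer choice: asInstanceOf is a cast, and casting a boxed Int held in an Any to Long throws a ClassCastException.)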

It's worth learning about asInstanceOf. Google it.

Web frameworks

Some web frameworks you may want to investigate:
