(I'm drafting this at the moment. Feedback appreciated - tom@tommorris.org )

Thank you Hacker News and Reddit.

A lot of guides to Scala are written with an intended audience of academics and computer scientists, partly because Scala's development by Martin Odersky at EPFL means it's quite an interesting language on a purely intellectual level. Scala doesn't seem to be designed simply as a research project though: it's designed as a practical language. Here it fits a rather unique niche: it is supposed to be useful both for programmers who work in scripting languages (however you choose to define them) like Python, Ruby or Perl and for people more familiar with languages like Java, C# and C++. It really has the benefits of both. As a language, it has a strong type system, but you don't have to make explicit type declarations - instead, the compiler or interpreter will be happy to work out that in `var foo = 9` a different type is being instantiated than in `var foo = "9"`. The language also imports a lot of the constructs of functional programming languages like Haskell into a language that's strongly OO.

But is it a language for hackers? By hackers, I don't mean kernel hackers. I mean people who build small hacks - who use whatever technology is at hand to achieve some interesting end-goal. Hackers need to be able to do the really easy things quickly. BarCamps and HackDays require one to be able to turn around an idea overnight: a valuable skill in the commercial world also! Being able to rapidly prototype is a useful skill: it means you can "throw one away" quicker, you can drop ideas that are impractical and move on, and it means you can actually show something the next day and make people say 'Wow'.

I think that Scala could be a perfect language for them. I'm not sure of it. Unfortunately, the current Scala introductions tell them how to produce a good quicksort or use something like parser combinators before telling them how to, say, read an Atom feed.

Scala guys may respond: "ah, but you just use the underlying Java". To which the non-Scala user (possibly non-Java user) says "okay, what's that?". Scala for Hackers is intended to solve this.

As I said, I'm not sure of it. Writing this document is a way for me to try and document the various things I'm doing with Scala. Hopefully Scala nerds might be able to use it to write libraries to make this stuff easier.

Theory

Obviously, you need to learn the basics of Scala before hacking with it. If you learn from dead-tree media, I suggest Odersky, Spoon and Venners' "Programming in Scala" (Artima) or Wampler and Payne's "Programming Scala" (O'Reilly). Odersky et al. is slightly more theoretical while Wampler and Payne is a bit more practical.

Immutability

In Scala, variables can be declared to be either mutable or immutable. Using immutable values has benefits if you are doing concurrent programming: a value that never changes can be shared between threads without locking.

To declare a mutable variable, use 'var' - e.g. var foo = 5

To declare an immutable variable, use 'val' - e.g. val foo = 5

A simple way to remember: var is variable, val is a value.
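A small sketch of the difference, written as a script you could paste into the REPL. The names (counter, limit, names) are made up for illustration. It also shows one thing worth knowing: val fixes the reference, not the contents of the object it points at.

```scala
// A var can be reassigned after it is declared.
var counter = 5
counter = counter + 1          // fine: counter is now 6

// A val is fixed once it has been assigned.
val limit = 5
// limit = 6                   // does not compile: reassignment to val

// val fixes the reference, not the object behind it: a mutable
// collection held in a val can still change its contents.
val names = scala.collection.mutable.ListBuffer("alice")
names += "bob"                 // fine: same ListBuffer, new contents

println(counter)
println(names.mkString(", "))
```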

Companion Objects

If you have used Ruby, you probably know the difference between an instance method and a class method - you define the latter by using 'def self.whatever' while the former uses just 'def whatever'. The class method is simply a method that can be used without an instance of the class. It is useful as you can use it for alternative constructors and the like. In Java, this is called a static method and you create it by prefixing the method definition with the keyword "static". In Python you use the @classmethod decorator.

In Scala, you can create static methods (and static properties) with companion objects. The companion object holds the statics while the class holds the instantiable methods/properties.

A companion object should be written in the same file (technically, the same compilation unit) as the class. Basically, put the companion object directly after the class.

Like this:

class Book (val name: String)

object Book { /* whatever */ }

Inside the object, just add your static methods like you would to any other object. Unlike Java, you don't need to prefix static methods with "static".

Compile the class and companion object together.
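Here is a slightly fuller sketch of the Book example, assuming the class and its companion sit in the same file. DefaultName, untitled and fromRawTitle are invented names for illustration - they just show a 'static' constant and a couple of alternative constructors.

```scala
class Book(val name: String)

object Book {
  // A 'static' value shared by all Books.
  val DefaultName = "Untitled"

  // Alternative constructor: a Book with the default name.
  def untitled: Book = new Book(DefaultName)

  // Alternative constructor: tidy up stray whitespace in a raw title.
  def fromRawTitle(raw: String): Book =
    new Book(raw.trim.split("\\s+").mkString(" "))
}

// Called on the object, with no Book instance needed first.
val tidy = Book.fromRawTitle("  Programming   in Scala ")
println(tidy.name)
println(Book.untitled.name)
```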

Introspection

If you use Ruby, you often want to see what methods exist on a class or object, so you do something like `"foo".methods`

In Scala, you can call 'getMethods' on a class, meaning you can do something like this:

"foo".getClass.getMethods.foreach(println _)

Here, the string "foo" is getting instantiated - Scala will figure out it is a string, then ask the instance to return the class. getMethods returns an array of the methods from the class. foreach loops across the array and (println _) prints 'em out.

But what if you don't want to instantiate the class to introspect on its methods?

Class.forName("java.lang.String").getMethods.foreach(println _)

Class.forName is just taken from Java. getMethods returns an array of java.lang.reflect.Method objects (as opposed to Ruby which just returns an array of method names). The Method class is useful as you can use it to see what methods return what types, or take particular types as arguments. getReturnType() returns a Class object for the method's return type.

Here is an example of reflection I used when writing the XML parsing section:

(doc\\"h1")(0).getClass.getMethods.filter(_.getReturnType() == Class.forName("java.lang.String")).foreach(println _)

This takes the first h1 element in the document 'doc', gets all the methods one can call on that class, filters them to only those which return strings and prints those out. This way, I can figure out the 'get back a text node as a string' method - it's just called 'text'.
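The same trick works on anything, not just XML nodes. As a sketch that needs nothing beyond the JDK, this lists the zero-argument methods on String that return a String. The extra filter on parameter types and the name stringReturning are my additions for illustration.

```scala
// Zero-argument methods on String that hand back a String.
val stringReturning = "foo".getClass.getMethods
  .filter(m => m.getReturnType == classOf[String])   // returns a String
  .filter(m => m.getParameterTypes.isEmpty)          // takes no arguments
  .map(_.getName)
  .distinct
  .sorted

stringReturning.foreach(println)
```

The exact list depends on which JDK you are running, but things like trim and toUpperCase should show up.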

2.8.0 note: the REPL in the forthcoming 2.8.0 release - and the current nightlies, beta/RC releases and SVN head - has tab-complete for existing variables. You can't do "foo".[tab], but if you do var foo = "foo", you can then do foo.[tab] for bash-style tab completion for methods.

See the StackOverflow question "Scala: How do I dynamically instantiate an object and invoke a method using reflection?"

Duck typing is now structural typing

If you are coming from Ruby, Python or other dynamic languages, you may be used to duck typing - this is basically using the methods defined on the object as your type system rather than types being declared Java/C style. This is done by following existing method naming practice. So, in Ruby, if the object can be serialised as a string, you use "to_s". If it can be turned into XML, "to_xml", if you can iterate over it, you include the enumerable methods like "each". If you want to know if something is enumerable, you check it by doing "foo.respond_to?(:each)". This lets you write more flexible code.

In Scala, you may think, it's an old-fashioned statically typed language like Java so you've gotta declare classes or - worse - interfaces. How boring. No sexy duck typing. Or, worse, you've got to specify Any (Scala's equivalent of Java's Object) as your type parameter - because you want to let it take anything, then just put in your documentation exactly what sort of object you need to pass it. This sucks.

Fortunately, you can do the equivalent of duck typing in Scala using structural typing. A structural type is a type defined by a set of what the specification calls 'refinements'. The refinements can be either methods or properties. A structural type can be declared using the type keyword or inline. An example:

def postToBlog(content: { def toHtml(): HtmlFragment }) = ...

This method takes as its first argument any object which matches the structural type: the object has to have a method called 'toHtml' which takes no arguments and returns an (imaginary) 'HtmlFragment' instance. You can give the structural type a name using the type keyword:

type Bloggable = { def toHtml(): HtmlFragment }

You can now use this as a type in the same way you can use classes as types (and ought to treat them syntactically in the same way you would classes IMHO - no horrible Hungarian notation like TBloggable or TypeBloggable or whatever - if you have a class and a type with the same name, that would seem to be a design problem):

def postToBlog(content: Bloggable)
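To make that concrete, here is a sketch with two unrelated classes that both happen to have a toHtml method. HtmlFragment is still imaginary, so a stand-in case class is defined here, and Recipe and Photo are invented for the example. The import is an assumption about your compiler version: Scala 2.10 and later ask for it because structural types are called via reflection, while on older versions it doesn't exist and can be dropped.

```scala
import scala.language.reflectiveCalls

// Stand-in for the imaginary HtmlFragment type.
case class HtmlFragment(markup: String)

type Bloggable = { def toHtml(): HtmlFragment }

// Two unrelated classes: no shared trait or superclass.
class Recipe(title: String) {
  def toHtml(): HtmlFragment = HtmlFragment("<h1>" + title + "</h1>")
}

class Photo(url: String) {
  def toHtml(): HtmlFragment = HtmlFragment("<img src=\"" + url + "\"/>")
}

// Accepts anything with a matching toHtml method, checked at compile time.
def postToBlog(content: Bloggable): String =
  "<article>" + content.toHtml().markup + "</article>"

println(postToBlog(new Recipe("Soup")))
println(postToBlog(new Photo("cat.png")))
```

Passing an object without a toHtml method is a compile error rather than a runtime surprise, which is the main thing you gain over Ruby-style duck typing.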

Concurrency

There are a bunch of different ways you can do concurrency in Scala. Because Scala is Better Java, you can use Java threads just like you would in Java. You just write your thread classes in Scala instead. Scala doesn't have a synchronized keyword though - but that doesn't matter. synchronized is a method instead. If you are happy doing concurrency in the old-school way, go ahead.
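A minimal sketch of the old-school approach, using only java.lang.Thread and the synchronized method. Counter and hammer are names invented for this example.

```scala
// A counter guarded by synchronized, which in Scala is a method on
// every object (AnyRef) that takes a block of code.
class Counter {
  private var count = 0
  def increment(): Unit = this.synchronized { count += 1 }
  def value: Int = this.synchronized { count }
}

// Start some plain Java threads, wait for them all, report the total.
def hammer(threadCount: Int, perThread: Int): Int = {
  val counter = new Counter
  val threads = (1 to threadCount).map { _ =>
    new Thread(new Runnable {
      def run(): Unit = {
        var i = 0
        while (i < perThread) { counter.increment(); i += 1 }
      }
    })
  }
  threads.foreach(_.start())
  threads.foreach(_.join())
  counter.value
}

println(hammer(4, 1000))
```

Without the synchronized blocks the increments from different threads could overwrite each other and the total could come out short.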

Alternatively, you can use actors. For those coming from Ruby or Python, think EventMachine, Twisted, node.js etc. Scala's actors are similar to Erlang's actors. Message passing, no shared state. It's pretty lightweight and easy to do. There's about a gazillion tutorials out there. Type some combination of 'scala', 'actors' and 'tutorial' into Google and you'll bump into them. Also, the big Scala book by Odersky et al. has a tutorial.

There's more though. Akka extends actors to add software transactional memory (like Clojure!), events and a whole bunch of other coolness. Apparently. I haven't tried it, but I hear it is cool.

I'm a big fan of HawtDispatch. It isn't only for Scala - if you can tolerate verbosity, you can use it in Java. It implements Grand Central Dispatch on the JVM. Grand Central Dispatch is Apple's way of doing concurrency in C, C++ and Objective-C. It adds blocks to those languages which execute asynchronously on a system-wide thread pool. HawtDispatch does similarly for Java and Scala. The Scala API is 2.8 only, but you can use the Java API from 2.7.

I've found that a mixture of HawtDispatch and actors gives you most of the concurrency you need when hacking something together.

Testing

I use Specs - it is BDD-style testing, very much like RSpec.

I need to look into Ostrich for performance testing.

There are alternatives to Specs, but I have found Specs to be the best thing to use so far.

Concurrency testing

Twitter has created xrayspecs which adds extra magic to Specs - specifically, concurrency testing and time testing. I've tried to download xrayspecs and compile it - Ant just gets to a certain point and fails. It is due to a bug - the build script depends on Java 5 (i.e. the one which comes installed on Leopard; Snow Leopard uses Java 6).

But, you don't need to worry because the primary thing you need xrayspecs for is concurrency testing. If you are using an up-to-date version of Specs, it has limited concurrency testing with 'eventually'. It works like this:

someObject.addThisToMutableAsyncCounter(2)

This call doesn't block - it modifies some mutable state of the object, but does it using something like a dispatcher to do background processing on a queue. Therefore, the method call is going to return as soon as it has added the thing to the queue.

To test this, you use this kind of matcher:

someObject.counterValue() must eventually(be(2))

This continues testing the call 'counterValue' repeatedly until it returns the value that it needs to (2). You need to make sure that counterValue() doesn't mutate state because that'll be called repeatedly.
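If you are wondering what a matcher like that is doing for you, the idea is just polling. The following is a rough sketch of that idea in plain Scala - it is not how Specs implements 'eventually', and eventuallyTrue and demo are hypothetical names. An AtomicInteger set by a background thread stands in for the async counter.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical helper, not part of Specs: re-run a check until it
// passes or the retries run out.
def eventuallyTrue(retries: Int, sleepMillis: Long)(check: => Boolean): Boolean = {
  var attempt = 0
  var ok = check
  while (!ok && attempt < retries) {
    Thread.sleep(sleepMillis)
    attempt += 1
    ok = check
  }
  ok
}

def demo(): Boolean = {
  val counter = new AtomicInteger(0)
  // Stand-in for the non-blocking call: the update lands a bit later.
  val worker = new Thread(new Runnable {
    def run(): Unit = { Thread.sleep(50); counter.set(2) }
  })
  worker.start()
  // counter.get() only reads, so calling it repeatedly is harmless.
  eventuallyTrue(retries = 40, sleepMillis = 25) { counter.get() == 2 }
}

println(demo())
```

The check is passed by name, so it is evaluated afresh on every retry - which is exactly why it must not mutate anything.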

Practice

Prerequisites

With Scala, there are two ways of getting hold of libraries. First, there's sbaz - Scala Bazaar. But not everything is in sbaz. In fact, most libraries you'll want to use aren't, so you need to get them as JAR files.

sbaz (output of `sbaz installed`):

JARs - this is from my ~/code/classes/:

You'll no doubt find that Commons and Log4j will help you about as much as they get on your nerves.

Build, config and deployment

Some Scala users use maven. I think it is too heavyweight.

Consider sbt or Apache Buildr instead - build scripts for the former are written in Scala, while those for the latter are written in Ruby (with JRuby). sbt works great. Use that if you can - it supports Maven repositories and is pretty easy to extend.

If you use sbt, I've written some little scripts for sbt (in Ruby and Python) called sbt-growltest and sbt-notifytest which give you Growl and libnotify (the GNOME equivalent to Growl) notifications upon build and test success - they work with sbt and Specs, but could be adapted to work with other test frameworks.

I need to look into Configgy.

Talking to the Web

Obviously, the first thing you need to be able to do is talk to the Web. That means using HTTP.

There are three ways of doing this at the moment:

Also see the IBM developerWorks article Scala and XML.

Using the Dispatch library we installed using sbaz, let's get something off the web:

import dispatch._

import Http._

var http = new Http()

var helloscala = http("http://tommorris.org/files/helloscala.txt" as_str)

Parsing XML

You can parse XML natively in the language using the scala.xml functionality.

import scala.xml._

val string = /* whatever */

val doc = XML.loadString(string)

(doc\\"h1")(0).text

Parsing JSON

If you want a native Scala library, try scala-json. Alternatively, use json.jar from Java.

My preferred way of parsing JSON is using json-lib. Here is how you parse json-time using Scala.

First, classpath. My sbaz is as above, and I add these jars to the classpath:

Here's the code:

import dispatch._

import Http._

import net.sf.json._

var http = new Http()

var jsontime = JSONObject.fromObject(http("http://json-time.appspot.com/time.json" as_str))

Now we can interrogate the object a bit:

jsontime.get("tz").asInstanceOf[String]

jsontime.get("hour").asInstanceOf[Int]

(Obviously, if we are going to use date/time objects, we should probably import JodaTime and scala-time and cast the objects returned from the JSONObject into the relevant date-time objects.)

Twitter and other APIs

Of course, what you really need is some more Twitter in your life. Using the HTTP and XML/JSON libraries listed above, it should be fairly easy to poll Twitter's APIs. But polling is old school. The new kids on the block use the firehose-style realtime APIs. For that you need acrosa's Scala-TwitterStreamer. It is based on Apache Commons HTTP and the code is very Javaish rather than FPish, but it works. Sadly, it doesn't actually come with a thing to turn the InputStreams of JSON into objects, but it isn't too difficult to write.

Other web service API wrappers (mostly Java):

Date Time

Rubyists have Chronic. There's Parsedatetime for Python. These are great because they allow natural language date parsing.

Java and Scala users get JChronic. Of course, Java also has JodaTime. Use that.

BigInt

One thing people coming from Python/Ruby etc. need to remember is that you have to deal with the type system's handling of numbers. For integers and longs, you can use the literals provided by the language. Ints can be created by just using the integer literal - "var a = 5" will create an Int and assign it to var a. To create a Long, suffix the number with 'l' or 'L' - "var a = 5L". Floats can be created by suffixing with a lower or upper case 'F'.

For Big Ints - integers too big to fit into Int or Long - you should use BigInt. BigInt is like Java's BigInteger class, but easier.

In Java, you might do this: "BigInteger whatever = new BigInteger("82175986439779345802385989");"

In Scala, you can use BigInt, which is a class and has a helper object that lets you instantiate just like this: var a = BigInt("82175986439779345802385989")

You needn't instantiate using a string - you can use an int or a long.

Once you've instantiated, the key benefit of using Scala's BigInt class rather than Java's BigInteger class is simple: you can use the standard mathematical operators - +, -, * and /

So use that.
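A quick sketch of what that looks like in practice. The variable names are arbitrary, and % is thrown in alongside the four operators mentioned above.

```scala
val a = BigInt("82175986439779345802385989")
val b = BigInt(1000)            // from an Int
val c = BigInt(5000000000L)     // from a Long

val sum       = a + b
val product   = b * c
val quotient  = a / b           // integer division
val remainder = a % b

// Plain Ints mix in directly thanks to an implicit conversion to BigInt.
val doubled = a * 2

println(sum)
println(quotient + " remainder " + remainder)
```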

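For number type conversion, use the conversion methods on the number types: toLong, toInt, toDouble and friends. So if you have an Int and you want to turn it into a Long, just do something like: 5.toLong. asInstanceOf is a cast rather than a conversion: it can fail at runtime with a ClassCastException when the value is not really of the target type, for example an Int that has been stored as an Any.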

It's worth learning about asInstanceOf. Google it.
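A sketch of number conversion using the toLong/toInt family of methods, which are the usual way to convert between number types, next to a couple of casts. All the names here are invented for the example. The last few lines show the case where a cast bites: an Int stored as Any is a boxed Integer, and casting that to Long throws.

```scala
val small: Int = 5
val asLong: Long = small.toLong         // widening conversion
val asDouble: Double = small.toDouble
val chopped: Int = 7.9.toInt            // drops the fractional part
val parsed: Int = "42".toInt            // strings have the same methods

// asInstanceOf is for when you hold a wider type and know better.
val boxed: Any = "hello"
val s: String = boxed.asInstanceOf[String]

// A cast does not convert: a boxed Int is not a Long.
val anyFive: Any = 5
val castFailed =
  try { val n: Long = anyFive.asInstanceOf[Long]; false }
  catch { case _: ClassCastException => true }

println(asLong + " " + asDouble + " " + chopped + " " + parsed)
println(castFailed)
```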

Web frameworks

Some web frameworks you may want to investigate:
