Unit testing in Scala

This post is copied over from my internal blog inside IBM, just for posterity after I’ve left. It was first published in March 2011.

In my Baby Steps in Scala blog post I introduced the group sync problem I used to experiment with Scala as a language. After spending a bit more time learning the language features and syntax I switched focus to start looking at the ecosystem around it, including tooling and frameworks.

In my early experiments I simply used the Eclipse Scala plugin with straightforward project structures. The plugin is pretty basic and prone to crashes. The good news however is that Martin Odersky himself has been working on improving it.

As a strong believer in test-driven development I wanted to explore the ways that Scala code can be tested. I’d already managed to write JUnit tests for the group sync example that called the Scala code, but I wanted to look more at specific frameworks for Scala testing. I also wanted to understand how Scala code can be integrated into a build environment.

I’ll cover the testing aspect in this post, and build in another.

A quick Google search showed there are three common testing frameworks around Scala:

The interesting thing is that the focus of all of them is not just “how can I test Scala code” but more on how the capabilities of the Scala language itself can be harnessed to produce tests which are much more natural and easy to define and understand. This is primarily done through Scala’s excellent support for DSLs.

All three frameworks are also more than just Unit test libraries and execution environments. They advocate specification and behaviour driven testing.

I didn’t spend much time investigating each, but based on material I’d read I chose Specs to play around with. Specs itself can make use of ScalaCheck anyway. Specs encourages up-front specification of test cases or behaviour in a very human readable form. For instance, here is the initial specification I wrote for my group sync service:

package com.adrianspender.groupsync.scala.test

import org.specs._

class GroupSyncSpecification extends Specification {

  "group sync service" should {
    "remove any cached groups if the user is no longer in any groups" in {}

    "not change the cache if the groups for the user have not changed" in {}

    "add groups to the cache if the user has been added to new groups " +
      "so that the cache matches the groups" in {}

    "remove groups from the cache if the user has been removed from groups " +
      "so that the cache matches the groups" in {}

    "update the cache when the user has been added to new groups and " +
      "removed from old ones so that the cache matches the groups" in {}
  }
}

As you can see, the specifications are written in natural language that could easily be taken from the acceptance criteria of a story I’m working on. In true TDD/BDD fashion this can be written before any coding takes place. It can also be executed to produce the following output (in this case it is being run through sbt, but I’ll cover that in the entry on building)


The suite contains one test, known as a group (“group sync service” should) which then defines a set of examples such as “remove any cached groups if the user is no longer in any groups”.

We then expand the examples by specifying expectations, which is really the meat of our test and what would go inside a JUnit test method. To take the first example, we need to test that if the user is not in any groups, then the cache should be emptied for them. You can look back at the Baby Steps in Scala post for the actual code we are testing.

At this point, if I were implementing this test in JUnit, I’d start thinking about the collaborators that the class I am testing has, and how I can provide mock implementations of them with the behaviour I need. In this case I need:

  • A mock AuthoritativeSource which will return an empty set for the given user
  • A mock PersistentCache on which we need to assert that the removeAllGroups() method gets called
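To make the roles of these two collaborators concrete, here is a minimal hand-rolled sketch that uses no mocking library at all. The traits and the GroupSyncServiceSketch class are hypothetical Scala stand-ins for the Java interfaces and the service from the Baby Steps in Scala post, written only so the example is self-contained; the method names are taken from that post but the Scala signatures are assumptions.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical Scala stand-ins for the Java interfaces (assumed signatures)
trait AuthoritativeSource {
  def getGroups(personId: String): Set[String]
}

trait PersistentCache {
  def retrieveAllGroups(personId: String): Set[String]
  def addNewGroups(personId: String, groups: Set[String]): Unit
  def removeGroups(personId: String, groups: Set[String]): Unit
  def removeAllGroups(personId: String): Unit
}

// Simplified stand-in for the class under test
class GroupSyncServiceSketch(source: AuthoritativeSource, cache: PersistentCache) {
  def syncGroups(personId: String): Unit = {
    val current = source.getGroups(personId)
    if (current.isEmpty) {
      cache.removeAllGroups(personId)
    } else {
      val cached = cache.retrieveAllGroups(personId)
      cache.addNewGroups(personId, current diff cached)
      cache.removeGroups(personId, cached diff current)
    }
  }
}

// Stub: always reports that the user is in no groups
class EmptySource extends AuthoritativeSource {
  def getGroups(personId: String): Set[String] = Set.empty
}

// Recording fake: remembers every call so the test can inspect them afterwards
class RecordingCache extends PersistentCache {
  val calls = ListBuffer[String]()
  def retrieveAllGroups(personId: String): Set[String] = {
    calls += "retrieveAllGroups(" + personId + ")"
    Set.empty
  }
  def addNewGroups(personId: String, groups: Set[String]): Unit = {
    calls += "addNewGroups(" + personId + ")"
  }
  def removeGroups(personId: String, groups: Set[String]): Unit = {
    calls += "removeGroups(" + personId + ")"
  }
  def removeAllGroups(personId: String): Unit = {
    calls += "removeAllGroups(" + personId + ")"
  }
}

object HandRolledMocks {
  // Runs the service against the stub and the fake, returning the recorded calls
  def run(): List[String] = {
    val cache = new RecordingCache
    new GroupSyncServiceSketch(new EmptySource, cache).syncGroups("1234")
    cache.calls.toList
  }

  def main(args: Array[String]): Unit = println(run())
}
```

Writing fakes like this by hand gets tedious quickly, which is the gap a mocking library fills.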

In Java I use the EasyMock library to provide a way to define mock objects with the behaviour I want, and to validate the runtime behaviour is as expected. Specs provides support for using EasyMock (and Mockito) within Specs and all you need to do is change your Specification to add the EasyMock trait:

package com.adrianspender.groupsync.scala.test

import org.specs._
import org.specs.mock.EasyMock
import org.easymock.EasyMock._

class GroupSyncSpecification extends Specification with EasyMock {

This then provides us with a Scala-friendly abstraction over the EasyMock library. We can now define our mock objects, set their expected behaviour, replay them and verify them (if you are unfamiliar with EasyMock, take a look at the tutorial here). With this in place, I am now ready to implement the expectations of the specification:

"remove any cached groups if the user is no longer in any groups" in {
      val authoritativeSource = strictMock[AuthoritativeSource]
      val persistentCache = strictMock[PersistentCache]

      val groups = new java.util.HashSet[String]()
      val personId = "1234"

      // record the calls we expect to see on each mock
      authoritativeSource.getGroups(personId) returns groups
      persistentCache.removeAllGroups(personId)

      replay(authoritativeSource, persistentCache)

      val groupSyncService = new GroupSyncServiceImpl(authoritativeSource, persistentCache)
      groupSyncService.syncGroups(personId)

      verify(authoritativeSource, persistentCache)
}

Stepping through the code we are doing the following:

  1. Creating our mock objects for AuthoritativeSource and PersistentCache
  2. Creating an empty set for AuthoritativeSource to return and defining the userid we will use
  3. Setting the behaviour of our mocks:
    • We expect the getGroups() method of AuthoritativeSource to be called with personId, and we want it to return the empty group set.
    • We then expect the removeAllGroups(personId) method to be called on PersistentCache
  4. We replay both mock objects, so that EasyMock understands what should happen
  5. We create an instance of our GroupSyncServiceImpl – the class we are testing. The collaborators are injected through constructor injection
  6. We call groupSyncService.syncGroups(personId)
  7. We ask EasyMock to verify that the behaviour we expected actually happened.

And that is one part of our specification test implemented. We can then run it again:

You can see that the test we implemented is passing. We could break the test, for instance by doing this:

      val groups = new java.util.HashSet[String]()
      groups.add("group1") // illustrative value; any non-empty set breaks the expectation

So that now the group set returned from the AuthoritativeSource is not empty. Running this produces:

Here we expected one invocation of removeAllGroups() but there were none.

Let’s fix that and implement the rest of the examples. We then have a passing set of specification tests:

Finally, we have a working set of tests, but for integration into an existing build/continuous integration environment, or even within Eclipse, we may want to be able to execute them as JUnit tests. We can do this simply by extending SpecificationWithJUnit instead of Specification and annotating the class as such:

package com.adrianspender.groupsync.scala.test

import org.specs._
import org.specs.mock.EasyMock
import org.easymock.EasyMock._
import org.specs.runner.JUnitSuiteRunner
import org.junit.runner.RunWith

@RunWith(classOf[JUnitSuiteRunner])
class GroupSyncSpecification extends SpecificationWithJUnit with EasyMock {

Then we can simply run the whole thing as a JUnit test in Eclipse:

So, Specs is a really easy way to write human-readable specifications and their implementations. What’s more, it can integrate with JUnit, and therefore with existing CI/reporting mechanisms, and with existing testing libraries such as EasyMock.

However, what is perhaps most impressive is that all this ease of use and power can be applied to your Java code as much as any Scala code you have. There is nothing at all to stop you using Specs, or any Scala test framework, to test Java code. This is why testing is such a great vector for starting out with Scala and building skills. You can start to introduce it into an existing Java project orthogonally through your tests rather than diving straight into writing production code in Scala.

More fun with Scala

This post is copied over from my internal blog inside IBM, just for posterity after I’ve left. It was first published in February 2011.

I’ve been spending a lot of personal time increasing my knowledge and understanding of Scala. To spur me on I built an introductory presentation which was presented to the IBM Dublin lab recently:

However, as previously stated, the best way to learn a language is to solve real problems in it, and one vein of problems are mathematical puzzles. I’m not a great mathematician so don’t spend all my time playing with Project Euler or similar, but when somebody posed one on a UK forum I frequent I couldn’t resist having a go in Scala. What’s more bizarre is that the forum is mainly about cars (but then cars and computer geeks sometimes go hand in hand)

Here’s the problem:

Find a 9-digit integer containing only the numbers 1-9 with no duplicates that is exactly divisible by 9.
Now remove the 9th digit to leave an 8-digit integer that is exactly divisible by 8.
Now remove the 8th digit to leave a 7-digit integer that is exactly divisible by 7.
And so on, down to a 1-digit integer that is obviously divisible by 1.

What is the 9-digit number? Is there more than one answer?

The first solution posted was in C# and absolutely unreadable. You can see it on the thread linked above. Then somebody posted a Java based solution that used a little bit of recursion, but was still very imperative. He claimed the “most elegant” solution prize, so I took that as a challenge to see what I could do in Scala.

To begin with I tried to bite off more than I could chew and tried to define a lazy stream of all 9 digit numbers that were divisible by 9. However I was ignoring the restriction that numbers could only consist of the digits 1-9 with no repeated digits. I was also trying to use parts of the language that I didn’t really understand yet.

After an hour or so I switched approach, and took the previously posted Java code and just “Scalafied” it. Then I began to swap out various loops by using higher order functions such as filter, and defining it to use a function implemented with some pattern matching to recursively divide a given number according to the spec. This was all simple enough to do. I hit one snag with the best way to convert an array of Char to an integer, and settled on using an implicit conversion (not strictly necessary as the conversion is only needed once, but I’d not defined one before).

So, here is version 1:

import scala.collection.mutable.Set

object ProblemScala extends Application {

  implicit def charArray2Int(c: Array[Char]): Int = java.lang.String.valueOf(c).toInt

  val digits = Array('1', '2', '3', '4', '5', '6', '7', '8', '9')
  val permutations = Set[Int]()

  def permute(digits: Array[Char], n: Int): Unit = n match {
    case 1 => permutations += digits
    case _ => {
      for (i <- 0 until n) {
        val j = n - 1
        swap(digits, i, j)
        permute(digits, j)
        swap(digits, i, j)
      }
    }
  }

  def swap(digits: Array[Char], i: Int, j: Int): Unit = {
    val c = digits(i)
    digits(i) = digits(j)
    digits(j) = c
  }

  def divisible(n: Int, divisor: Int): Boolean = divisor match {
    case 1 => true
    case _ => n % divisor == 0 && divisible(n / 10, divisor - 1)
  }

  permute(digits, digits.length)
  val result = permutations filter (n => divisible(n, 9))

  result.foreach(i => println("match found: " + i))
  println("total match count: " + result.size)
}

The permute and swap functions are used in building a set of all possible 9 digit numbers that can be made up of the digits array contents. This was pretty much a straight copy from the Java code, except I use pattern matching in permute rather than an if/else.

The main difference as mentioned is the divisible function and how that is used in calculating the value of ‘result’. I was quite pleased with that. Overall it read much cleaner than the Java version in my eyes. However I was not happy with this solution. It doesn’t drastically cut down the line length and there’s still too much Java-like stuff going on. It is however a good example of how Scala lets you dip your toes into the water slowly rather than throwing you in at the deep end. Just take your existing code and “Scalafy” the syntax, then start looking for the loops, vals and Units, and start refactoring them out.
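To see what the recursion in divisible actually checks, here is a small standalone sketch that traces it for one candidate. The steps helper and the DivisibleSketch object are hypothetical names added for illustration, and the helper assumes a 9-digit input. The candidate 381654729 is one value that satisfies the puzzle.

```scala
object DivisibleSketch {

  // Same function as in the solution: check n against the divisor,
  // then drop the last digit and check against the next divisor down
  def divisible(n: Int, divisor: Int): Boolean = divisor match {
    case 1 => true
    case _ => n % divisor == 0 && divisible(n / 10, divisor - 1)
  }

  // Hypothetical helper: the (prefix, divisor) pairs the recursion visits
  // for a 9-digit number, longest prefix first
  def steps(n: Int): List[(Int, Int)] =
    (9 to 1 by -1).toList.map(k => (n.toString.take(k).toInt, k))

  def main(args: Array[String]): Unit = {
    val candidate = 381654729
    steps(candidate).foreach { case (prefix, k) =>
      println(prefix + " % " + k + " = " + (prefix % k))
    }
    println("divisible: " + divisible(candidate, 9))

    // 123456789 is divisible by 9, but 12345678 is not divisible by 8,
    // so the && short-circuits and the recursion stops there
    println("divisible: " + divisible(123456789, 9))
  }
}
```

Because && short-circuits, most candidates are rejected after one or two modulo operations.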

Once a day’s work got out of the way, I started googling around for a better solution to the problem of determining the permutations of the 9 digit numbers. In the meantime, a couple more solutions had been posted in Lua and Python. Both used useful library functions for building permutations, so my thoughts turned to what I might be missing in the Scala libraries. Sadly, there is nothing directly useful in 2.8.1, and many of the Google results discuss generic solutions which are quite complex and would turn my solution into an unreadable mess. I did however find a reference to the fact that the as yet unreleased Scala 2.9.0 does have a SeqLike.permutations function. I set up a new SBT project (more on SBT in another post) using the latest 2.9.0 nightly snapshot and got to work. The result is quite amazing:

object ProblemScala2 extends App {

  def divisible(n: Int, divisor: Int): Boolean = divisor match {
    case 1 => true
    case _ => n % divisor == 0 && divisible(n / 10, divisor - 1)
  }

  "123456789".permutations filter (n => divisible(n.toInt, 9)) foreach(n => println("match found: " + n))
}

The library function allows the String “123456789” to be implicitly converted to a SeqLike when the permutations function is called on it. The result is an Iterator of all permutations. That can then be used as the input of the rest of my first solution. All the cruft code in the first solution to define all the permutations is completely gone and what’s left is an amazingly concise and readable solution. I think it wins the “Most elegant” prize and I’ll be surprised if anybody comes up with something better.
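The behaviour of permutations is easier to see on a shorter string. The following is a minimal sketch, assuming Scala 2.9 or later; the object and value names are made up for illustration. It lists the arrangements of "123" and then chains map and filter onto the iterator in the same style as the one-line solution.

```scala
object PermutationsSketch {

  // permutations returns an Iterator[String]; toList forces it
  val all: List[String] = "123".permutations.toList

  // The same pipeline shape as the puzzle solution, with a simpler predicate:
  // keep only the arrangements that form an even number
  val evens: List[Int] = "123".permutations.map(_.toInt).filter(_ % 2 == 0).toList

  def main(args: Array[String]): Unit = {
    println(all.size + " permutations: " + all.mkString(", "))
    println("even ones: " + evens.mkString(", "))
  }
}
```

Since the result is an iterator, the 362,880 arrangements of "123456789" are generated one at a time rather than being held in a set as in version 1.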

Scala 2.9.0 should be released in the next few months. It is not a major evolution (compared to 2.8) but this is one fun example of how the language and core libraries are getting better and better.

Baby Steps in Scala

This post is copied over from my internal blog inside IBM, just for posterity after I’ve left. It was first published in January 2011.

As a New Year’s resolution to myself I have started to learn Scala, a statically typed, type-inferred, object-oriented yet functional programming language that runs on the JVM. After ten years of concentrating mainly on server side Java programming I figured it was time to broaden my skill set a bit, as well as stretch the brain cells. I chose Scala because at heart I am a middleware/server guy and whilst I’ve worked on a product and with a team that does a lot with JavaScript, it has never really grabbed me.

I’ve started reading Beginning Scala by David Pollak via Books 24×7, as well as various web articles, including a series on DeveloperWorks. However, the best way to learn new things is always to apply them to real problems. Therefore I took one particular, fairly self contained coding problem from Lotus Connections 3.0 and re-implemented it in Scala. Here’s how.

Group Synchronization in Scala

In various places in Connections we support access control via LDAP groups. We have no way of knowing when group membership changes, so we typically retrieve the list of groups a user is in when they log in. This information is then used a lot during their session so it is beneficial to cache it (typically for ten minutes before we refresh the cache). It may also be beneficial to cache this data persistently so any part of the system can use it. Ideally we would simply use a distributed in-memory cache such as WebSphere Extreme Scale, but we don’t yet, so where appropriate we cache some things in the database.

So, there is a fairly straightforward bit of code that would need to synchronize the groups the user is in with the cached set of groups we have for the user in the DB. Most of the time they will simply be the same, but sometimes they will be added to new groups, or removed from old ones. So what this code would basically do is:

  1. Retrieve the list of groups the user is in from the authoritative source (e.g. LDAP)
  2. If they are in no groups, ensure the cached entries (if any) are removed
  3. If they are in one or more groups, compare that with the cached version and add any new groups to the cache, and remove any groups from the cache they are no longer in.

For my Scala investigation I started out with the following assumptions:

  • I need to call the implementation of this sync service from existing Java code
  • I have two existing Java services for interacting with the authoritative source and the persistent cache.
  • I have an existing Java interface for the sync service that I need to ensure is implemented

One of the great things about Scala is that because it runs on the JVM and is compiled down into byte code, it is fully able to interact with existing Java code. Java can invoke Scala and vice versa. Scala classes can implement Java interfaces and all sorts of goodness. You can run Scala code in WebSphere.

So, first off, here is the Java Interface of the sync service itself. It is very simple:

/**
 * Provides methods to synchronize data stored locally in a 
 * persistent cache with the authoritative source.
 *
 * @author aspender
 */
public interface GroupSyncService {

	/**
	 * Retrieve the groups that the given person ID is in
	 * from the definitive resource, and update the 
	 * persistent cache that we use ourselves.
	 *
	 * @param personId the id of the person to sync
	 */
	public void syncGroups(String personId);
}

I also have two existing services defined by interfaces called AuthoritativeSource and PersistentCache. The former provides a single method to retrieve the groups the user is in. The latter retrieves the cached copy of the groups, as well as providing methods to update the cache.

So, as a starting point, here is an example Java implementation of the GroupSyncService:

package com.adrianspender.groupsync.java.impl;

import java.util.HashSet;
import java.util.Set;

import com.adrianspender.groupsync.java.*;

/**
 * The Java version of our group sync service.
 */
public class GroupSyncServiceImpl implements GroupSyncService {
	private AuthoritativeSource authoritativeSource;
	private PersistentCache persistentCache;

	public GroupSyncServiceImpl(AuthoritativeSource authoritativeSource, PersistentCache persistentCache) {
		this.authoritativeSource = authoritativeSource;
		this.persistentCache = persistentCache;
	}

	public void syncGroups(String personId) {

		Set<String> currentGroups = authoritativeSource.getGroups(personId);
		if(currentGroups.size() == 0 ) {
			persistentCache.removeAllGroups(personId);
		} else {
			Set<String> cachedGroups = persistentCache.retrieveAllGroups(personId);
			Set<String> addedGroups = new HashSet<String>();
			for(String groupId : currentGroups) {
				if(!cachedGroups.contains(groupId)) {
					addedGroups.add(groupId);
				}
			}
			persistentCache.addNewGroups(personId, addedGroups);

			Set<String> removedGroups = new HashSet<String>();
			for(String groupId : cachedGroups) {
				if(!currentGroups.contains(groupId)) {
					removedGroups.add(groupId);
				}
			}
			persistentCache.removeGroups(personId, removedGroups);
		}
	}
}

I’ve deliberately left out comments and it also isn’t as defensive as I’d normally write code (e.g. no null checks – something which Scala doesn’t need so much anyway). The code is easy to follow, but note that there are two for loops calculating the difference between the two sets of groups: the first works out the groups that are not in the cache, the second works out the groups that should no longer be in the cache.

So, here’s my first go at a Scala version:

package com.adrianspender.groupsync.scala.impl

import com.adrianspender.groupsync.java._
import scala.collection.JavaConversions._

/**
 * The Scala version of our group sync service.
 */
class GroupSyncServiceImpl(authoritativeSource: AuthoritativeSource, persistentCache: PersistentCache) extends GroupSyncService {
	def syncGroups(personId: String) : Unit = {

		val currentGroups = authoritativeSource.getGroups(personId).toSet
		if (currentGroups.isEmpty) {
			persistentCache.removeAllGroups(personId)
		} else {
			val cachedGroups = persistentCache.retrieveAllGroups(personId).toSet
			persistentCache.addNewGroups(personId, currentGroups diff cachedGroups)
			persistentCache.removeGroups(personId, cachedGroups diff currentGroups)
		}
	}
}

This does exactly the same thing. The unit tests I wrote work exactly the same regardless of whether you use the Java or Scala version. The same Java JUnit test works against both. Also note that the Scala version uses exactly the same Java implementations of AuthoritativeSource and PersistentCache, and that the Scala class implements the same GroupSyncService Java interface.

Of course, the other major thing to notice is the syntactic difference. There is no constructor – the variables of the class are defined on the class signature. There are no semi-colons. There are strange looking statements like ‘currentGroups diff cachedGroups’ which to a Java programmer may look more familiar if written as currentGroups.diff(cachedGroups) – which you could. However the syntax is still inherently familiar and easy to understand. This is certainly a big benefit of Scala. There’s much less need for nested blocks of code thanks to some of the syntactic sugar, as well as some of the capabilities of the language and its libraries. The Java class is 47 lines to Scala’s 24 lines.

I’m not going to go into a line by line analysis of the Scala version; this blog post isn’t meant to be a tutorial. However one other point of note is how easy the collection libraries in Scala make working with arrays, lists, sets and maps. In this case I use the diff member of the scala.collection.immutable.Set class to extract the values from one set that are not present in the other. Simple, and something that in Java you have to loop to do. The collection capabilities really come into their own when combined with the functional aspects of Scala – for instance being able to pass a block of code as a parameter to a method allows you to do complex things in single lines of code. There isn’t an example of functional programming in this particular example, but it is definitely something I need to get my head around further.
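As a small standalone illustration of both points, here is a sketch using made-up group names (they are not from Connections). It shows diff in infix and dot notation, and then a function literal passed to filter, which is the kind of functional usage the sync service itself does not need.

```scala
object SetDiffSketch {

  // Hypothetical group names, for illustration only
  val currentGroups = Set("admins", "developers", "testers")
  val cachedGroups = Set("developers", "testers", "managers")

  // Infix and dot notation are the same method call
  val added = currentGroups diff cachedGroups
  val removed = cachedGroups.diff(currentGroups)

  // Passing a function as a parameter: keep only names starting with "d"
  val startingWithD = currentGroups.filter(name => name.startsWith("d"))

  def main(args: Array[String]): Unit = {
    println("to add to the cache: " + added)
    println("to remove from the cache: " + removed)
    println("starting with d: " + startingWithD)
  }
}
```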

Olympics ticket application

One of the benefits of moving back to the UK is that we will be around for the London 2012 Summer Olympics. In fact, I actually could not conceive of being anywhere else during that time, sad as that may be.

Therefore I’ve applied for a bunch of tickets for various events. The way the system works is that you can apply for as much as you want (well, up to a maximum of 45 sessions) but are not guaranteed anything. Once the application period closes (April 26th 2011) any over-subscribed sessions will have a random selection process applied to decide who gets what.

There are various strategies to take to try and maximise your own outcome, depending on what approach you take. Personally, I want to see some of the sports I have real interest in (Swimming, Cycling especially) more than just going to a random event here and there because they may be lower in popularity. I also focussed primarily on sports in the Olympic Park given that entry to the park itself will be pretty good (I fully expect a lot of “Henman Hill” type atmosphere) To that end, I don’t expect to get all of the stuff I’ve applied for, which is a good thing given the total bill in that case would be touching £3,000!

So, for the record, here’s what I’ve applied for. I’ll report back in May/June with details of what the outcome is.

Friday 27th July

19:30 – 22:30 – Opening Ceremony x 4 @ £150 – Had to be done. Very low expectation here. Especially as I went for four tickets and they will only allocate you the amount requested, not less, should it go to ballot which this obviously will.

Saturday 28th July

10:00 – 13:00 – Swimming x 4 @ £28 – First swimming session of the games. Won’t be any Michael Phelps in the 400m Individual Medley but maybe he will try the 400m Freestyle instead (unlike Ian Thorpe, should he be selected). Should see Rebecca Adlington in the Women’s 400m Free.

14:30 – 18:30 – Women’s Basketball x 4 @ £20-35 – One of Lana’s choices, but filler material for me. Fun sport to watch however.

19:30 – 21:35 – Swimming x 4 @ £50-95 – First swimming medal session seeing four finals: Men’s 400m Free, Men’s/Women’s 400m IM, Women’s 4×100 Free relay

Backup plan for the day is to instead go and watch some of the Men’s Cycling Road Race which is a free event. However the only sensible place to watch would be Box Hill, and it is going to be insanely busy. Road Racing is always best on the telly.

Sunday 29th July

May go and watch the Women’s Cycling RR, but it’s shorter and there are fewer laps of Box Hill so it will be a blink-and-you-miss-it affair at any given point. Still, there should be a good chance of a British medal or two.

Wednesday 1st August

Work day, but if possible would love to head down to Hampton Court to see both Men’s and Women’s Cycling Time Trials. Another free event.

Saturday 4th August

10:00 – 11:30 – Track Cycling x 1 @ £65 – A short session including the Men’s Omnium flying lap and early stages of the Men’s Individual Sprint. In general it’s a bit too early to tie down who is going to do what in the British track teams, especially with the new rule about there being only one place for each nation per event, but Sir Chris Hoy or Jason Kenny should be in the sprint, with Ed Clancy likely in the Omnium.

10:30 – 14:15 – Equestrian – Jumping x 2 @ £28 – One for Lana and my mum in Greenwich Park.

16:00 – 18:40 – Track Cycling x 3 @ £50 – Medal session for the Women’s Team Pursuit which should see Wendy Houvenaghel +2 bringing home a medal hopefully. This is an event that the British team is particularly strong in, winning gold in the World Championships earlier this year. Competition for the three spots should be fierce with Rebecca Romero, Sarah Storey, Laura Trott and Lizzie Armitstead plus more in the frame. Other disciplines in the session include the Men’s Omnium elimination (devil) race which is an amazing spectacle as one rider gets eliminated every second lap, and more Men’s Sprint action. All in all this should be a cracking session.

18:50 – 22:05 – Athletics x 3 @ £95-150 – Not the hugest Athletics fan, but a medal session in the Olympic Stadium has to be worth a punt. The highlight of the night is the conclusion of the Women’s Heptathlon so hopefully medal action of the right kind for Jessica Ennis! There is also the small matter of the Women’s 100m final! Other medal finals are the Men’s/Women’s 10,000m, Men’s Long Jump and 20k walk (woo hoo!) Men’s Shot Put and Women’s Discus.

Sunday 5th August

10:00 – 11:25 – Track Cycling x1 @ £65 – Men’s Omnium Individual Pursuit and Women’s Sprint qualifying.

16:00 – 19:05 – Track Cycling x1 @ £95-150 – Medal Session for the conclusion of the Men’s Omnium with the final 15k scratch and 1k TT events. Men’s and Women’s Sprint repechages, 1/4 finals and various other pre-finals. Should see Hoy and Pendleton in action.

Monday 6th August

16:00 – 19:05 – Track Cycling x1 @ £95-150 – Medal Session for the Men’s Sprint. Hopefully with British interest in the final stages (Hoy or Kenny) and surely the awesome Frenchman Gregory Bauge. Start of the Women’s Omnium and more Women’s Sprint.

Tuesday 7th August

16:00 – 18:30 – Track Cycling x1 @ £150-225 – Final track session of the games with the Men’s Keirin up for grabs (surely Hoy will contest this) final events of the Women’s Omnium and also the finals of the Women’s Sprint (Pendleton)

Saturday 11th August

13:30 – 16:05 – Rhythmic Gymnastics x2 @ £125 – Women’s Individual all-around final. One for Lana. Held at Wembley Arena.

Sunday 12th August

14:30 – 17:40 – Water Polo x2 @ £65 – Men’s final. Interesting sport that I’ve played before. Will be a good watch. However there is an ulterior motive at play here. It’s on the final day, in the Olympic Park. I don’t want to pay to see the closing ceremony itself, but I reckon there will be a pretty fantastic fireworks display to end things off, as well as just being around the park to soak up the atmosphere.

Goodbye Big Blue

After over twelve years I am leaving IBM.

At the beginning of May I will be starting a new job at the Financial Times, working on the ft.com team as a Senior Developer. I’ll be working in London at the FT offices at Southwark Bridge, and Lana and I are moving from Ireland back to the UK. We are looking to base ourselves in St. Albans so I’ll be joining the commuter rat race on a daily basis.

After 8 1/2 years in Hursley (covered here) and nearly four years in Dublin working on Lotus Connections, leaving has not been a decision I’ve taken lightly. That being said, it is time for a new challenge and to experience a different environment. I’m incredibly excited by the new opportunity and also the team I’ll be working with on software engineering projects to help drive forward the already successful digital delivery of such an iconic news organization. I’m also looking forward to the opportunity to further expand my experience of agile development, something that the ft.com team practice heavily, and to get the opportunity to use and assess technologies which for various reasons would not be options in the type of products I’ve worked on in IBM.

I’ve worked with a huge number of fantastic people in my time in IBM and many that I consider not just as colleagues but as good friends. It would be great to keep in touch via LinkedIn and/or Twitter.