Elixir Tasks vs Scala Futures

Elixir has introduced a wonderful first-class concept called a Task. It lets you run some work in a new process and easily collect the result, like so:

iex(1)> something = Task.async fn ->  
...(1)>   10 + 6
...(1)> end
%Task{pid: #PID<0.44.0>, ref: #Reference<0.0.0.55>}
iex(2)> Task.await(something)  
16  

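At first glance, it looks a lot like constructs found in other languages. For example, Scala has a Future construct that seems similar (the Scala sessions in this post assume that scala.concurrent._, scala.concurrent.duration._ and scala.concurrent.ExecutionContext.Implicits.global have been imported):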

scala> val something = Future { 10 + 6 }  
something: scala.concurrent.Future[Int] = scala.concurrent.impl.Promise$DefaultPromise@5535cbe  
scala> Await.result(something, 0 nanos)  
res1: Int = 16  

Pretty similar, right? While they appear to offer the same abstraction - the ability to asynchronously execute code and return the result - there are some key differences.

Poll based

Let's take a look at Scala Futures. Scala Futures are built on top of JVM threads, along with all the usual suspects - mutexes, synchronized queues, etc. One way we can interact with a Future is to check whether it has completed:

scala> val something_long = Future { Thread.sleep(10000) }  
something_long: scala.concurrent.Future[Unit] = scala.concurrent.impl.Promise$DefaultPromise@1aa6a14b  
scala> something_long.isCompleted  
res7: Boolean = false  
...
scala> something_long.isCompleted  
res10: Boolean = true  

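We can poll to see if the Future has completed. This, in fact, is at the core of how the Await.result method works (as of Scala 2.10). It blocks the calling thread and keeps checking the Future for completion until either the Future has completed or the time runs out. Once the Future is complete, its internal value is returned. Note that the 0 nanos timeout used in these examples only succeeds because the Future has already completed by the time Await.result is called; otherwise the call would fail with a TimeoutException.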

This polling mechanism relies on the Future living in shared memory. Any thread can access it, so any thread can wait for it to complete. Multiple threads can fetch the result, and the result can be fetched multiple times:

scala> val something = Future { 10 + 6 }  
something: scala.concurrent.Future[Int] = scala.concurrent.impl.Promise$DefaultPromise@47041bea  
scala> Await.result(something, 0 nanos)  
res12: Int = 16  
scala> Await.result(something, 0 nanos)  
res13: Int = 16  

Push based

Now let's look at Elixir Tasks. Tasks are built on top of message passing. Without the Task construct, the same thing can be (crudely) implemented using standard Elixir code:

iex(4)> parent = self  
#PID<0.40.0>
iex(5)> something = spawn_link fn ->  
...(5)>   send parent, 10 + 6
...(5)> end
#PID<0.54.0>
iex(6)> receive do  
...(6)>   x -> x
...(6)> end
16  

Tasks give us a bit more convenience than the above code, but at their core they are fairly simple. A new process is spawned. When it completes, it sends the response back to the parent process.
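
To make the message flow concrete, here is a minimal sketch of the same idea packaged as a module. CrudeTask is a hypothetical name used only for illustration, and this is not how Task is actually implemented. It assumes a recent Elixir release (hence self() with parentheses). Each reply is tagged with a unique reference so that await only picks up the message meant for it.

```elixir
defmodule CrudeTask do
  # Hypothetical module, for illustration only; not the real Task code.

  # Runs `fun` in a new linked process and returns a handle to the reply.
  def async(fun) do
    owner = self()
    ref = make_ref()

    # The new process pushes its result to the process that called async/1.
    pid = spawn_link(fn -> send(owner, {ref, fun.()}) end)
    {pid, ref}
  end

  # Waits for the reply tagged with `ref` in the caller's own mailbox.
  def await({_pid, ref}, timeout \\ 5_000) do
    receive do
      {^ref, result} -> result
    after
      timeout -> exit(:timeout)
    end
  end
end

task = CrudeTask.async(fn -> 10 + 6 end)
IO.inspect(CrudeTask.await(task))
# prints 16
```

Because the reply is addressed to whichever process called async, only that process will ever find it in its mailbox.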

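Because the result is pushed to the parent process (the one that called Task.async), we find ourselves with a limitation: Task.await must be called from that same process. If it is called from a different process, the message never arrives in that process's mailbox and the call times out. (This is the behaviour in Elixir 0.14.2, used here; later releases detect the wrong caller and raise an error instead of timing out.)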

iex(9)> something_nested = Task.async fn ->  
...(9)>   Task.async fn ->
...(9)>     "nested"
...(9)>   end
...(9)> end
%Task{pid: #PID<0.73.0>, ref: #Reference<0.0.0.138>}
iex(10)> something = Task.await(something_nested, 1000)  
%Task{pid: #PID<0.74.0>, ref: #Reference<0.0.0.140>}
iex(11)> Task.await(something, 1000)  
** (exit) exited in: Task.await(%Task{pid: #PID<0.74.0>, ref: #Reference<0.0.0.140>}, 1000)
    ** (EXIT) time out
    (elixir) lib/task.ex:173: Task.await/2

The inner task's result was sent to the outer task's process, not to the shell, so awaiting it from the shell fails, exactly as we expected. If we call await inside the outer task, all is well:

iex(13)> something_nested = Task.async fn ->  
...(13)>   something = Task.async fn ->
...(13)>     "nested"
...(13)>   end
...(13)>   Task.await(something, 1000)
...(13)> end
%Task{pid: #PID<0.99.0>, ref: #Reference<0.0.0.214>}
iex(14)> Task.await(something_nested, 1000)  
"nested"

We also can't fetch the value twice. The first await removes the reply from the mailbox, so a second await has nothing left to receive:

iex(21)> something = Task.async fn -> 10 end  
%Task{pid: #PID<0.118.0>, ref: #Reference<0.0.0.259>}
iex(22)> Task.await(something, 1000)  
10  
iex(23)> Task.await(something, 1000)  
** (exit) exited in: Task.await(%Task{pid: #PID<0.118.0>, ref: #Reference<0.0.0.259>}, 1000)
    ** (EXIT) time out
    (elixir) lib/task.ex:173: Task.await/2
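
The same behaviour can be reproduced with bare messages. The sketch below receives a tagged reply once and then tries to receive it again. MailboxDemo is a hypothetical name, and a recent Elixir release is assumed.

```elixir
defmodule MailboxDemo do
  # Hypothetical module, for illustration only.
  def run do
    parent = self()
    ref = make_ref()

    # The child pushes exactly one tagged message to the parent.
    spawn_link(fn -> send(parent, {ref, 10}) end)

    first = wait_for(ref, 1_000)

    # The reply has already been taken out of the mailbox, so this
    # attempt finds nothing and gives up after the timeout.
    second = wait_for(ref, 100)

    {first, second}
  end

  defp wait_for(ref, timeout) do
    receive do
      {^ref, value} -> {:ok, value}
    after
      timeout -> :timeout
    end
  end
end

IO.inspect(MailboxDemo.run())
# prints {{:ok, 10}, :timeout}
```

If the value is needed in several places, the simplest approach is to await once and pass the resulting value along.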

Failure

With Scala Futures, errors are largely silent, at least until you try to use the value:

scala> val somethingBroken = Future { throw new Exception("oops") }  
somethingBroken: scala.concurrent.Future[Nothing] = scala.concurrent.impl.Promise$DefaultPromise@65e93514

scala> somethingBroken.isCompleted  
res18: Boolean = true

scala> Await.result(somethingBroken, 0 nanos)  
java.lang.Exception: oops  
 at $anonfun$1.apply(<console>:15)
 at $anonfun$1.apply(<console>:15)
 at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
 at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
 at scala.concurrent.impl.ExecutionContextImpl$$anon$3.exec(ExecutionContextImpl.scala:107)
 at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
 at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
 at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
 at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

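With Elixir Tasks, an exception in a task will, by default, cause the parent process to crash as well, because Task.async links the new process to its caller. This is part of the Erlang/Elixir "let it crash" philosophy. Note how the shell's PID changes after the task raises: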

iex(1)> self  
#PID<0.145.0>
iex(2)> response = Task.async fn -> raise "oops" end

=ERROR REPORT==== 12-Jul-2014::14:44:36 ===
** Task <0.149.0> terminating
** Started from <0.145.0>
** When function  == #Fun<erl_eval.20.106461118>
**      arguments == []
** Reason for termination ==
** {#{'__exception__' => true,'__struct__' => 'Elixir.RuntimeError',message => <<"oops">>},
    [{'Elixir.Task.Supervised',do_apply,2,
                               [{file,"lib/task/supervised.ex"},{line,70}]},
     {'Elixir.Task.Supervised',async,3,
                               [{file,"lib/task/supervised.ex"},{line,15}]},
     {proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}
** (EXIT from #PID<0.145.0>) an exception was raised:
    ** (RuntimeError) oops
        (elixir) lib/task/supervised.ex:70: Task.Supervised.do_apply/2
        (elixir) lib/task/supervised.ex:15: Task.Supervised.async/3
        (stdlib) proc_lib.erl:239: :proc_lib.init_p_do_apply/3

Interactive Elixir (0.14.2) - press Ctrl+C to exit (type h() ENTER for help)  
iex(1)> self  
#PID<0.150.0>

Tasks fit in nicely with the rest of the OTP ecosystem, complete with supervisors and monitors.
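
The link between a task and its caller can be observed directly by trapping exits. The following is a sketch under a few assumptions: CrashDemo is a hypothetical name, a recent Elixir release is used, and the experiment runs in a throwaway process so that the trap_exit flag does not leak into the caller. Instead of crashing, the trapping process receives the task's exit as an ordinary message.

```elixir
defmodule CrashDemo do
  # Hypothetical module, for illustration only.
  def run do
    caller = self()
    ref = make_ref()

    # Run the experiment in its own process so that changing the
    # trap_exit flag does not affect whoever calls run/0.
    spawn(fn ->
      # Turn exit signals from linked processes into messages.
      Process.flag(:trap_exit, true)

      task = Task.async(fn -> raise "oops" end)
      pid = task.pid

      outcome =
        receive do
          # A raised exception exits the task with {exception, stacktrace}.
          {:EXIT, ^pid, {%RuntimeError{message: message}, _stacktrace}} ->
            {:task_crashed, message}
        after
          1_000 -> :no_exit_signal
        end

      send(caller, {ref, outcome})
    end)

    receive do
      {^ref, outcome} -> outcome
    after
      2_000 -> :timeout
    end
  end
end

IO.inspect(CrashDemo.run())
```

This should print {:task_crashed, "oops"}, along with an error report logged for the failed task. Without the trap_exit flag, the same exit signal would take the linked process down, which is what happened to the shell above.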

Why this matters

If you are coming to Elixir from a language like Scala, you might be tempted to treat Tasks like you would Futures. This is generally a bad idea. While they represent similar intents, their implementations mean that they provide different abstractions. A Future is a handle to a value that will eventually exist in shared state, readable by any thread. A Task is a handle to a response that will eventually be delivered to a single process. This follows the general pattern in Erlang/Elixir of not sharing state.

The effect of this difference needs to be understood when writing code.

In Scala we frequently provide libraries that return Futures. This allows us to compose functionality from a variety of sources and build pipelines of execution, with the underlying execution abstracted away. (A discussion of the various ExecutionContexts is beyond the scope of this post.) In the example below, responseA and responseB stand for Futures returned by such libraries:

scala> val response = for {  
     |   a <- responseA
     |   b <- responseB
     | } yield {
     |   a + b
     | }
response: scala.concurrent.Future[Int] = scala.concurrent.impl.Promise$DefaultPromise@3b05298d

scala> Await.result(response, 0 nanos)  

In Elixir we want to always be explicit about where code is being executed. Rather than have libraries launch tasks on our behalf, the better technique is for libraries to return functions. We can then launch these functions wherever we want:

iex(15)> fun1 = fn -> 25 end  
#Function<20.106461118/0 in :erl_eval.expr/5>
iex(16)> fun2 = fn -> 35 end  
#Function<20.106461118/0 in :erl_eval.expr/5>
iex(17)> response = Task.async fn ->  
...(17)>   responseA = Task.async fun1
...(17)>   responseB = Task.async fun2
...(17)>   Task.await(responseA) + Task.await(responseB)
...(17)> end
%Task{pid: #PID<0.109.0>, ref: #Reference<0.0.0.240>}
iex(18)> Task.await(response)  
60  
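
On the library side, this might look like the following sketch. HypotheticalLib and Caller are made-up names, the constant return values stand in for real work, and a recent Elixir release is assumed. The library only describes the work as zero-arity functions, and the caller decides which process runs them.

```elixir
defmodule HypotheticalLib do
  # Hypothetical library: returns functions describing the work
  # instead of starting any processes itself.
  def fetch_a, do: fn -> 25 end
  def fetch_b, do: fn -> 35 end
end

defmodule Caller do
  # Hypothetical caller: it owns both tasks, so it is the process
  # that is allowed to await them.
  def total do
    task_a = Task.async(HypotheticalLib.fetch_a())
    task_b = Task.async(HypotheticalLib.fetch_b())
    Task.await(task_a) + Task.await(task_b)
  end
end

IO.inspect(Caller.total())
# prints 60
```

Since the tasks are started and awaited in the same function, there is no risk of a Task handle leaking to a process that cannot await it.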

Conclusion

Elixir has been a blast to work with. The language gives you very powerful constructs for building robust systems, and understanding how to use them is crucial. In the case of Tasks, it is important to always be explicit about where code is executing. If you have any questions, feel free to ask here or on IRC in #elixir-lang.
