RethinkDB and Elixir - Part 1: Connections

This is going to be a big, multi-part post about my experiences building a RethinkDB driver for Elixir. The driver can be found on GitHub. Part 2 can be found here.


Building the connection was one of the more interesting pieces of the project. It required learning a lot about OTP patterns for fault tolerance. I highly recommend reading about them, as they highlight a lot of the subtleties that make OTP so great.

In this post I wanted to highlight the approach I took in building the connection code in the RethinkDB driver for Elixir. The connection has three responsibilities:


  1. Establish and manage a TCP connection to the server
  2. Serialize query and send to server
  3. Receive response from server and reply to client


RethinkDB.Connection was built on top of fishcakez/connection, a fantastic library that handles backoff and allows connecting asynchronously on start.


It's generally a good idea to expect network failures. The guarantees provided by RethinkDB.Connection do not extend to the network, so it does not guarantee that a connection is currently available. The default behavior is therefore to connect asynchronously: your application will start up successfully even during a network partition, but any attempted queries will result in a ConnectionClosed response. The connection will keep trying to reconnect with exponential backoff until it succeeds. If a connection is established and then broken, the process will clean up any pending queries and allow a supervisor to restart it, giving it a fresh state.
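To make this concrete, here is a minimal sketch of a connection process built on fishcakez/connection. The module name, options, and the fixed one-second retry are illustrative assumptions, not the driver's actual code:

```elixir
defmodule ExampleConn do
  use Connection

  def start_link(opts \\ []), do: Connection.start_link(__MODULE__, opts)

  def init(opts) do
    # Returning {:connect, info, state} makes the library call connect/2
    # *after* init returns, so start_link succeeds even when the
    # network is down.
    {:connect, :init, %{opts: opts, sock: nil}}
  end

  def connect(_info, %{opts: opts} = state) do
    host = Keyword.get(opts, :host, 'localhost')
    port = Keyword.get(opts, :port, 28015)

    case :gen_tcp.connect(host, port, [:binary, active: :once]) do
      {:ok, sock} ->
        {:ok, %{state | sock: sock}}

      {:error, _reason} ->
        # Retry in one second; a real implementation would grow this
        # delay exponentially on repeated failures.
        {:backoff, 1_000, state}
    end
  end

  def disconnect(_info, state) do
    # Stop so a supervisor restarts the process with fresh state;
    # pending callers receive an error reply when the process exits.
    {:stop, :normal, state}
  end
end
```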


Sometimes we can safely ignore network partitions. This is generally the case when the database is on the same host as the application. If so, specifying sync_connect: true will require the connection to be established before start_link returns successfully. This is especially useful if a local RethinkDB proxy is running on the application host.
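In practice that means passing the option at start; the option name comes from the driver, though the exact call shape here is an assumption:

```elixir
# Blocks until the TCP connection is established; start_link fails
# outright if the server is unreachable, rather than retrying.
{:ok, conn} = RethinkDB.Connection.start_link(sync_connect: true)
```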


The RethinkDB binary protocol assigns a token to each query. Each query sent over the wire has an accompanying token, and each response includes the token of the associated query. This allows us to interleave queries and responses and thereby share a single connection among multiple clients. Informal tests show little difference between using multiple connections and sharing one connection for most queries; in theory, however, large datasets (or large queries) will limit throughput per connection. The flow for a single query looks like this:

  1. Queries are made via a call to the connection process
  2. handle_call returns {:noreply, state}, with a mapping of token to from stored in state
  3. The TCP connection is active (technically, active: :once), so data is pushed to the process and handled by handle_info
  4. When a full response is received in handle_info, the token is used to look up the from value from the original handle_call. GenServer.reply(from, response) is then called.

This works very well. The Connection process never has to wait for a response and therefore never blocks.
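The steps above can be sketched with a plain GenServer. This toy version simulates the server's response by sending itself a message instead of writing to a socket, but the token bookkeeping and deferred GenServer.reply are the same pattern:

```elixir
defmodule TokenDemo do
  use GenServer

  # Client API: each query blocks the caller until a response
  # carrying the matching token arrives.
  def start_link(), do: GenServer.start_link(__MODULE__, :ok)
  def query(pid, q), do: GenServer.call(pid, {:query, q})

  def init(:ok), do: {:ok, %{token: 0, pending: %{}}}

  def handle_call({:query, q}, from, %{token: t, pending: p} = state) do
    # In the real driver, the serialized query and its token go out
    # over TCP here. We simulate the server echoing a reply.
    send(self(), {:response, t, {:echo, q}})
    # Don't reply yet; remember who asked, keyed by token.
    {:noreply, %{state | token: t + 1, pending: Map.put(p, t, from)}}
  end

  def handle_info({:response, token, resp}, %{pending: p} = state) do
    # Look up the waiting caller by token and reply out of band.
    {from, rest} = Map.pop(p, token)
    GenServer.reply(from, resp)
    {:noreply, %{state | pending: rest}}
  end
end
```

Because handle_call returns {:noreply, state} immediately, the connection process is free to accept more queries while earlier responses are still in flight.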

Serialization and Deserialization

All serialization and deserialization is done in the client process. Moving those CPU-intensive tasks into the server process drops throughput by an order of magnitude. By making the clients handle it, we can distribute the load and maximize throughput.
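The idea is simply to do the encode/decode work before and after the call into the connection process. This sketch uses :erlang.term_to_binary as a stand-in codec (the real driver speaks JSON per the RethinkDB wire protocol) and a fake call that echoes the payload back:

```elixir
defmodule ClientSideCodec do
  # Stand-in codec; assumption for illustration only.
  def encode(query), do: :erlang.term_to_binary(query)
  def decode(binary), do: :erlang.binary_to_term(binary)

  def run(conn, query) do
    payload = encode(query)            # CPU work happens in the caller
    response = fake_call(conn, payload)
    decode(response)                   # and again on the way back
  end

  # Placeholder for a GenServer.call against a real connection process,
  # which would only shuttle opaque binaries; here it just echoes.
  defp fake_call(_conn, payload), do: payload
end
```

With many concurrent clients, this spreads the serialization cost across their schedulers instead of funneling it through the single connection process.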


Take a look at the documentation for more information. The library has not hit 1.0.0 yet, so nothing is set in stone; however, the API is meant to be stable. Feel free to file any issues or raise any ideas.
