
nasseri.io

The Blog of Dean Nasseri

Threading in MRI Ruby for Fun and Performance

Posted on Thursday, Dec 24, 2015

As software engineers, we frequently run into situations where we want our code to do multiple things at once. This is the problem of concurrency. Concurrency has countless use cases; one is that of a webserver. A non-concurrent webserver can only handle one request at a time. Imagine your site received several requests at once: the server must process each request in turn, unable to move on to the next request in the queue until the previous one has completed. To alleviate this, we might use a concurrent webserver, which is capable of processing multiple requests at once.

Concurrency vs Parallelism

Before I continue any further, I would like to clear up a point of confusion. This is well-worn territory, but to quote Rob Pike, [concurrency is not parallelism](https://blog.golang.org/waza-talk). Concurrency means multiple paths of execution can run and complete in the same time period. It does not mean those tasks are running at the exact same instant. In a concurrent program using threads on a computer with a single CPU core, the OS switches between the threads, and at no point are the threads actually executing simultaneously, even if they appear to be. Parallelism means the threads are literally executing simultaneously. Imagine one ATM with two queues: only one person uses the ATM at a time, and the next user alternates between the two queues. This is concurrency. Now imagine two ATMs, each with its own queue. This is parallelism.

Global Interpreter Lock

MRI has a global interpreter lock (GIL). The GIL means that in a multi-threaded context, only one thread can execute Ruby code at any moment in time. Because of the GIL, true parallelism is not possible in Ruby programs: even if we run our program on multiple cores, the GIL prevents more than one thread from executing Ruby code at a time. This is a significant tradeoff of the language, but the GIL also means that we don't have to worry as much about corrupted data or race conditions within C extensions. For these reasons, threading is much easier to deal with in Ruby than in many other languages, even if it is less powerful.

“But hold on!” you might exclaim. “If only one thread can execute at a time, how will my code run any faster when using multiple threads of execution?”

Great question! The answer is that the GIL only applies to Ruby operations, the work the Ruby interpreter itself is doing. Blocking operations such as I/O or waiting on a database query release the GIL, so they are free to proceed on multiple threads at the same time.
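To see this in action, here is a small sketch of my own (sleep stands in for blocking I/O; exact timings will vary by machine):

```ruby
require "benchmark"

# sleep here stands in for blocking I/O such as a network call.
# MRI releases the GIL while a thread is blocked, so three one-second
# "requests" running on threads finish in roughly one second, not three.
elapsed = Benchmark.realtime do
  threads = 3.times.map { Thread.new { sleep 1 } }
  threads.each(&:join)
end
puts format("elapsed: %.2fs", elapsed)
```

If you replace the threads with three sequential calls to sleep 1, the elapsed time jumps to roughly three seconds.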

Basic Threading

In Ruby, concurrency can be achieved using threads. Since version 1.9, MRI Ruby has supported native (OS level) threads.

Take a look at the following code.

thr = Thread.new { puts "Hello, thread world!" }

Running this program will yield… nothing. The issue is that the main thread creates a new thread, but as soon as it does the program terminates, and the second thread is terminated alongside it. What we have to do is suspend the execution of the main thread and wait for the second thread to terminate. To do this we have to join the thread.

Thread#join

Thread#join is what allows us to wait for another thread to terminate. Typically, when a Ruby program exits, all running threads are killed. However, when we join a thread, execution of the calling thread is suspended until the joined thread has finished running.

thr = Thread.new { puts "Hello, thread world!" }
thr.join

Running the program now will yield “Hello, thread world!” to STDOUT as expected.

Now let's move on to another example and try to really visualize what happens when we start creating and joining threads.

thr1 = Thread.new do
 puts 'a'
 sleep 2
 puts 'b'
 sleep 2
 puts 'c'
end
thr2 = Thread.new do
 puts '1'
 sleep 2
 puts '2'
 sleep 2
 puts '3'
end
thr1.join
thr2.join

Running this program on my machine a few times yields a few different outcomes. Take a look.

$ ruby ruby-threading-ex2.rb
a
1
b
2
3
c
$ ruby ruby-threading-ex2.rb
1
a
2
b
3
c

The inconsistent results lead to some interesting conclusions. Though we created thr1 first, Ruby makes no guarantee that it will start running before thr2. Furthermore, there is no guarantee of the order in which the blocks passed to each thread will execute.
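When threads need to hand results to each other safely despite this unpredictable interleaving, Ruby's built-in thread-safe Queue is a handy tool. A minimal sketch of my own (the worker labels are illustrative):

```ruby
queue = Queue.new # Queue is thread-safe out of the box

producers = 2.times.map do |i|
  Thread.new do
    3.times { |n| queue << [i, n] } # push [worker id, step] tuples
  end
end
producers.each(&:join)

# Interleaving between the two threads is not guaranteed, but each
# thread's own pushes arrive in order, and no push is ever lost.
results = []
results << queue.pop until queue.empty?
puts results.size # => 6
```

No Mutex is needed here; Queue handles the locking internally.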

Thread and Fiber Local Variables

Often, we need a way to reach into a given thread and pull out a value. We can do so using one of two kinds of variables. The first is a thread-local variable, set on the thread itself:

Thread.current.thread_variable_set("foo", "bar")

The second is a fiber-local variable, set on the current fiber:

Thread.current["foo"] = "bar"

The difference is a matter of scoping. Here is an example from the docs:

Thread.new {
  Thread.current.thread_variable_set("foo", "bar") # set a thread local
  Thread.current["foo"] = "bar"                    # set a fiber local
  Fiber.new {
    Fiber.yield [
      Thread.current.thread_variable_get("foo"), # get the thread local
      Thread.current["foo"],                     # get the fiber local
    ]
  }.resume
}.join.value # => ['bar', nil]

Thread locals are carried along with the thread and are shared by every fiber running on it, while fiber locals belong to a single fiber, which is why the fiber local comes back nil above. That is the only difference we will concern ourselves with here; there will be more on fibers in part 2 of this series of blog posts. For now, know that we can set fiber locals, and that we can still read them after the thread has finished executing. So we might do some computationally expensive work in a thread, assign the result to a fiber local, and then read it out once the thread has finished. For instance:

threads = []
threads << Thread.new do
  sleep(2) # do something expensive
  Thread.current[:foo] = 'result1'
end
threads << Thread.new do
  sleep(2) # do something expensive
  Thread.current[:foo] = 'result2'
end
threads.map(&:join).each do |thr|
  puts thr[:foo]
end

$ ruby fiber-local-ex.rb
result1
result2

This is an extremely common pattern when implementing concurrency in ruby.
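The pattern generalizes into a small helper. This sketch is my own illustration, not code from any particular library:

```ruby
# Run each block in its own thread, stash its result in a fiber local,
# then join every thread and collect the results in the original order.
def run_concurrently(*blocks)
  blocks
    .map { |blk| Thread.new { Thread.current[:result] = blk.call } }
    .map { |thr| thr.join[:result] } # join returns the thread itself
end

results = run_concurrently(
  -> { sleep(0.2); 'result1' }, # stand-ins for expensive work
  -> { sleep(0.2); 'result2' }
)
puts results.inspect # => ["result1", "result2"]
```

Note that thr.join returns the thread, so we can chain the fiber-local lookup directly onto it.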

A Practical Example

Recently at VTS we rolled out a new filtering system across the entire site. Upon each request, the filters were performing a series of expensive aggregate queries. Each query looked something like this:

def deal_types
  Deal
    .where{ deals.id.in(my{ deal_ids }) }
    .group{ type }
    .pluck_all(
      'deals.type as item_id',
      'deals.type as label',
      "sum(case when deals.id in (#{ deal_id_subset exclude: :types }) then 1 else 0 end) as count"
    )
end

The result set of all the queries was serialized as a hash as follows:

def as_json
  {
    deal_types: deal_types,
    # ...
  }
end

The queries ran one at a time, and the filter response began timing out. After tuning the queries as best we could, we still had timeout issues. One of our engineers came up with the solution to use threads and run the queries concurrently.

def parallelize
  Thread.new do
    Thread.current[:output] = yield
    ActiveRecord::Base.clear_active_connections!
  end
end

The new queries were then called and serialized as follows.

def as_json_paralellized
  {
    deal_types: parallelize { deal_types },
    # ...
  }.each_value(&:join).map { |k, v| [k, v[:output]] }.to_h
end

Each query runs in its own thread. We then join the threads and read the results out of the fiber locals. Benchmarks revealed this method to be significantly faster, with about a third as much user CPU time (the number on the far left) and wall-clock time (the parenthesized number on the far right). These benchmarks were taken against a local server, not our production application.

viewthespace(dev)> puts Benchmark.measure { as_json }
0.310000 0.020000 0.330000 ( 0.415927)
nil
viewthespace(dev)> puts Benchmark.measure { as_json_paralellized }
0.100000 0.020000 0.120000 ( 0.154479)
nil

The reason the concurrent version runs faster is that, as I said earlier, the GIL only cares about Ruby work. Once a SQL query has been sent, the Ruby process releases the GIL while it waits for the database to respond. This makes it an ideal use case for concurrency in Ruby, since time spent waiting on I/O does not contend for the GIL.
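To see the flip side, compare CPU-bound pure-Ruby work, which the GIL serializes no matter how many threads you use. A rough sketch of my own (timings will vary by machine):

```ruby
require "benchmark"

# Naive Fibonacci: pure-Ruby, CPU-bound work that never releases the GIL.
def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

serial   = Benchmark.realtime { 2.times { fib(25) } }
threaded = Benchmark.realtime { 2.times.map { Thread.new { fib(25) } }.each(&:join) }

# Unlike the I/O-bound case, the threaded version is no faster under MRI:
# only one thread can execute Ruby code at a time.
puts format("serial: %.3fs, threaded: %.3fs", serial, threaded)
```

Under a runtime without a GIL (JRuby, for example), the threaded version could genuinely use two cores.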

Notice that we must call

ActiveRecord::Base.clear_active_connections!

inside each thread. Every thread checks out its own connection from ActiveRecord's pool, and this call returns it once the thread's work is done; without it, those connections would leak.

Additionally, when we first deployed this code, we ran into errors where our unicorn workers would run out of ActiveRecord connections. We had stuck with the default limit of 5 maximum connections per worker. But in this case we were going to surpass that, because each thread we spawned required its own database connection. Since there were 7 aggregate queries, we bumped the limit up to 8. One connection for each possible thread running an aggregate query, and one for the main thread.

The takeaway here is that, while we did see significant performance gains, it was obvious that we were going “outside the framework” in a way that added complexity, and might seem un-rubyish. When it comes to your own application, consider these tradeoffs before deciding which solution to implement.

I hope this article has made threading in ruby a little easier to understand and shown how it can be used practically in a modern application. Thanks for reading!
