Symbols Finally Arrive in MRI Ruby 2.2.0
MRI Ruby 2.2.0 was released a couple of days ago. I want to use this opportunity to talk about one of the new features shipping with 2.2.0 and explain why it matters: garbage-collectable Symbols.
There was no garbage collection for Symbols before 2.2.0, meaning that every Symbol object you created was kept in memory permanently. Inevitably, this led to problems, because programmers had to keep memory management in mind when working with Symbols in order to avoid leaks.
The Ruby community has discussed this issue for a long time, but nobody was able to deliver a practical solution until now. With the release of Ruby 2.2.0, the problem has finally been resolved. To understand how the new garbage collection mechanism for Symbols works, we have to distinguish between hard-coded and dynamically created Symbols.
The hard-coded variety never posed any problems, because every Ruby program contains only a manageable number of hard-coded Symbols. In fact, hard-coded Symbols are still not garbage collectable in 2.2.0, which is easy to demonstrate:
2.2.0 :001 > GC.start
=> nil
2.2.0 :002 > Symbol.all_symbols.size
=> 3312
2.2.0 :003 > :foobar
=> :foobar
2.2.0 :004 > Symbol.all_symbols.size
=> 3313
2.2.0 :005 > GC.start
=> nil
2.2.0 :006 > Symbol.all_symbols.size
=> 3313
Evaluating :foobar at prompt 003 increased the total number of Symbols (measured with Symbol.all_symbols) by 1. However, the count didn’t decrease after garbage collection. Therefore, we can conclude that the garbage collector doesn’t reclaim hard-coded Symbols.
On the other hand, there are dynamically created Symbols. That’s the variety we have to worry about. Most commonly, these Symbols are created by (directly or indirectly) converting Strings with String#to_sym. For illustration, we can perform the same experiment again, but create the new Symbol with String#to_sym instead:
2.2.0 :001 > GC.start
=> nil
2.2.0 :002 > Symbol.all_symbols.size
=> 3312
2.2.0 :003 > "foobar".to_sym
=> :foobar
2.2.0 :004 > Symbol.all_symbols.size
=> 3313
2.2.0 :005 > GC.start
=> nil
2.2.0 :006 > Symbol.all_symbols.size
=> 3312
As you can see, our dynamically created Symbol has been garbage collected: the total number of Symbols (measured at prompt 006) decreased by 1. This wouldn’t have happened in Ruby 2.1.5.
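The same experiment can be run as a standalone script with many Symbols at once, which makes the effect easier to see than with a single :foobar. What follows is a minimal sketch, not a rigorous benchmark: the method names (symbol_count, create_dynamic_symbols, measure_symbol_gc) and the "gc_demo_symbol_" prefix are made up for this illustration. On 2.2.0 or later I would expect the last number to drop back towards the first one, while on 2.1.5 it should stay at the higher value.

```ruby
# Returns the current size of the symbol table. The temporary array is
# emptied so that a stale reference to it cannot keep Symbols alive.
def symbol_count
  all = Symbol.all_symbols
  count = all.size
  all.clear
  count
end

# Creates `n` Symbols from Strings built at runtime, then reports the
# symbol table size while those Symbols are still referenced.
def create_dynamic_symbols(n)
  symbols = Array.new(n) { |i| "gc_demo_symbol_#{i}".to_sym }
  count = symbol_count
  symbols.clear # drop our references so the Symbols become garbage
  count
end

# Hypothetical helper: symbol table size before creation, after creation,
# and after a forced garbage collection.
def measure_symbol_gc(n)
  GC.start
  before = symbol_count
  after_create = create_dynamic_symbols(n)
  GC.start
  after_gc = symbol_count
  { before: before, after_create: after_create, after_gc: after_gc }
end

p measure_symbol_gc(10_000)
```

The Symbols are created in a separate method on purpose, so that no local variable in the measuring method still points at them when GC.start runs.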
All my previous examples are more or less benign, because there’s only a single Symbol object involved. Things start to get more interesting if your program creates Symbols based on user-supplied data in long-running processes.
One of the places where you may encounter this in the wild is the handling of HTTP parameters. For instance, the following Rack application splits the request’s query string and calls #to_sym on each of the resulting substrings.
run Proc.new { |env|
  params = env['QUERY_STRING'].split(',')
  params.each(&:to_sym) # symbolize params...
  ['200', {}, []]
}
For applications like this one, the improvements in 2.2.0 are supposed to make a big difference. The Rack application above is perfectly fine in Ruby 2.2.0, but it leaks memory in 2.1.5. As established already, Symbols created with #to_sym are garbage collectable in 2.2.0, but not in 2.1.5.
We can start the Rack application and monitor its memory consumption with ps while flooding it with HTTP requests at the same time. If we do that once for 2.1.5 and once more for 2.2.0, we can plot the resulting data to get a better understanding of the different behavior.
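For anyone who wants to repeat the measurement, here is a rough sketch of what such a driver script could look like. It is not the exact script behind my numbers, and several things in it are assumptions: the helper names (unique_query, parse_rss_megabytes, rss_megabytes, flood), the choice of the resident set size as the memory metric, the ps invocation `ps -o rss= -p PID` (which reports kilobytes on Linux and macOS), and the host, port and PID in the usage comment.

```ruby
require 'net/http'

# Builds a query string of comma-separated names that never repeat across
# requests, so every request makes the app convert previously unseen Strings.
def unique_query(request_index, params_per_request = 10)
  Array.new(params_per_request) { |i| "p#{request_index}_#{i}" }.join(',')
end

# Converts the output of `ps -o rss=` (resident set size in kilobytes)
# to megabytes.
def parse_rss_megabytes(ps_output)
  Integer(ps_output.strip) / 1024.0
end

# Asks ps for the resident set size of the given process.
def rss_megabytes(pid)
  parse_rss_megabytes(`ps -o rss= -p #{Integer(pid)}`)
end

# Sends `requests` GET requests and samples the server's memory every
# `sample_every` requests. Returns an array of [request_number, megabytes].
def flood(host, port, server_pid, requests: 100_000, sample_every: 1_000)
  samples = []
  Net::HTTP.start(host, port) do |http|
    1.upto(requests) do |n|
      http.get("/?#{unique_query(n)}")
      samples << [n, rss_megabytes(server_pid)] if (n % sample_every).zero?
    end
  end
  samples
end

# Hypothetical usage, with the Rack app listening on port 9292 as PID 12345:
#   flood('localhost', 9292, 12_345).each { |n, mb| puts "#{n}\t#{mb}" }
```

The tab-separated output can be fed directly into a plotting tool. Sampling every 1,000 requests is an arbitrary choice to keep the overhead of calling ps low.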
Here’s the memory consumption (megabytes) graph for 100,000 requests:
Well, the 2.1.5 graph is pretty much what I expected. However, the 2.2.0 graph is almost identical – I’m still puzzled by that.
In theory, memory management has been vastly improved in 2.2.0 with the addition of garbage-collectable Symbols. In practice, long-running processes nonetheless suffer from a significant build-up in memory consumption over time.
Overall, my benchmark seems to indicate that Ruby’s memory management is far from perfect. A lot of work remains to be done…
Is my conclusion wrong? Do you have different experiences with 2.2.0?
Notes
Software: MRI Ruby 2.1.5-p273, MRI Ruby 2.2.0-p0, Rack 1.6.0, and ps.