Episode #045: Hash Default Value

Episode Script

Back in Episode 32, we learned about passing a block to the Hash constructor in order to provide a default value for missing keys.

text = <<END
I'm your only friend
I'm not your only friend
But I'm a little glowing friend
But really I'm not actually your friend
But I am
END

word_count = Hash.new do |hash, missing_key|
  hash[missing_key] = 0
end

text.split.map(&:downcase).each do |word|
  word_count[word] += 1
end
word_count

If you’ve ever looked through the Hash documentation, you might have noticed that Hash.new can also take its default value as a plain argument instead of a block. And in fact, this form works just fine for our word count hash:

word_count = Hash.new(0)

text.split.map(&:downcase).each do |word|
  word_count[word] += 1
end
word_count
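
The argument form is safe here because `+=` never mutates the default object. As a minimal sketch (the name `counts` is just for illustration), `counts[word] += 1` behaves like `counts[word] = counts[word] + 1`: it reads the default, computes a new Integer, and assigns that to the key.

```ruby
counts = Hash.new(0)

# Reading a missing key returns the default without storing anything.
counts["friend"]        # => 0
counts.key?("friend")   # => false

# += is shorthand for read, add, assign; the assignment stores the key.
counts["friend"] += 1
counts["friend"]        # => 1
counts.key?("friend")   # => true

# The default itself is untouched.
counts.default          # => 0
```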

When we use the default value argument in other scenarios, though, we run into trouble. For instance, here’s a hash whose default value for missing keys is an empty array. The idea is to append values to hash members without explicitly initializing them, and at first it seems to work.

h = Hash.new([])
h["IPAs"] << "Victory HopDevil"
h["IPAs"] << "Weyerbacher Double Simcoe"
h["IPAs"] 
# => ["Victory HopDevil", "Weyerbacher Double Simcoe"]

But when we start adding values to more than one key, we discover a problem. All of our values are being appended to a single array!

h = Hash.new([])
h["IPAs"] << "Victory HopDevil"
h["IPAs"] << "Weyerbacher Double Simcoe"
h["Stouts"] << "Victory Storm King"
h["Stouts"] 
# => ["Victory HopDevil", "Weyerbacher Double Simcoe", "Victory Storm King"]

If we give a name to our default value array, we can see what is happening more clearly: all of the values are being added to the single array that we passed in as the default value.

default = []
h = Hash.new(default)
h["IPAs"] << "Victory HopDevil"
h["IPAs"] << "Weyerbacher Double Simcoe"
h["Stouts"] << "Victory Storm King"
default
# => ["Victory HopDevil", "Weyerbacher Double Simcoe", "Victory Storm King"]
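
One way to confirm this, sketched here with a fresh hash: lookups for different missing keys return the very same object, and because `<<` mutates that object rather than assigning into the hash, no keys are ever stored.

```ruby
default = []
h = Hash.new(default)
h["IPAs"] << "Victory HopDevil"
h["Stouts"] << "Victory Storm King"

# Every missing key hands back the one default object.
h["IPAs"].equal?(default)      # => true
h["IPAs"].equal?(h["Stouts"])  # => true

# Nothing was ever assigned, so the hash itself is still empty.
h.keys                         # => []
h.size                         # => 0
```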

This happens because Hash hands back the same default object for every missing key. It doesn’t duplicate it before use. By contrast, when we use a default block instead of a default value, the block is executed every time a missing key is looked up. Our block creates a new Array object each time and stores it under that key.

h = Hash.new { |hash, key| hash[key] = [] }
h["IPAs"] << "Victory HopDevil"
h["IPAs"] << "Weyerbacher Double Simcoe"
h["Stouts"] << "Victory Storm King"
h["Stouts"]
# => ["Victory Storm King"]
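
Checking object identity with `equal?` on the block form shows that each key gets its own array, and that the keys really are stored because the block assigns into the hash. This is a small sketch; the names `beers`, `hash`, and `key` are chosen just for this example.

```ruby
beers = Hash.new { |hash, key| hash[key] = [] }
beers["IPAs"] << "Victory HopDevil"
beers["Stouts"] << "Victory Storm King"

# Each key holds a distinct Array object.
beers["IPAs"].equal?(beers["Stouts"])  # => false

# The block's assignment stored both keys.
beers.keys                             # => ["IPAs", "Stouts"]
beers["IPAs"]                          # => ["Victory HopDevil"]
```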

This behavior of Hash default values is one of those non-obvious gotchas that turns into a real head-scratcher the first time you run into it. I hope that by showing it to you today, I’ve saved you some debugging time down the road.

Until next episode, happy hacking!