
Wednesday, January 19, 2011

How To Build Hashes

A hash is an associative array. Instead of numerical indexes, any object (though typically a symbol) can be used as a key. Hashes are both a fast way to retrieve an object based on a key and a convenient key/value store. There's no doubt that hashes are one of the most used types in Ruby.
But how do you build a Hash? How do you insert data into it other than assigning keys to values one at a time? There are a few methods of building hashes you should know about.

Writing a hash literal is by far the most common way to put data into a Hash. It is best suited for small hashes, one-off parameters to methods, or struct-like objects. The most common way to write a literal is the "rocket" syntax.

#!/usr/bin/env ruby

hash = {
 :key => 'value',
 :key2 => 'value2'
}

p hash
This syntax works on every version of Ruby. Ruby 1.9 introduced a second, shorter syntax. It works on 1.9 and later but not on 1.8 or earlier, so be aware of that before using it.

#!/usr/bin/env ruby

hash = {
 key: 'value',
 key2: 'value2'
}

p hash
This is a much cleaner syntax, and reduces the amount of "line noise" (AKA special characters) in your code. However, it assumes that your key is a symbol. For any other kind of key (numbers, for example), you still need the rocket syntax.
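To make the symbol-only restriction concrete, here is a small sketch that mixes both syntaxes in one literal. The method name mixed_keys and its contents are made up for illustration, and it assumes Ruby 1.9 or later.

```ruby
#!/usr/bin/env ruby

# Illustrative method (not from the article): returns a hash whose keys
# are a Symbol, an Integer, and a String.
def mixed_keys
  {
    name: 'ruby',        # Symbol key, same as :name => 'ruby'
    1 => 'one',          # Integer key: only the rocket works here
    'str' => 'a string'  # String key: only the rocket works here
  }
end

p mixed_keys
```

Both forms produce ordinary hash entries, so nothing downstream can tell which syntax was used.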


Storing your hash in a text file has numerous advantages. First, it can be edited without touching your code. Second, it can be sent over the network as a normal file, and you don't have to worry about loading code from a third party. Third, it lets you interoperate with other software, such as sending JavaScript a hash encoded as JSON.
There are two very popular ways to serialize a hash: YAML and JSON.
YAML stands for "YAML Ain't Markup Language." It's used throughout the Ruby world for a number of things, simple serialization and configuration files being the most common. It has a pretty simple syntax, using whitespace to structure the document. First, here is what the YAML document looks like. It has a single hash called "languages," which contains 3 key/value pairs.

---
languages:
 ruby: awesome
 python: also awesome
 perl: line noise
And the code to load this.

#!/usr/bin/env ruby
require 'yaml'

# load_file opens, parses, and closes the file for us
yaml = YAML.load_file('hash.yaml')
hash = yaml['languages']

p hash
Notice one change between this and the previous hashes: the keys are strings instead of symbols. Other than that, the hash that it produces is the same as one from a Hash literal.
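If the rest of your code expects symbol keys, the string keys can be converted after loading. The following is only a sketch: symbolize_keys and load_languages are hypothetical helper names, and the YAML is held in a string constant so the example runs without a hash.yaml file.

```ruby
#!/usr/bin/env ruby
require 'yaml'

# Same document as above, inlined so the sketch is self-contained
YAML_TEXT = <<EOS
---
languages:
  ruby: awesome
  python: also awesome
  perl: line noise
EOS

# Hypothetical helper: parse the text and pull out the "languages" hash
def load_languages(text)
  YAML.load(text)['languages']
end

# Hypothetical helper: copy a hash, turning each key into a Symbol
def symbolize_keys(hash)
  hash.each_with_object({}) { |(key, value), out| out[key.to_sym] = value }
end

p load_languages(YAML_TEXT)                  # keys are Strings
p symbolize_keys(load_languages(YAML_TEXT))  # keys are Symbols
```

Only the top level is converted here; nested hashes would need the helper applied recursively.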
Next is JSON, another common serialization format. It relies on braces and brackets rather than whitespace for structure, and is preferred over YAML in some settings, such as exchanging data with JavaScript.
{
 "languages": {
    "ruby": "awesome",
    "python": "also awesome",
    "perl": "line noise"
 }
}
And the code to load this hash. Note that the json library ships with Ruby 1.9 and later; on Ruby 1.8 you need to install the json gem first.
#!/usr/bin/env ruby
require 'json'

json = JSON.load( File.read('hash.json') )
hash = json['languages']

p hash
Note, again, that strings are used for keys instead of symbols.
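The json library can also hand back symbol keys directly through the symbolize_names option of JSON.parse. The sketch below assumes the JSON text is held in a string rather than in hash.json, and the two method names are invented for the example.

```ruby
#!/usr/bin/env ruby
require 'json'

# A shortened version of the document above, inlined for the sketch
JSON_TEXT = '{"languages": {"ruby": "awesome", "perl": "line noise"}}'

# Default behaviour: every key comes back as a String
def parse_with_string_keys(text)
  JSON.parse(text)['languages']
end

# With symbolize_names, every key (nested ones too) comes back as a Symbol,
# so the outer key is :languages rather than 'languages'
def parse_with_symbol_keys(text)
  JSON.parse(text, symbolize_names: true)[:languages]
end

p parse_with_string_keys(JSON_TEXT)
p parse_with_symbol_keys(JSON_TEXT)

# Going the other way: turn a hash back into JSON text
puts JSON.generate(parse_with_symbol_keys(JSON_TEXT))
```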


Hashes can be built by merging them with other hashes. This is most commonly used to provide a set of default values to hashes, and then overwrite them with a set of desired values without having to define every desired value.
In the following example, the method takes a hash as an argument. It then merges it with the defaults hash. The merge works by iterating over the argument and, for every key, writing the value from the argument hash over the corresponding entry from the defaults hash. And, like many Ruby methods, merge comes in two variants. The merge method returns a new, third hash, leaving both originals intact. The merge! method is destructive: it modifies the receiver (here, that would be the defaults hash) in place.

#!/usr/bin/env ruby
require 'pp'

def meth(arg)
 defaults = {
    method: :fast,
    depth: 7,
    delay: 100,
    failure: :ignore
 }

 opts = defaults.merge(arg)
 pp opts
end

meth({method: :slow, failure: :retry})
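The difference between merge and merge! is easiest to see side by side. This is a minimal sketch; apply_overrides and apply_overrides! are hypothetical wrapper names used only to label the two behaviours.

```ruby
#!/usr/bin/env ruby

# Non-destructive: returns a new hash and leaves defaults alone
def apply_overrides(defaults, overrides)
  defaults.merge(overrides)
end

# Destructive: writes the overrides into defaults itself
def apply_overrides!(defaults, overrides)
  defaults.merge!(overrides)
end

defaults = { method: :fast, depth: 7 }

p apply_overrides(defaults, { depth: 3 })   # new hash with depth 3
p defaults                                  # still has depth 7

p apply_overrides!(defaults, { depth: 3 })  # same hash object, now changed
p defaults                                  # now has depth 3
```

For default options, the non-destructive form is usually the safer choice, since the defaults stay reusable for the next call.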
It's often useful to build hashes from an array. This can be done easily using the Hash[ … ] syntax. It takes any even number of arguments, reading them two at a time and using the first of each pair as a key and the second as a value. So, for example, Hash[ :key1, 'value', :key2, 'value' ] is equivalent to {:key1 => 'value', :key2 => 'value' }.
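Here is a quick sketch of that behaviour, including what happens when the argument count is odd. The build_hash wrapper is a made-up name that simply forwards its arguments to Hash[ … ].

```ruby
#!/usr/bin/env ruby

# Hypothetical wrapper: passes a flat list of key, value, key, value, ...
# straight through to Hash[]
def build_hash(*items)
  Hash[*items]
end

from_list = build_hash(:key1, 'value', :key2, 'value')
literal   = { :key1 => 'value', :key2 => 'value' }
p from_list == literal

# An odd number of items leaves a key without a value, so Ruby raises
begin
  build_hash(:key1, 'value', :key2)
rescue ArgumentError => e
  puts "ArgumentError: #{e.message}"
end
```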
This is useful in all manner of situations. Suppose, for example, you're reading some comma-separated values from a text file, and you wish to turn each row into a hash. The first line holds the column names. How do you read this file and turn it into an array of hashes?
First, let's read the first line to get the column names, then zip those names with the values on each following line. "Zipping" two arrays combines them pairwise. So, if you were to zip ['a', 'b', 'c'] with [1,2,3], you would get [ ['a',1], ['b',2], ['c',3] ]. If you were to then flatten that to produce [ 'a', 1, 'b', 2, 'c', 3 ], splat it, and pass it to Hash[ … ], you would get a hash equivalent to {'a' => 1, 'b' => 2, 'c' => 3}. This sounds complicated, but it's not difficult once you start playing with it, and it's very powerful.
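The zip, flatten, and splat steps can be tried in isolation. The sketch below uses two invented method names: zip_flatten follows the steps just described, while zip_pairs is a variation that relies on Hash[ … ] also accepting an array of [key, value] pairs directly (Ruby 1.8.7 and later), which skips the flatten.

```ruby
#!/usr/bin/env ruby

# The approach described above: zip, flatten, splat
def zip_flatten(keys, values)
  Hash[*keys.zip(values).flatten]
end

# Variation (an alternative, not the article's method): pass the pairs as-is
def zip_pairs(keys, values)
  Hash[keys.zip(values)]
end

p zip_flatten(['a', 'b', 'c'], [1, 2, 3])
p zip_pairs(['a', 'b', 'c'], [1, 2, 3])

# The two differ when a value is itself an array: flatten would pull the
# nested array apart and leave an odd number of items, so only the pairs
# version handles this input
p zip_pairs(['a', 'b'], [[1, 2], 3])
```

For rows of plain strings, as in the CSV case, both versions give the same result.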
The code is pretty straightforward. The first part just loads the column names and keeps them for use in the next line. The next line does all the work, and it helps to read it inside-out: for each line, split on commas, zip the column names with the resulting values, flatten the list, splat it, and pass it to the special Hash constructor.
#!/usr/bin/env ruby
require 'pp'

def readcsv(file)
  File.open(file) do|f|
    columns = f.readline.chomp.split(',')
    f.readlines.map{|l|
      Hash[ *columns.zip(l.chomp.split(',')).flatten ]
    }
  end
end

pp readcsv('data.csv')
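To try readcsv without creating data.csv by hand, the sketch below writes some sample rows to a temporary file first. The sample data and the readcsv_from_text helper are made up for the example, readcsv is restated so the snippet runs on its own, and Tempfile.create assumes Ruby 2.1 or later.

```ruby
#!/usr/bin/env ruby
require 'pp'
require 'tempfile'

# Restated from above so this sketch is self-contained
def readcsv(file)
  File.open(file) do |f|
    columns = f.readline.chomp.split(',')
    f.readlines.map { |l| Hash[*columns.zip(l.chomp.split(',')).flatten] }
  end
end

# Hypothetical helper: write the text to a temporary file, then parse it
def readcsv_from_text(text)
  rows = nil
  Tempfile.create('data') do |tmp|
    tmp.write(text)
    tmp.flush                 # make sure the text is on disk before reading
    rows = readcsv(tmp.path)
  end
  rows
end

# Made-up sample data: a header line followed by two rows
SAMPLE_CSV = "name,language\nalice,ruby\nbob,perl\n"

pp readcsv_from_text(SAMPLE_CSV)
```

Keep in mind that splitting on commas is only good for simple files. Quoted fields that contain commas will be split incorrectly, and Ruby's standard csv library is the usual tool for those.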
