Monday, 27 October 2008

YAML

YAML is a format for structuring data in a file, like XML. Rails uses YAML to configure your databases, but it does so in such a way that you hardly notice. However, YAML is pretty neat, and you could be missing out.

YAML stands for "Yet Another Markup Language". Or perhaps for "YAML Ain't Markup Language". I guess it depends on whether you think it is a markup language or not (the latter is the one the official specification uses).

There are basically two types of data: that which goes into a hash (name and value pairs), and that which goes into an array (values prefixed with a dash).
name: value
- value
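
As a minimal sketch of how those two forms come out in Ruby (the names and values here are made up for illustration), a mapping loads as a Hash and a sequence loads as an Array:

```ruby
require 'yaml'

# A mapping ("name: value" lines) becomes a Ruby Hash.
hash_result = YAML.safe_load("name: Tom\nage: 32\n")

# A sequence ("- value" lines) becomes a Ruby Array.
array_result = YAML.safe_load("- Tom\n- Dick\n")

p hash_result
p array_result
```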

You can combine them to make complex data structures, such as this array of hashes:
-
  name: Tom
  age: 32
-
  name: Dick
  age: 19

And in fact you can also enter arrays and hashes more like Ruby:
- { name: Tom, age: 32}
- { name: Dick, age: 19}

Or as an array of arrays:
- [Tom, 32]
- [Dick, 19]
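
To check that the indented form and the Ruby-like form really describe the same data, here is a small sketch that loads each one and compares the results. The variable names are my own; the YAML is the same as in the examples above.

```ruby
require 'yaml'

block_style = <<~YAML_TEXT
  -
    name: Tom
    age: 32
  -
    name: Dick
    age: 19
YAML_TEXT

flow_style = <<~YAML_TEXT
  - { name: Tom, age: 32 }
  - { name: Dick, age: 19 }
YAML_TEXT

nested_arrays = <<~YAML_TEXT
  - [Tom, 32]
  - [Dick, 19]
YAML_TEXT

people_block = YAML.safe_load(block_style)
people_flow  = YAML.safe_load(flow_style)
pairs        = YAML.safe_load(nested_arrays)

# Both styles give the same array of hashes.
puts people_block == people_flow
p pairs
```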

Comments start with a hash, #, just like Ruby. Ruby/YAML guesses the type from the format. Strings can be surrounded with single quotes, double quotes (allowing escape sequences) or nothing. Repeated data can be replaced by a reference to an "anchor", made by labelling the first occurrence. Labels begin with an ampersand, and references to the label with an asterisk. Use | or > for data that goes on to more than one line. The former preserves line breaks, the latter does not (see the example at the end of the YAML and Ruby section).

This example shows most of these, and uses symbols as the keys, rather than strings as above.
-
  :name: Tom # A string
  :age: 32 # An integer
  :male: true # A boolean
  :born: 1972-02-29 # A date
  :address: &add
    213 Main Street
    Big City
    England
  :comment:
    This is a potentially very
    long comment that will be
    just one line long.
-
  :name: Dick
  :age: 19
  :address: *add
  :comment: 'A very short comment'
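
Here is a cut-down sketch of that example being loaded, to show the guessed types and the anchor at work. I have put the address on a single line to keep it short. On recent Ruby versions YAML.load refuses dates and anchor references by default, so this sketch uses safe_load with symbols, dates and aliases explicitly allowed; treat those options as an assumption about your Ruby version rather than something you will always need.

```ruby
require 'yaml'
require 'date'

text = <<~YAML_TEXT
  -
    :name: Tom          # A string
    :age: 32            # An integer
    :male: true         # A boolean
    :born: 1972-02-29   # A date
    :address: &add 213 Main Street, Big City
  -
    :name: Dick
    :address: *add
YAML_TEXT

# Allow the non-basic types used above, and allow *add to refer back to &add.
people = YAML.safe_load(text, permitted_classes: [Symbol, Date], aliases: true)
tom, dick = people

p tom[:age].class
p tom[:born].class
p tom[:male]
p dick[:address] == tom[:address]
```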

See here for more details:
http://yaml.org/spec/current.html

By the way, NetBeans 6.1 flags up strings that go on to multiple lines as errors; this seems to be a bug in NetBeans.

YAML and Ruby
To use the data from your YAML file in a Ruby program, the first step is to load the YAML library:
require 'yaml'

Then load the data:
config = YAML.load(yaml_string)       # From string
config = YAML.load_file("config.yml") # From file

Then you can access the data just as you would any array of hashes (or whatever your data structure is):
x = config[0]['age']    # x -> 32

Note that if you want to use symbols as the keys to your hashes, each key in the YAML needs to be prefixed with a colon (as in the examples above).
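
A quick sketch of the difference, using made-up data. The symbolize_names option in the last call is something newer versions of Ruby's YAML library offer as an alternative to the colon prefix; I am assuming a reasonably recent Ruby here.

```ruby
require 'yaml'

plain    = "name: Tom\nage: 32\n"
prefixed = ":name: Tom\n:age: 32\n"

# Without the colon prefix the keys are strings.
string_keys = YAML.safe_load(plain)

# With the colon prefix the keys are symbols (which safe_load must be told to allow).
symbol_keys = YAML.safe_load(prefixed, permitted_classes: [Symbol])

# Alternatively, ask the loader to turn string keys into symbols.
converted = YAML.safe_load(plain, symbolize_names: true)

p string_keys['age']
p symbol_keys[:age]
p converted == symbol_keys
```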

Going the other way is nearly as easy:
yaml_string = config.to_yaml            # To string
File.open('config.yml', 'w') do |out| # To file
  YAML.dump(config, out)
end
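
To convince yourself that nothing is lost on the way, here is a small round-trip sketch. It writes to a temporary file rather than config.yml so it does not overwrite anything; the data is invented for the example.

```ruby
require 'yaml'
require 'tempfile'

config = [{ 'name' => 'Tom', 'age' => 32 }, { 'name' => 'Dick', 'age' => 19 }]

# To a string and back again.
yaml_string = config.to_yaml
from_string = YAML.safe_load(yaml_string)

# To a (temporary) file and back again.
from_file = nil
Tempfile.create(['config', '.yml']) do |file|
  YAML.dump(config, file)
  file.flush
  from_file = YAML.safe_load(File.read(file.path))
end

p from_string == config
p from_file == config
```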

Here is an example program that includes the YAML text (as a "here" document), and illustrates how YAML handles text that goes over multiple lines:
require 'yaml'

y = <<YAML_TEXT
:first: >
  This is a folded block,
  line breaks are discarded
  for spaces. The line ends
  with a return.
:second: |
  This is a literal block,
  and so the line-breaks
  are preserved. Again the
  line ends with a return
:third:
  This is a folded block
  too, as YAML defaults
  to that. However, this time
  there is no return at the end.
:fourth: And again a folded
  block formatted slightly
  differently, with no return
  at the end.
YAML_TEXT

h = YAML.load(y)
p h[:first]
# => "This is a folded block, line breaks are discarded for spaces. The line ends with a return.\n"
p h[:second]
# => "This is a literal block,\nand so the line-breaks\nare preserved. Again the\nline ends with a return\n"
p h[:third]
# => "This is a folded block too, as YAML defaults to that. However, this time there is no return at the end."
p h[:fourth]
# => "And again a folded block formatted slightly differently, with no return at the end."

You can combine several data structures into a single file. Each should start with three dashes on a line of their own to indicate the start of a new document. Use the load_stream method to read them all; in current Ruby versions it returns an array with one entry per document. For example:
docs = YAML.load_stream(File.read('data.yml'))
NAME_SCHEMAS = docs[0]
DATA_TYPES = docs[1]
CATEGORIES = docs[2]
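
As a self-contained sketch of the same idea, with three documents held in a string rather than in data.yml (the contents below are invented for illustration), each document comes back as one element of the result:

```ruby
require 'yaml'

text = <<~YAML_TEXT
  ---
  - first_name
  - last_name
  ---
  string: text
  integer: number
  ---
  - books
  - films
YAML_TEXT

# One array element per "---" document, in file order.
name_schemas, data_types, categories = YAML.load_stream(text)

p name_schemas
p data_types
p categories
```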


YAML and Rails
You can use a YAML file to kick start your database table in Rails. Rails already uses YAML, so no need for the require. The YAML file goes in the root directory for your web application.

One way that I have seen is to put the code inside your migration file, and when you migrate, the data goes straight into your table. However, this is a short-term solution only. What happens in your production database or for testing? You could clone your database, but the preferred Rails way is to use Rake to generate the database tables, based on what is in db/schema.rb, and that will not have any initialisation data.

What I have done (and it may not be the best way) is to write setup methods in my models, which can be invoked from a console session. In this example, for a model called Role, defining the roles a user can have:
def self.setup_roles
  # Only add the defaults to an empty table
  return false unless Role.count == 0
  Role.create(:rolename => 'administrator')
  Role.create(:rolename => 'manager')
  true
end

Note that the method first tests to see if there are any roles present already, and only adds the default roles if not. In your tests you can invoke the setup method before a test to load in the data, and be sure that it is the same default data as in your development and production databases.
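
If you would rather keep the default roles in a YAML file than in the method itself, something along these lines might work. This is only a sketch: the Role class below is a hypothetical in-memory stand-in so the code runs outside Rails (it is not ActiveRecord), and the YAML text stands in for a file such as a roles.yml that you would create yourself.

```ruby
require 'yaml'

# Hypothetical stand-in for the ActiveRecord model, so the sketch runs
# outside Rails. It only mimics the two calls used by setup_roles.
class Role
  @records = []

  class << self
    attr_reader :records

    def count
      records.length
    end

    def create(attributes)
      records << attributes
      attributes
    end

    # Load default roles from YAML text unless some already exist.
    def setup_roles(yaml_text)
      return false unless count == 0
      YAML.safe_load(yaml_text).each do |attributes|
        create(attributes)
      end
      true
    end
  end
end

# Stands in for the contents of a YAML file of default roles.
ROLES_YAML = <<~YAML_TEXT
  - rolename: administrator
  - rolename: manager
YAML_TEXT

p Role.setup_roles(ROLES_YAML) # first call adds the defaults
p Role.setup_roles(ROLES_YAML) # second call finds roles and does nothing
p Role.count
```

In a real Rails model you would drop the stand-in class and read the YAML from your file instead of from a constant.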
