Tuesday, 20 April 2010

XML and Ruby

When you come back to XML after using YAML, it is a real pain in the neck. However, sometimes we have to do it.

REXML offers a comprehensive set of functions for negotiating an XML document, so this is what I used to read an XML document into a Ruby data structure.

First, the setting up. Load in the REXML library, and include it for convenience. Load in the XML file.
require "rexml/document"
include REXML

file = File.new("materials.xml")
doc = REXML::Document.new file

In my XML, materials is the root node, and this has a number of children, msds, which in turn have a number of children, material. Each material node has a set of attributes, plus some child nodes of its own.
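To make that layout concrete, here is a small made-up document with the same shape, parsed from a string rather than a file. Only the materials/msds/material nesting comes from the description above; the attribute names (file, name, code) and the child tags (hazard, supplier) are invented for illustration.

```ruby
require "rexml/document"

# Hypothetical sample data: only the materials/msds/material nesting
# matches the real file; every other name and value is made up.
SAMPLE_XML = <<~XML
  <materials>
    <msds file="acids.pdf">
      <material name="Hydrochloric acid" code="H1">
        <hazard file="corrosive.png">Corrosive</hazard>
        <supplier>Acme Chemicals</supplier>
      </material>
    </msds>
    <msds file="solvents.pdf">
      <material name="Acetone" code="A1">
        <hazard file="flammable.png">Flammable</hazard>
      </material>
      <material name="Ethanol" code="E1">
        <hazard file="flammable.png">Flammable</hazard>
      </material>
    </msds>
  </materials>
XML

# Document.new accepts a String as well as an open File.
sample_doc = REXML::Document.new(SAMPLE_XML)

root_name      = sample_doc.root.name
msds_count     = sample_doc.elements.to_a("materials/msds").size
material_count = sample_doc.elements.to_a("materials/msds/material").size

puts "#{root_name}: #{msds_count} msds nodes, #{material_count} material nodes"
```

Parsing from a string like this is handy for trying things out in irb before pointing the code at the real file.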

Each element has an attributes attribute and an elements attribute. You can iterate through these using each, but you can also select specific nodes by sending a path to the each method. I want to start by iterating through the msds nodes:
doc.elements.each("materials/msds") do |msds|
  # do stuff
end

For each iteration through the loop, I then need to go through that element's child nodes with an inner loop.
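The nesting might look something like the sketch below. It runs against a tiny inline document so it can be tried on its own; the file and name attributes are assumptions for the example, not necessarily what the real file uses.

```ruby
require "rexml/document"

# Hypothetical two-level document; the "file" and "name" attributes are invented.
xml = <<~XML
  <materials>
    <msds file="a.pdf">
      <material name="Acetone"/>
      <material name="Ethanol"/>
    </msds>
    <msds file="b.pdf">
      <material name="Toluene"/>
    </msds>
  </materials>
XML

doc = REXML::Document.new(xml)
names_by_file = {}

# Outer loop: each msds node under the root.
doc.elements.each("materials/msds") do |msds|
  names = []
  # Inner loop: this path is relative to the current msds element.
  msds.elements.each("material") do |material|
    names << material.attributes["name"]
  end
  names_by_file[msds.attributes["file"]] = names
end

p names_by_file
```

Note that the path given to the inner each is just "material", because it is resolved from the msds element rather than from the top of the document.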

For the material nodes, I need to go through the attributes. The each method of attributes passes two parameters to the block, the name and the value, so it is easy to add these to a hash (material_data).
material.attributes.each do |name, value|
  material_data[name.to_sym] = value
end
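Here is the same idea as a stand-alone sketch, with the hash created first. The material node and its attribute names (name, code, grade) are made up for the example.

```ruby
require "rexml/document"

# Hypothetical material node; the attribute names are invented.
doc = REXML::Document.new('<material name="Acetone" code="A1" grade="lab"/>')
material = doc.root

material_data = {}
# Each attribute arrives as a name/value pair of Strings.
material.attributes.each do |name, value|
  material_data[name.to_sym] = value
end

p material_data
```

The values always come back as strings, so anything numeric needs converting (to_i, to_f) afterwards.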

To extract specific values from a node, into a hash, I did this:
h = { :name => element.text,
      :type => element.name,
      :file => element.attributes["file"] }

The text method gets the inner text from the element, name gives the tag name, and attributes["file"] gets the value of the "file" attribute.
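A quick sketch of those three calls on a couple of child nodes. The hazard and supplier tags are invented for the example; the second one deliberately has no "file" attribute, to show that attributes["file"] then gives nil rather than raising an error.

```ruby
require "rexml/document"

# Hypothetical child nodes of a material; tag and attribute names are invented.
xml = '<material>' \
      '<hazard file="flammable.png">Flammable</hazard>' \
      '<supplier>Acme</supplier>' \
      '</material>'
doc = REXML::Document.new(xml)

rows = []
doc.root.elements.each do |element|
  rows << { :name => element.text,               # inner text of the element
            :type => element.name,               # tag name
            :file => element.attributes["file"] } # nil if the attribute is absent
end

rows.each { |row| p row }
```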



Putting it all together:

require "rexml/document"
include REXML

file = File.new("materials.xml")
doc = REXML::Document.new file

data = []

doc.elements.each("materials/msds") do |msds|
  msds_data = []
  msds.elements.each("material") do |material|
    # One hash per child element of this material
    material_data_ary = []
    material.elements.each do |element|
      h = { :name => element.text,
            :type => element.name,
            :file => element.attributes["file"] }
      material_data_ary << h
    end

    # Wrap the child data, then add the material's own attributes
    material_data = { :msds => material_data_ary }
    material.attributes.each {|name, value| material_data[name.to_sym] = value }
    msds_data << material_data
  end

  data << { :msds => msds_data, :file => msds.attributes["file"] }
end


REXML API
http://www.germane-software.com/software/rexml/doc/
Tutorial
http://www.germane-software.com/software/rexml/docs/tutorial.html
Further
http://www.developer.com/lang/article.php/3672621

