Saturday 24 April 2010

Operator Overloading

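Ruby permits operator overloading, allowing you to define how operators work with your own (or indeed any) classes. Here is a quick example, showing how to define the addition operator. Note that you get += for free, because a += 7 is just shorthand for a = a + 7; in this example that means a ends up holding whatever + returns, here a plain Integer rather than a Tester1.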
class Tester1
  def initialize x
    @x = x
  end

  def +(y)
    @x + y
  end
end

a = Tester1.new 5
puts(a + 3)
# => 8
a += 7
puts a
# => 12
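
Because a += 7 assigns the result of a + 7 back to a, the variable in the example above ends up holding the Integer 12, not a Tester1. If you want the object to stay in your own class, + can return a new instance instead. A minimal sketch, assuming a hypothetical Tester1b class (not part of the original example) with a to_s method added so that puts still prints the number:

```ruby
# Hypothetical variant of Tester1: + returns a new Tester1b, not an Integer.
class Tester1b
  def initialize(x)
    @x = x
  end

  def +(y)
    Tester1b.new(@x + y)
  end

  # puts calls to_s, so the wrapped value is what gets printed
  def to_s
    @x.to_s
  end
end

a = Tester1b.new 5
a += 7
puts a
# => 12
puts a.class
# => Tester1b
```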

The definition is not commutative, i.e., trying to do 3 + a would fail with a TypeError. To get that to work you would need either to override the addition method in Integer, which I think would be a bad idea, or to define a coerce method in your own class.
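
Ruby does offer a way to make 3 + a work without reopening Integer: when Integer#+ is handed an operand it does not recognise, it calls that operand's coerce method and retries the addition with the pair that comes back. A minimal sketch, assuming a hypothetical CoercibleTester class; simply swapping the operands like this is only safe for commutative operations such as addition:

```ruby
# Hypothetical class for illustration; same idea as Tester1 plus coerce.
class CoercibleTester
  def initialize(x)
    @x = x
  end

  def +(y)
    @x + y
  end

  # Called by Integer#+ as a.coerce(3). Returning [self, other] makes
  # Ruby evaluate self + other, i.e. our own + method does the work.
  def coerce(other)
    [self, other]
  end
end

a = CoercibleTester.new 5
puts(a + 3)
# => 8
puts(3 + a)
# => 8
```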

The next example shows how to override comparison operators. Note that by overriding the equality operator, you get the inequality operator for free. Also note that you are not restricted to returning a boolean; in this example <= returns a string.
class Tester2
  def initialize ary
    @ary = ary
  end

  attr_reader :ary

  def ==(y)
    @ary.length == y.ary.length
  end

  def <(y)
    @ary.length < y.ary.length
  end

  def <=(y)
    '@ary.length < y.ary.length'
  end
end

b = Tester2.new %w(zero one two three)
puts b == Tester2.new([1, 2, 3, 4, 5])
# => false
puts b != Tester2.new([1, 2, 3, 4, 5])
# => true
puts b < Tester2.new([1, 2, 3, 4, 5])
# => true
puts b <= Tester2.new([1, 2, 3, 4, 5])
# => @ary.length < y.ary.length
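
If all you want is a consistent set of comparison operators, a different technique from the one above is to define only <=> and include Ruby's Comparable module, which then supplies <, <=, ==, >= and > for you. A minimal sketch, assuming a hypothetical SizedList class that compares by array length as Tester2 does:

```ruby
# Hypothetical class for illustration; compares instances by array length.
class SizedList
  include Comparable

  attr_reader :ary

  def initialize(ary)
    @ary = ary
  end

  # Comparable builds <, <=, ==, >= and > on top of this one method.
  def <=>(other)
    ary.length <=> other.ary.length
  end
end

short = SizedList.new %w(zero one two three)
long = SizedList.new [1, 2, 3, 4, 5]
puts short < long
# => true
puts short <= long
# => true
puts short == long
# => false
puts short > long
# => false
```

The trade-off is that every operator now returns a boolean, so you lose the freedom to return something else from <=.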

You need to be careful when overriding operators, as it has the potential to make your code unfathomable. One case where I think it is particularly useful, however, is giving a class array-like behaviour, for example to access an underlying array, as in this example. Note how the += operator has to be defined via the + operator. Also, the index operator needs two methods if you want to be able to both read from and write to it.
class Tester3
  def initialize ary
    @ary = ary
  end

  def [](y)
    @ary[y]
  end

  def []=(y, value)
    @ary[y] = value
  end

  def <<(y)
    @ary << y
    self  # return the Tester3, not the underlying array
  end

  def +(y)
    @ary << y
    self  # otherwise c += 'five' would turn c into a plain Array
  end
end

c = Tester3.new %w(zero one two three)
puts c[3]
# => three
c << 'four'
puts c[4]
# => four
c += 'five'
puts c[5]
# => five

As an aside, be aware that << and += are not the same for an Array object (though they are for Tester3). The += operator requires an array, the members of which are appended to a copy of the existing array; the variable is then pointed at that new array. The << operator appends the new object to the existing array in place; if the new object is an array, you end up with the new array nested inside the existing one.
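
The difference is easy to see with a second variable that refers to the same array. A short sketch using only plain Array objects (the variable names are arbitrary):

```ruby
list = [1, 2]
same = list            # second reference to the same Array object

list << [3, 4]         # appends in place; the array argument is nested
puts list.inspect
# => [1, 2, [3, 4]]
puts same.equal?(list)
# => true

list += [5, 6]         # builds a new Array and rebinds list to it
puts list.inspect
# => [1, 2, [3, 4], 5, 6]
puts same.inspect
# => [1, 2, [3, 4]]
puts same.equal?(list)
# => false
```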

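You cannot override keywords, such as "or", nor can you override:
&& || () {} :: . .. ...
The single-character &, | and ~ operators, on the other hand, are ordinary methods and can be overridden.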



Struggling with Ruby: Contents Page

Tuesday 20 April 2010

XML and Ruby

When you come back to XML after using YAML, it is a real pain in the neck. However, sometimes we have to do it.

REXML offers a comprehensive set of functions for negotiating an XML document, so this is what I used to read an XML document into a Ruby data structure.

First, the setting up. Load in the REXML library, and include it for convenience. Load in the XML file.
require "rexml/document"
include REXML

file = File.new("materials.xml")
doc = REXML::Document.new file

In my XML, materials is the root node, and this has a number of children, msds, which in turn have a number of children, material. Each material node has a set of attributes, plus some child nodes of its own.
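
For concreteness, a document of that shape might look like the one in the sketch below. The real materials.xml is not shown in this post, so treat this as an assumption: the sheet element and the id and state attributes are invented for illustration (only materials, msds, material and the file attribute appear in the post), and the XML is parsed from a string so that the snippet is self-contained.

```ruby
require "rexml/document"

# Hypothetical stand-in for materials.xml; sheet, id and state are invented.
xml = <<~XML
  <materials>
    <msds file="solvents.xml">
      <material id="m1" state="liquid">
        <sheet file="m1.pdf">First sheet</sheet>
      </material>
      <material id="m2" state="solid">
        <sheet file="m2.pdf">Second sheet</sheet>
      </material>
    </msds>
  </materials>
XML

doc = REXML::Document.new(xml)
puts doc.root.name
# => materials

# Collect the id of every material node found under materials/msds.
ids = []
doc.elements.each("materials/msds/material") { |m| ids << m.attributes["id"] }
puts ids.inspect
# => ["m1", "m2"]
```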

Each element has an attributes attribute and an elements attribute. You can iterate through these using each, but you can also select specific nodes by sending a path to the each method. I want to start by iterating through the msds nodes:
doc.elements.each("materials/msds") do |msds|
  # do stuff
end

For each iteration through the loop, I then need to go through that element's material nodes with an inner loop.
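
The nesting can be sketched like this. The XML string is a hypothetical stand-in for materials.xml (the sheet element and id attribute are invented), and the visited array exists only so there is something to print:

```ruby
require "rexml/document"

# Hypothetical document; only the materials/msds/material nesting and the
# file attribute come from the post.
xml = <<~XML
  <materials>
    <msds file="a.xml">
      <material id="m1"><sheet file="m1.pdf">One</sheet></material>
    </msds>
    <msds file="b.xml">
      <material id="m2"><sheet file="m2.pdf">Two</sheet></material>
      <material id="m3"><sheet file="m3.pdf">Three</sheet></material>
    </msds>
  </materials>
XML

doc = REXML::Document.new(xml)
visited = []

doc.elements.each("materials/msds") do |msds|
  # Inner loop: only the material children of this particular msds node
  msds.elements.each("material") do |material|
    visited << [msds.attributes["file"], material.attributes["id"]]
  end
end

puts visited.inspect
# => [["a.xml", "m1"], ["b.xml", "m2"], ["b.xml", "m3"]]
```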

For the material nodes, I need to go through the attributes. The each method of attributes passes two parameters to the block, the name and the value, so it is easy to add these to a hash (material_data).
material.attributes.each do |name, value|
  material_data[name.to_sym] = value
end

To extract specific values from a node, into a hash, I did this:
h = { :name => element.text,
      :type => element.name,
      :file => element.attributes["file"] }

The text method gets the inner text from the element, name gives the tag name, and attributes["file"] gets the value of the "file" attribute.
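
Both steps can be tried on a single node. In this sketch the XML fragment is hypothetical (the sheet element and the id and state attributes are invented); note that REXML numbers child elements from 1, following XPath.

```ruby
require "rexml/document"

# Hypothetical fragment standing in for one material node.
xml = '<material id="m1" state="liquid"><sheet file="m1.pdf">First sheet</sheet></material>'
material = REXML::Document.new(xml).root

# Copy every attribute of the material node into a hash with symbol keys.
material_data = {}
material.attributes.each { |name, value| material_data[name.to_sym] = value }
puts material_data[:id]
# => m1
puts material_data[:state]
# => liquid

# Pull specific values out of the first child element.
element = material.elements[1]
h = { :name => element.text,
      :type => element.name,
      :file => element.attributes["file"] }
puts h[:name]
# => First sheet
puts h[:type]
# => sheet
puts h[:file]
# => m1.pdf
```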



Here is the whole thing:
require "rexml/document"
include REXML

file = File.new("materials.xml")
doc = REXML::Document.new file

data = []

doc.elements.each("materials/msds") do |msds|
  msds_data = []
  msds.elements.each("material") do |material|
    material_data_ary = []
    material.elements.each do |element|
      h = { :name => element.text,
            :type => element.name,
            :file => element.attributes["file"] }
      material_data_ary << h
    end
    # NOTE: the :msds key names are recovered from a garbled listing and may differ from the original
    material_data = { :msds => material_data_ary }
    material.attributes.each {|name, value| material_data[name.to_sym] = value }
    msds_data << material_data
  end
  data << { :msds => msds_data, :file => msds.attributes["file"] }
end


REXML API
http://www.germane-software.com/software/rexml/doc/
Tutorial
http://www.germane-software.com/software/rexml/docs/tutorial.html
Further
http://www.developer.com/lang/article.php/3672621



Thursday 1 April 2010

Testing for Valid HTML

There is a gem for testing valid HTML called RailsTidy; however, when I tried to use it I got "RuntimeError: can't find the symbol `tidyCreate' ", which seems to relate to calling non-Ruby code (possibly failing because I am using JRuby?). So anyway, I looked at doing my own.

The best way to approach this is to create a method that works like the usual assert_x methods, so it can be invoked like this:
def test_should_get_index
  get :index
  assert_response :success
  assert_not_nil assigns(:posts)
  assert_validates
end

The skeleton of my method is going to look like this:
def assert_validates message = ''
  clean_backtrace do
    msg = build_message(message, "Invalid HTML found")
    assert_block(msg) do
      # Code here
    end
  end
end

In common with other assert_x methods, it will accept an optional message. The clean_backtrace method does what it says. It catches AssertionFailedErrors, cleans the backtrace, and rethrows the error. I am using it to ensure backtraces from my method are consistent with the backtrace from other such methods.

Inside that block, I use the assert_block method. This is the workhorse of all assertions; it catches and counts failures and errors, based on what happens inside its block. If the block returns true, the test passes. One limitation here is that it seems impossible to modify the message from within the block, so we cannot report back what the problem was.

Now to check the HTML is valid. I am going to cheat here, and actually check that it is valid XML, as then I can palm the work off on REXML (remember the require "rexml/document").
begin
  REXML::Document.new @response.body
  true
rescue REXML::ParseException => ex
  false
end

The code attempts to create an XML document from the body of the response. The block returns true if this was fine, and false if REXML threw an exception. You could put a print statement in there to indicate where the problem is; I preferred to run any offending page through a proper validator (e.g. http://htmlhelp.com/cgi-bin/validate.cgi) once a problem is found, as it tells you exactly what it is.
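
The check can be tried outside a test with plain strings. In this sketch xml_problem is a hypothetical helper name, not part of REXML or Rails; it returns nil when the text parses and otherwise the first line of the parser's complaint. The second call shows the price of the cheat: markup that a browser accepts happily, such as an unclosed br tag, is still rejected because it is not well-formed XML.

```ruby
require "rexml/document"

# Hypothetical helper: nil if body parses as XML, else the first line of
# the message carried by the REXML::ParseException.
def xml_problem(body)
  REXML::Document.new(body)
  nil
rescue REXML::ParseException => ex
  ex.message.lines.first.strip
end

# Well-formed, so no problem is reported.
puts xml_problem("<html><body><p>fine</p></body></html>").inspect
# => nil

# Acceptable to a browser, but the unclosed <br> is not well-formed XML.
puts xml_problem("<html><body><p>line<br></p></body></html>")
```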

Here is the whole thing, which should be inside ActiveSupport::TestCase, in test_helper.rb. I have added a constant, so you can turn validation on or off.
require "rexml/document"

# Do you want to validate?
VALIDATE = true

def assert_validates message = ''
  # Do not bother if validation is turned off
  return unless VALIDATE
  # Do not bother unless it is actually HTML
  return unless @response.content_type == "text/html"

  clean_backtrace do
    assert_block(build_message(message, "Invalid HTML found")) do
      begin
        REXML::Document.new @response.body
        true
      rescue REXML::ParseException => ex
        puts ex
        false
      end
    end
  end
end
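
On a test stack built on Minitest, where assert_block and clean_backtrace may not be available, the same idea can be written with the pass and flunk assertions. This is only a sketch under that assumption: WellFormedAssertion, assert_well_formed and PageCheck are hypothetical names, and the body is passed in as an argument rather than read from @response. Because flunk takes the message directly, the parser's complaint can be included in the failure.

```ruby
require "minitest"
require "rexml/document"

# Hypothetical helper module, not part of Rails or Minitest.
module WellFormedAssertion
  def assert_well_formed(body, msg = nil)
    REXML::Document.new(body)
    pass
  rescue REXML::ParseException => ex
    detail = ex.message.lines.first.strip
    flunk([msg, "Invalid HTML found: #{detail}"].compact.join("\n"))
  end
end

# Stand-in for a test case, so the helper can be tried on its own.
# Minitest::Assertions only needs an assertions counter on the host object.
class PageCheck
  include Minitest::Assertions
  include WellFormedAssertion

  attr_accessor :assertions

  def initialize
    self.assertions = 0
  end
end

check = PageCheck.new
check.assert_well_formed("<html><body><p>fine</p></body></html>")

begin
  check.assert_well_formed("<html><body><p><b>broken</p></body></html>")
rescue Minitest::Assertion => failure
  puts failure.message
end
```

In a real test you would include the module in your test case class and pass the response body to assert_well_formed.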

You can check a whole bunch of tests in one file by adding a teardown method, like this.
def teardown
  assert_validates
end

If an action results in a redirect, this will only test that the redirect directive is valid HTML, not that the page the user is sent to is. That page is not generated during a functional test (though it will be if another action returns that page, and you test that action).

You could put the assert_validates method into a module, and include that module in both ActiveSupport::TestCase and ActionController::Integration::Session. This would allow you to validate in your integration tests too. I am not sure that that actually has any benefit; you would seem to be testing the same thing twice.

