Thursday 1 April 2010

Testing for Valid HTML

There is a gem for testing valid HTML called RailsTidy. However, when I tried to use it I got "RuntimeError: can't find the symbol `tidyCreate'", which seems to relate to calling non-Ruby code (possibly failing because I am using JRuby?). So I looked at writing my own.

The best way to approach this is to create a method that works like the usual assert_x methods, so it can be invoked like this:
def test_should_get_index
  get :index
  assert_response :success
  assert_not_nil assigns(:posts)
  assert_validates
end

The skeleton of my method is going to look like this:
def assert_validates message = ''
  clean_backtrace do
    msg = build_message(message, "Invalid HTML found")
    assert_block(msg) do
      # Code here
    end
  end
end

In common with other assert_x methods, it will accept an optional message. The clean_backtrace method does what it says. It catches AssertionFailedErrors, cleans the backtrace, and rethrows the error. I am using it to ensure backtraces from my method are consistent with the backtrace from other such methods.

Inside that block, I use the assert_block method. This is the workhorse of all assertions; it catches and counts failures and errors, based on what happens inside its block. If the block returns true, the test passes. One limitation here is that it seems impossible to modify the message from within the block, which means we cannot report back what the problem was through the assertion message.

Now to check the HTML is valid. I am going to cheat here, and actually check that it is valid XML, as then I can palm the work off on REXML (remember the require "rexml/document").
begin
  REXML::Document.new @response.body
  true
rescue REXML::ParseException => ex
  false
end

The code attempts to create an XML document from the body of the response. The block returns true if this was fine, and false if REXML threw an exception. You could put a print statement in there to indicate where the problem is; I preferred to run any offending page through a proper validator (e.g. http://htmlhelp.com/cgi-bin/validate.cgi) once a problem is found, as it tells you exactly what is wrong.
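To see what this check accepts and rejects, here is a minimal standalone sketch that runs outside of Rails. The helper name well_formed_xml? is hypothetical (it is not part of REXML or Rails). It also shows the cost of the cheat: markup that is acceptable in HTML 4, such as an unclosed <br>, is rejected because it is not well-formed XML, so the approach probably only suits pages written as XHTML.

```ruby
require "rexml/document"

# Hypothetical helper (not part of REXML or Rails): returns true if the
# string parses as well-formed XML, false if REXML raises a parse error.
def well_formed_xml?(markup)
  REXML::Document.new(markup)
  true
rescue REXML::ParseException
  false
end

# Properly nested tags
puts well_formed_xml?("<html><body><p>Hello</p></body></html>")
# Tags closed in the wrong order
puts well_formed_xml?("<html><body><p><b>Hello</p></b></body></html>")
# An unclosed <br>, as commonly written in HTML 4
puts well_formed_xml?("<html><body><br></body></html>")
```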

Here is the whole thing, which should be inside ActiveSupport::TestCase, in test_helper.rb. I have added a constant, so you can turn validation on or off.
require "rexml/document"

# Do you want to validate?
VALIDATE = true

def assert_validates message = ''
  # Do not bother if validation is turned off
  return unless VALIDATE
  # Do not bother unless it is actually HTML
  return unless @response.content_type == "text/html"

  clean_backtrace do
    assert_block(build_message(message, "Invalid HTML found")) do
      begin
        REXML::Document.new @response.body
        true
      rescue REXML::ParseException => ex
        puts ex
        false
      end
    end
  end
end
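Since the message cannot be changed from inside the assert_block block, one possible workaround is to run the parse first and keep the error text, so it is available when the failure message is built. The sketch below shows just that step, outside any test framework; the name xml_parse_error is hypothetical.

```ruby
require "rexml/document"

# Hypothetical helper: returns nil when the markup is well-formed XML,
# otherwise the text of the REXML parse error.
def xml_parse_error(markup)
  REXML::Document.new(markup)
  nil
rescue REXML::ParseException => ex
  ex.message
end

# The <p> tag is never closed, so REXML complains at </body>
error = xml_parse_error("<html><body><p>Oops</body></html>")
message = error ? "Invalid HTML found: #{error}" : "HTML is fine"
puts message
```

In assert_validates, something like this could be called before build_message, with the block then only checking that the result is nil.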

You can check a whole bunch of tests in one file by adding a teardown method, like this.
def teardown
  assert_validates
end

If an action results in a redirect, this will only test that the redirect directive is valid HTML, not that the page the user is sent to is. That page is not generated during a functional test (though it will be if another action returns that page, and you test that action).

You could put the assert_validates method into a module, and include that module in both ActiveSupport::TestCase and ActionController::Integration::Session. This would allow you to validate in your integration tests too. I am not sure that that actually has any benefit; you would seem to be testing the same thing twice.
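A rough sketch of the module idea follows. Everything named here is an assumption for illustration: the module name HtmlValidation, the response_body method it expects the including class to supply, and the FakeTestCase stand-in with its own simple assert. In a real application the including classes would be the Rails ones mentioned above, and the body would come from the response object.

```ruby
require "rexml/document"

# Hypothetical module: it only assumes the including class provides
# assert(condition, message) and a response_body method.
module HtmlValidation
  def assert_validates(message = "Invalid HTML found")
    valid = begin
      REXML::Document.new(response_body)
      true
    rescue REXML::ParseException
      false
    end
    assert(valid, message)
  end
end

# Stand-in for a test case class, purely for demonstration.
class FakeTestCase
  include HtmlValidation
  attr_reader :response_body

  def initialize(response_body)
    @response_body = response_body
  end

  # Very simple assertion: raises on failure, returns true otherwise.
  def assert(condition, message)
    raise message unless condition
    true
  end
end

puts FakeTestCase.new("<p>ok</p>").assert_validates
```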

