Thursday 11 March 2010

Rails: Singularize and Pluralize

Rails has built-in functions that turn a word into its plural or singular form. It uses them when generating models and so on, to create names that conform to its conventions (so user.rb but users_controller.rb). The rules are set up in a file called:
...\lib\ruby\gems\1.8\gems\activesupport-2.3.2\lib\active_support\inflections.rb

The action all happens inside a block like this:
module ActiveSupport
  Inflector.inflections do |inflect|
    # definitions
  end
end

There are four types of rules. The first sets a general rule for making a plural, like this:
inflect.plural(/([^aeiouy]|qu)y$/i, '\1ies')

If the regular expression in the first parameter matches, the matched text is replaced by the second parameter (but note that a capture group is used, so part of the discarded ending is put back). This rule matches a word ending in "y" that is not preceded by a vowel (or that ends in "quy"), replacing the "y" with "ies".
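To see what that pattern does without loading Rails, here is a minimal sketch in plain Ruby. It applies the same regular expression and replacement with String#sub; the method name pluralize_y is made up for this illustration and is not part of ActiveSupport.

```ruby
# Hypothetical helper: applies the "y" plural rule by hand, outside Rails.
def pluralize_y(word)
  # String#sub returns the word unchanged when the pattern does not match.
  word.sub(/([^aeiouy]|qu)y$/i, '\1ies')
end

puts pluralize_y("category")   # consonant + y
puts pluralize_y("soliloquy")  # qu + y
puts pluralize_y("day")        # vowel + y, so this rule does not apply
```

In Rails itself a word that fails this rule simply falls through to the other plural rules, which this sketch leaves out.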

The second sets a rule for making a singular, like this (which is the reverse of the previous):
inflect.singular(/([^aeiouy]|qu)ies$/i, '\1y')

Then there are the irregulars, defined like this:
inflect.irregular('person', 'people')

And those that do not change, like this:
inflect.uncountable(%w(equipment information rice money species series fish sheep))
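As a rough picture of how the four kinds of rule fit together, here is a toy sketch in plain Ruby. It is not how ActiveSupport is implemented; the method toy_pluralize and its lookup order are assumptions made for illustration, using only the example rules quoted above.

```ruby
# Toy illustration only: not the ActiveSupport implementation.
def toy_pluralize(word)
  uncountable = %w(equipment information rice money species series fish sheep)
  irregular   = { 'person' => 'people' }

  # Uncountable words are returned as they are.
  return word if uncountable.include?(word.downcase)
  # Irregular words are looked up directly.
  return irregular[word] if irregular.key?(word)
  # Otherwise fall back to the general "y" plural rule shown earlier.
  word.sub(/([^aeiouy]|qu)y$/i, '\1ies')
end

puts toy_pluralize("sheep")
puts toy_pluralize("person")
puts toy_pluralize("category")
```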

Unfortunately, it is not perfect (not in 2.3.3 anyway), which is why I was obliged to learn about it. For example:
inflect.plural(/(octop|vir)us$/i, '\1i')

The plural of virus is viruses, not viri; octopus can use either form, though octopuses is preferred (see here).

The particular one I ran up against was this, for "metal analyses":
inflect.singular(/((a)naly|(b)a|(d)iagno|(p)arenthe|(p)rogno|(s)ynop|(t)he)ses$/i, '\1\2sis')
inflect.singular(/(^analy)ses$/i, '\1sis')

The system runs through the rules starting with the last defined, until it finds a match. For "analyses", it hits the second of the lines shown above, and returns "analysis" (why this was not put in as an irregular I cannot imagine). There is no match there for "metal analyses", as the pattern anchors "analy" to the start of the string. So it looks at the previous rule. Now a match is found. However, this pattern defines several capture groups: the first is "analy", the second is just the "a", and both of these are used in the replacement, so the resulting singular goes like this:
<whatever was before the pattern> <"analy"> <"a"> <"sis">

"metal analyses" -> "metal analyasis"

Because the replacement uses only the first and second capture groups, the other words in that rule work fine. For "theses", for example, the "t" is in capture group 8, which is not used, and the empty second group adds nothing. And as "analyses" on its own gets caught by the more specific rule (the one anchored to the start of the string), the error can easily be missed.
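The behaviour can be reproduced in plain Ruby without Rails. The sketch below keeps just the two rules quoted above in a list and tries them from the last defined to the first; the method names stock_rules and first_match_singular are made up for this illustration, and the real inflector has many more rules.

```ruby
# Hypothetical stand-in for the inflector's singular rules, in definition order.
def stock_rules
  [
    [/((a)naly|(b)a|(d)iagno|(p)arenthe|(p)rogno|(s)ynop|(t)he)ses$/i, '\1\2sis'],
    [/(^analy)ses$/i, '\1sis']
  ]
end

# Try the rules from the last defined to the first and apply the first match.
def first_match_singular(rules, word)
  rules.reverse_each do |pattern, replacement|
    return word.sub(pattern, replacement) if word =~ pattern
  end
  word # no rule matched
end

puts first_match_singular(stock_rules, "analyses")        # prints "analysis"
puts first_match_singular(stock_rules, "metal analyses")  # prints "metal analyasis"
puts first_match_singular(stock_rules, "theses")          # prints "thesis"
```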

This issue was brought up as a bug, but dismissed (see here). The simple workaround is to define your own rule. I have done this in a file inside config/initializers (I have called mine initial.rb).
ActiveSupport::Inflector.inflections do |inflect|
  inflect.singular(/(analy|ba|diagno|parenthe|progno|synop|the)ses$/i, '\1sis')
end

You can add as many of your own rules as you like. As they get added later, they will take precedence over the existing rules.
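Here is a small plain-Ruby sketch of why the later rule wins. It puts the workaround rule after the stock one in a list and tries the list from the end, which is how the lookup order was described earlier; the method names rules_with_workaround and last_defined_first are hypothetical and only stand in for the inflector.

```ruby
# Hypothetical rule list: the stock rule first, the workaround rule added last.
def rules_with_workaround
  [
    [/((a)naly|(b)a|(d)iagno|(p)arenthe|(p)rogno|(s)ynop|(t)he)ses$/i, '\1\2sis'],
    [/(analy|ba|diagno|parenthe|progno|synop|the)ses$/i, '\1sis']
  ]
end

# The most recently added rule is tried first.
def last_defined_first(rules, word)
  rules.reverse_each do |pattern, replacement|
    return word.sub(pattern, replacement) if word =~ pattern
  end
  word # no rule matched
end

puts last_defined_first(rules_with_workaround, "metal analyses")  # prints "metal analysis"
puts last_defined_first(rules_with_workaround, "theses")          # prints "thesis"
```

In a real application the check would be "metal analyses".singularize from the Rails console after restarting, since initializers are only read at startup.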

