Monday, 23 March 2009

The Model Part 5 - Find and other ActiveRecord methods

The find method
ActiveRecord provides a lot of functionality through the find method and its derivatives. Here are some examples:
Post.find 45
# Retrieves the post with id=45
Post.find 2, 4, 5
# Gets the posts with id equal to 2, 4 and 5 (as an array)
Post.find :first
# Retrieves the first post in the database
Post.find :last
# Retrieves the last post in the database
Post.find :all
# Retrieves all the posts (as an array)
Post.find_by_author 'F2Andy'
# Retrieves the first post in the database authored by F2Andy
Post.find_all_by_author 'F2Andy'
# Retrieves all the posts in the database authored by F2Andy
Post.find_all_by_author_and_title 'F2Andy', 'ActiveRecord find'
Post.find :all, :conditions => { :title => 'ActiveRecord find',
  :author => 'F2Andy' }
# Both of these retrieve all the posts in the database with
# the given author and title

There are various options you can include in your search, as illustrated here, which gets the 21st through to 30th posts, ordered by the title:
Post.find_all_by_author 'F2Andy', :order => 'title',
:offset => 20, :limit => 10

You can determine whether the ordering goes up or down using ASC and DESC (defaults to ASC). The offset is ignored if limit is not present.

When you know what you are searching for, ActiveRecord has it covered. However, often we do not. Do a search of Google, and you are looking for one or more words within the text. To do that for ActiveRecord you need to delve into SQL a little.

Dipping a Toe into SQL
Rails helps you to some degree, and to see how, let us perform that last search again.
Post.find :all, :conditions => ['author = ? AND title = ?',
'F2Andy', 'ActiveRecord find']

Notice the :conditions key. This maps to an array. The first element is the base string, some SQL code including question marks. All the other elements are substituted for the question marks in order. The SQL command that is used will be something like:
SELECT * FROM posts WHERE author = 'F2Andy' AND
title = 'ActiveRecord find'

You can use symbols rather than question marks. It is a little more long-winded, but it makes clear what goes where, and the values can be given in any order, so it is preferable in all but the simplest of cases.
Post.find :all,
  :conditions => ['author = :author AND title = :title',
    { :author => 'F2Andy', :title => 'ActiveRecord find' }]
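
To make the substitution idea concrete, here is a toy sketch in plain Ruby. It is not how ActiveRecord does it internally; quote_value and substitute_named are hypothetical helper names, and the quoting is deliberately simplistic. It only illustrates why named placeholders can be supplied in any order.

```ruby
# Toy illustration only: NOT ActiveRecord's implementation.
# quote_value and substitute_named are hypothetical names.

# Wrap a value in single quotes, doubling any embedded single quote.
def quote_value(value)
  "'" + value.to_s.gsub("'", "''") + "'"
end

# Replace each :name in the template with the quoted value from the hash.
def substitute_named(template, values)
  template.gsub(/:(\w+)/) do
    key = $1.to_sym
    raise ArgumentError, "missing value for :#{key}" unless values.key?(key)
    quote_value(values[key])
  end
end

# The hash is given in a different order from the placeholders.
puts substitute_named('author = :author AND title = :title',
                      :title => 'ActiveRecord find', :author => 'F2Andy')
# Prints: author = 'F2Andy' AND title = 'ActiveRecord find'
```

In real code, leave the quoting to ActiveRecord by passing the array to :conditions as shown above.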

ActiveRecord sanitises the values it substitutes (quoting and escaping them), which protects against SQL injection attacks. While the following will work, it is a bad idea as soon as any part of the string comes from user input, because it bypasses that process:
Post.find :all,
:conditions => 'author = "F2Andy" AND title = "ActiveRecord find"'
Post.find_by_sql 'SELECT * FROM posts WHERE author = "F2Andy" AND title = "ActiveRecord find"'

However, for complex searches, this might be your only option (an interesting page on that can be seen here).

Using Wildcards
So now we are ready to search for a fragment within a field. In SQL, you use the LIKE keyword, rather than the equals sign, and use % as a wildcard. Let us suppose I have a database of literature references, and I am searching for one author, Smith.
refs = Ref.find :all,
:conditions => [ "authors LIKE ?", "%Smith%" ]

This will return any record where "Smith" appears in the authors field.

Case Insensitive
There is no standard for doing case sensitive/insensitive searches in SQL (and no help from Rails either), nor any standard about which should be the default (indeed, that depends on how the database is set up).

On PostgreSQL, you can use ILIKE to do case insensitive searches:
refs = Ref.find :all,
:conditions => [ "authors ILIKE ?", "%smith%" ]

On MySQL, you can change the collation method (this seems to be the usual SQL strategy; however, as each database system has its own collations, it still varies between databases. I think SQL Server uses SQL_Latin1_General_CP1_CI_AS).
refs = Ref.find :all,
:conditions => [ "authors LIKE ? COLLATE utf8_general_ci",
"%smith%" ]

What this all means is that the only portable way is to convert both sides to all lower (or upper) case. The only way I could get this to work was to change the search term beforehand (possibly due to how Rails sanitises the SQL).
term = "Smith".downcase
refs = Ref.find :all,
:conditions => [ "LOWER(authors) LIKE ?", "%#{term}%" ]

Complicated searches
So what if we want to search for multiple words in multiple fields? What you need to do is build a search string that each term will be put into, and an array of terms. Here is an addition to ActiveRecord::Base that will do just that.
class ActiveRecord::Base
  # Retrieves all ActiveRecords that contain the
  # user supplied keywords.
  # The hash parameter should contain column name
  # mappings to strings of keywords. For example:
  #   refs = Ref.search :authors => 'smith jones',
  #     :body => 'ruby search'
  # This will retrieve any record containing both
  # "smith" and "jones" in the authors field, and
  # "ruby" and "search" in the body field.
  # The search is case insensitive.
  # Note that simply sending the params hash from
  # the controller will not work, as this includes
  # values for :action and :controller.
  # (The method name "search" is an assumption; the
  # name is missing from the original listing.)
  def self.search params
    # This will become the base for the SQL string
    sql_fragments = []
    # This is the array of search terms. The first
    # entry is just a place holder; this will be
    # replaced by the SQL string at the end.
    search_terms = ['']
    params.each_pair do |k, v|
      v.split.each do |e|
        sql_fragments << "LOWER(#{k.to_s}) LIKE ?"
        search_terms << "%#{e.downcase}%"
      end
    end
    # Now assemble the SQL string and put it at
    # the start of the array of terms.
    search_terms[0] = sql_fragments.join(' AND ')
    find :all, :conditions => search_terms
  end
end
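
The string-building part can be tried out without a database. This is a minimal sketch in plain Ruby, assuming the same hash-of-keywords input as above; build_conditions is a hypothetical helper name, and it only returns the array that would be handed to :conditions.

```ruby
# Sketch only: build_conditions is a hypothetical helper name.
# It returns the array that would be passed as :conditions,
# i.e. the SQL fragment followed by one term per placeholder.
def build_conditions(params)
  fragments = []
  terms = []
  params.each_pair do |column, keywords|
    keywords.split.each do |word|
      fragments << "LOWER(#{column}) LIKE ?"
      terms << "%#{word.downcase}%"
    end
  end
  [fragments.join(' AND ')] + terms
end

p build_conditions(:authors => 'Smith Jones', :body => 'ruby')
# => ["LOWER(authors) LIKE ? AND LOWER(authors) LIKE ? AND LOWER(body) LIKE ?",
#     "%smith%", "%jones%", "%ruby%"]
```

Note that only the keywords go through the placeholders. The column names are interpolated straight into the SQL, so they should come from your own code, never from user input.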

Other methods

The delete and destroy methods
These take an id or an array of ids and delete the matching records. The delete method issues the SQL DELETE directly and returns the number of rows removed. The destroy method instead instantiates each record, populating it with data from the table, and then calls the destroy method on the object; it is slower, but ensures that any custom code in the destroy method and any callbacks and filters are invoked.
Post.delete 19
Post.destroy [23, 67, 103]

The delete_all and destroy_all methods
These are similar to find :all, and will accept the same sort of conditions, but the condition is passed directly as the argument rather than under a :conditions key. As with delete and destroy, destroy_all is slower as the objects are instantiated first.
Post.delete_all [ "user_name = ? AND category = ?",
  user_name, category ]

The exists? method
This method returns true if one or more records matching the condition exist. Note that it accepts either an id, or the condition itself (rather than :conditions mapping to the condition).
Post.exists? 45
Post.exists? :user_name => user_name, :category => category
Post.exists? ["user_name = ? AND category = ?",
user_name, category]

As well as retrieving records, you can perform calculations, using average, maximum, minimum and sum. Examples from the API:
Person.calculate(:count, :all) # The same as Person.count
Person.average(:age) # SELECT AVG(age) FROM people...
Person.minimum(:age,
  :conditions => ['last_name != ?', 'Drake'])
# Selects the minimum age for everyone with a last name other
# than 'Drake'
Person.minimum(:age, :having => 'min(age) > 17',
:group => :last_name)
# Selects the minimum age for any family without any minors
Person.sum("2 * age")

You can also use count. Again, examples from the API:
Person.count
# returns the total count of all people
Person.count(:age)
# returns the total count of all people
# whose age is present in database
Person.count(:conditions => "age > 26")
Person.count(:conditions => "age > 26 AND job.salary > 60000",
:include => :job)
# because of the named association, it finds the DISTINCT
# count using LEFT OUTER JOIN.
Person.count(:conditions => "age > 26 AND job.salary > 60000",
:joins => "LEFT JOIN jobs on jobs.person_id =")
# finds the number of rows matching the conditions and joins.
Person.count('id', :conditions => "age > 26")
# Performs a COUNT(id)
Person.count(:all, :conditions => "age > 26")
# Performs a COUNT(*) (:all is an alias for '*')


Struggling with Ruby: Contents Page

Wednesday, 18 March 2009

Ruby dates and times

Ruby has three classes for handling time, Date, DateTime and Time.

Time is part of the core language, while Date and DateTime are part of the standard library; to use Date and DateTime you will need to require 'date'. Here is some code that initialises each with the current date/time.
t = Time.now
# => Fri Feb 06 08:56:27 +0000 2009
require 'date' # Needed for Date and DateTime
# => true
d = Date.today
# => #<Date: 4909737/2,0,2299161>
dt = DateTime.now
# => #<DateTime: 14140044704711/5760000,0,2299161>

As you can see, despite its name, the Time class holds the date as well as the time. The inspect method of Time gives a convenient output; not so Date or DateTime. However, the to_s method is a little more useful. The best option is to use the strftime method for all of them, which gives you full control over how times and dates are formatted.
p t.inspect
# => "Fri Feb 06 08:56:27 +0000 2009"
p d.inspect
# => "#<Date: 4909737/2,0,2299161>"
p t.to_s
# => "Fri Feb 06 08:56:27 +0000 2009"
p d.to_s
# => "2009-02-06"
p dt.to_s
# => "2009-02-06T08:56:10+00:00"
p d.strftime('%H%M on %d/%b/%y')
# => "0000 on 06/Feb/09"
p t.strftime('%H%M on %d/%b/%y')
# => "0856 on 06/Feb/09"
p dt.strftime('%H%M on %d/%b/%y')
# => "0856 on 06/Feb/09"

The full list of options (from here):
%a - The abbreviated weekday name ("Sun")
%A - The full weekday name ("Sunday")
%b - The abbreviated month name ("Jan")
%B - The full month name ("January")
%c - The preferred local date and time representation
%d - Day of the month (01..31)
%H - Hour of the day, 24-hour clock (00..23)
%I - Hour of the day, 12-hour clock (01..12)
%j - Day of the year (001..366)
%m - Month of the year (01..12)
%M - Minute of the hour (00..59)
%p - Meridian indicator ("AM" or "PM")
%S - Second of the minute (00..60)
%U - Week number of the current year, starting with the first Sunday as the first day of the first week (00..53)
%W - Week number of the current year, starting with the first Monday as the first day of the first week (00..53)
%w - Day of the week (Sunday is 0, 0..6)
%x - Preferred representation for the date alone, no time
%X - Preferred representation for the time alone, no date
%y - Year without a century (00..99)
%Y - Year with century
%Z - Time zone name
%% - Literal "%" character
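
As a quick illustration of combining these directives, here is a small sketch using a fixed moment (the one from the example output earlier) so that the results are repeatable. The describe_time helper is a hypothetical name, used only to bundle a few formats together.

```ruby
# describe_time is a hypothetical helper that applies a few
# strftime format strings to the same Time object.
def describe_time(time)
  {
    :long_date   => time.strftime('%A %d %B %Y'),
    :clock       => time.strftime('%I:%M %p'),
    :day_of_year => time.strftime('%j'),
    :compact     => time.strftime('%H%M on %d/%b/%y')
  }
end

sample = Time.utc(2009, 2, 6, 8, 56, 27)
describe_time(sample).each { |label, text| puts "#{label}: #{text}" }
# long_date: Friday 06 February 2009
# clock: 08:56 AM
# day_of_year: 037
# compact: 0856 on 06/Feb/09
```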

You can also create Date, Time and DateTime objects using the new method. Date and DateTime default to midnight on the 1st of January -4712 (that is, 4713 BC, day zero of the Julian Day count; the two-digit %y directive displays this year as 88, which is easy to mistake for 1988), while Time defaults to the current date and time. With Date and DateTime you can set specific values. The parameter list starts with the biggest unit, years, and gets smaller. Successive arguments are optional.
p Date.new.strftime('%H%M on %d/%b/%y')
# => "0000 on 01/Jan/88"
p DateTime.new.strftime('%H%M on %d/%b/%y')
# => "0000 on 01/Jan/88"
# (the year 2006 below is inferred from the "06" in the output)
p Date.new(2006).strftime('%H%M on %d/%b/%y')
# => "0000 on 01/Jan/06"
p Date.new(2006, 4).strftime('%H%M on %d/%b/%y')
# => "0000 on 01/Apr/06"
p Date.new(2006, 4, 7).strftime('%H%M on %d/%b/%y')
# => "0000 on 07/Apr/06"
p DateTime.new(2006, 4, 7).strftime('%H%M on %d/%b/%y')
# => "0000 on 07/Apr/06"
p DateTime.new(2006, 4, 7, 8).strftime('%H%M on %d/%b/%y')
# => "0800 on 07/Apr/06"
p DateTime.new(2006, 4, 7, 8, 23).strftime('%H%M on %d/%b/%y')
# => "0823 on 07/Apr/06"
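
Because %y only prints the last two digits of the year, it is worth inspecting the default date directly. A minimal sketch, assuming only the standard date library (default_date_parts is a hypothetical helper name):

```ruby
require 'date'

# default_date_parts is a hypothetical helper that returns the
# year, month and day of a Date created with no arguments.
def default_date_parts
  d = Date.new
  [d.year, d.month, d.day]
end

p default_date_parts
# => [-4712, 1, 1]

# With explicit arguments the values are what you would expect.
p Date.new(2006, 4, 7).to_s
# => "2006-04-07"
```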

Some other examples of using dates and times:
d2 = d1 >> 2 # d2 will be two months later than d1
d2 = d1 << 2 # d2 will be two months earlier than d1
d.wday # Day of week, Sunday = 0, Monday = 1
d.yday # Day of the year
dt.zone # Time zone
d.leap? # Leap year?

You can determine the difference between two dates just by taking one from the other. The complication is that the result is a Rational.
d1 = Date.new 2004
d2 = Date.new 2005
d2 - d1
# => Rational(366, 1)

A Rational object consists of two numbers. Basically it is a fraction; the first number goes on top, the second number on the bottom (in mathematics, a rational number is one that can be expressed as a fraction of two integers; as opposed to, for example, pi, which is an irrational number). The number of days between the 1st January 2004 and 2005 is 366 divided by 1. You can freely use Date and DateTime objects together; the result is always the number of days expressed as a fraction, as a Rational object. You can convert your Rational object to an array containing the hours, minutes, seconds and the fraction of a second with Date.day_fraction_to_time.
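
If Date.day_fraction_to_time is not available in your version of Ruby, the same breakdown can be done with ordinary arithmetic on the Rational. This is a sketch under that assumption; days_hours_minutes is a hypothetical helper name.

```ruby
require 'date'

# days_hours_minutes is a hypothetical helper. It splits a
# difference in days (a Rational) into whole days, hours
# and minutes.
def days_hours_minutes(diff)
  total_minutes = (diff * 24 * 60).round
  days, rest = total_minutes.divmod(24 * 60)
  hours, minutes = rest.divmod(60)
  [days, hours, minutes]
end

diff = DateTime.new(2005, 1, 1, 6, 30) - DateTime.new(2004, 1, 1)
p diff
# => (17581/48)
p days_hours_minutes(diff)
# => [366, 6, 30]
```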

According to here, Time is written in C, and is therefore some 20 times faster than Date/DateTime. However, where it is backed by a 32-bit Unix timestamp it can only handle dates from roughly 1970 to January 2038.


Sunday, 15 March 2009


In Ruby, instance variables are always private; they cannot be directly accessed from outside the object (constants are always public). However, it is often the case that you do need to allow some access. One way would be to define methods that set and get the values. For example:
class TestClass
  def initialize id, name
    @id = id
    @name = name
  end

  def id
    @id
  end

  def name
    @name
  end

  def name= s
    @name = s
  end
end

# Test it works properly
tc = TestClass.new 12, 'Boris'
tc.name = 'Alfie'
p tc.name

Note that the id cannot be set after the object is created; it is a read-only attribute.

Ruby offers a short cut for getters and setters. The above class can be re-written like this:
class TestClass
  def initialize id, name
    @id = id
    @name = name
  end

  attr_reader :id
  attr_accessor :name
end

The class behaves just the same, so the test code will work here as well, but all that clutter has been removed.

There is also a method for write-only attributes, attr_writer, and several attributes can be listed, separated with commas:
attr_reader :size, :address, :dir

What is happening is that attr_reader is a method (in the Module class), that takes the parameter :id, and dynamically defines the id method.
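
To show the idea of defining methods dynamically, here is a hand-rolled version. It is only a sketch: my_attr_reader and the Gadget class are hypothetical names, and the real attr_reader is built into Ruby rather than written like this.

```ruby
class Module
  # my_attr_reader is a hypothetical stand-in for attr_reader.
  # For each name it defines a method that returns the
  # instance variable of the same name.
  def my_attr_reader(*names)
    names.each do |name|
      define_method(name) { instance_variable_get("@#{name}") }
    end
  end
end

# Gadget is a hypothetical class used only to try it out.
class Gadget
  my_attr_reader :id, :name

  def initialize(id, name)
    @id = id
    @name = name
  end
end

g = Gadget.new(12, 'Boris')
p g.id
# => 12
p g.name
# => "Boris"
```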

Having said that, here is an interesting article (written for Java, but applicable to any object-orientated language) about why getters and setters are evil (sometimes):

Saturday, 14 March 2009


The Ruby interpreter will take anything that has a name beginning with a capital letter as being a constant.
# Variables
n = 67
s = 'my string'

# Constants
N = 89
TITLE = 'My great program'
class MyClass; end

Not really constant?
Variables and constants are really pointers to objects. This has some practical consequences that may be unexpected. Let us set some up:
S = s = 'My string'
N = n = 23
X = x = 12.6

Then we can see what happens when we modify the object:
s << ' is longer'
n += 4
x += 3.5

Perhaps the odd thing here is that modifying s also modifies the constant, S. But is that so odd? Both s and S point to the same string. Modify the string, and naturally what they both point to has changed. So what is surprising is that the others have not changed (as an aside, this is also the situation in Java, but in Java, numbers are primitives, not objects; in Ruby everything is an object). Numbers have one object each to represent each value (though they will not all exist in the virtual machine at any one time of course). Change the value, and the variable will point to a different instance that stands for the new value.

Note that for the string, I used << rather than +=. Although they both concatenate strings, the former changes the existing string, while the latter creates a new string. If I had used += in the above, s would change to point to the new string, while S would still point to the old string.
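
The difference between the two operators can be checked with equal?, which tests whether two references point to the same object. A minimal sketch, using two local variables in place of the variable and constant above (the method names are hypothetical):

```ruby
# Appends with <<, which changes the existing string, so both
# references still point to the same (now longer) object.
def append_in_place
  original = 'My string'.dup
  other = original
  original << ' is longer'
  [original, other, original.equal?(other)]
end

# Appends with +=, which builds a new string and re-points
# only the variable on the left-hand side.
def append_with_new_object
  original = 'My string'.dup
  other = original
  original += ' is longer'
  [original, other, original.equal?(other)]
end

p append_in_place
# => ["My string is longer", "My string is longer", true]
p append_with_new_object
# => ["My string is longer", "My string", false]
```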

So you can readily modify the object that a constant points to, but you cannot change what object it points to... Can you? Well, yes you can. The only issue is that the interpreter gives a warning. Here is an IRb session:
irb(main):011:0> Constant = 'My string'
=> "My string"
irb(main):012:0> Constant = 56
(irb):12: warning: already initialized constant Constant
=> 56

It turns out that constants are only really constant by convention, and there is nothing to stop you changing them and no guarantee that they will remain the same (just as setting a method as private is no guarantee that it cannot be invoked from anywhere).

Saturday, 7 March 2009

Ruby blocks

A block is a chunk of code. What is great in Ruby is the way they can be passed as an argument to a method. For example, that is what is happening here:
(0..10).each { |x|
  p x
}

The (0..10) is a Range object, with a method, each. The method is passed the block (everything between the curly braces). Blocks can be defined with do and end, rather than curly braces (but note that the precedence is different).
(0..10).each do |x|
  p x
end

Implicitly passing blocks
Here is an example of passing a block to a custom method, test. Note that test does not mention the block at all, so the block is considered to have been passed implicitly. Where the block is used is with the yield statement.
def test
  p yield(5, 'house')
  p yield(100, 'mansion')
end

test { |i, s|
  puts "You are in the block #{i} #{s}"
  "Returning #{i} #{s}"
}

# Output:

# You are in the block 5 house
# "Returning 5 house"
# You are in the block 100 mansion
# "Returning 100 mansion"

So what is going on? The test method is invoked, and passed a block of code. Inside the test method, Ruby executes each line until it reaches a yield statement, and when it does, it runs the code block passed to the method. The yield statement is kind of like a method call, in that you can pass it arguments and it can return a value too (although a block is more forgiving than a method about receiving the wrong number of arguments). You can think of yield as an alias for your block, so in the above example, yield is sent a number and a string, and returns a new string.

Naturally the block will be run every time Ruby encounters a yield statement, which is twice in the above example.
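
Here is a smaller sketch of the same idea that collects what the block returns instead of printing it. The method name collect_twice is hypothetical; block_given? is a standard Ruby method that reports whether a block was passed.

```ruby
# collect_twice is a hypothetical method. It yields twice and
# returns the two values the block gave back, or an empty
# array if it was called without a block.
def collect_twice
  return [] unless block_given?
  [yield(1), yield(2)]
end

p collect_twice { |n| n * 10 }
# => [10, 20]
p collect_twice
# => []
```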

Iterating with blocks
Okay, now it gets a bit hairy, and web pages that actually address this become correspondingly rare...

How do you get the method that is receiving a block to iterate over an Array or Hash? In this example, a method, test, is added to the class Array, then an Array object is created and test is invoked.
class Array
  def test
    total = 0
    each { |x|
      p yield(5 + x)
      total += x
    }
    total
  end
end

ma = [12, 34, 8]
p ma.test { |y|
  p y
  y - 5
}

# Output:

# 17
# 12
# 39
# 34
# 13
# 8
# 54

The first point to note is the each statement and its associated block. This is what allows us to iterate over the Array (or Hash). The each statement sets up a variable, x. This will take the value of each member of the array in turn, as normal. On the next line there is the yield statement. This invokes the block that was received, sending it the current value from the array, plus five.

The block is set up to accept a single value, and to call it y, which it then prints. It then returns this value minus five.

Back with the yield statement, and the test method prints the returned value. Then it adds the current value from the array, x, to the variable total. Once the loop finishes (all the array members have been done), total is used as a return value for the test method.

Back outside the test method, the returned value is printed.

Here are more useful examples for the Array class.

class Array
  # Allows you to loop over an array, accessing
  # both the index and the value
  def each_pair
    each_index { |i| yield i, fetch(i) }
  end

  # As the each method, but skips the first element
  def each_not_first
    each_index { |i| yield fetch(i) unless i == 0 }
  end

  # Returns a total over each element in the array
  # where the value for an element is determined
  # by the given block.
  def total &prc
    val = 0
    each { |e| val += prc.call(e) }
    val
  end

  # Returns an element that best fits the criteria
  # given by the block.
  # Note that I have used yield here, it seems to work
  # better if you have more than one parameter;
  # "warning: multiple values for a block parameter..."
  def find_best &prc
    best = first
    each_not_first { |e| best = e if yield(best, e) }
    best
  end
end

# Example array
ary = [
  {:name => 'one', :value => 56},
  {:name => 'two', :value => 79},
  {:name => 'three', :value => -5},
]

# This uses each_pair to print both the index,
# and the name of the item.
ary.each_pair { |index, item| puts "#{index}: #{item[:name]}" }

# This example uses total to add up the
# values of each element.
p ary.total { |e| e[:value] }

# This one uses find_best to get the element with the
# highest value.
p ary.find_best { |x, y| x[:value] < y[:value] }

Explicitly passing blocks
If you want to be able to handle the block other than through yield, you need to pass it explicitly. All this involves is listing it in the arguments. Note that the block must be last in the list, and has to be preceded by an ampersand (but the ampersand should not be present when used later in your code). When you do this, the block is converted to a Proc object (which I discussed here; note that you cannot pass a Proc object in lieu of a block unless you prefix it with an ampersand in the call).

def test &prc
  puts "The block is of the #{prc.class} class"
  puts prc.call('This') unless prc.nil?
end

test { |s| "#{s} does nothing" }
test

# Output:

# The block is of the Proc class
# This does nothing
# The block is of the NilClass class

Apparently, this is significantly slower than using implicit passing. The block is invoked with call; arguments to that are passed to the block.

Note that you cannot set a default value for a block in the arguments of a method; however, as shown above, if no block is given the variable will be set to nil.
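
One way to get the effect of a default is to assign a fallback inside the method when the variable is nil. This is only a sketch; transform and upcaser are hypothetical names. It also shows an existing Proc being handed over as the block by prefixing it with an ampersand in the call.

```ruby
# transform is a hypothetical method. If no block is given,
# prc is nil, so a do-nothing lambda is used instead.
def transform(text, &prc)
  prc ||= lambda { |s| s }
  prc.call(text)
end

upcaser = lambda { |s| s.upcase }

p transform('quiet')
# => "quiet"
p transform('quiet') { |s| s.reverse }
# => "teiuq"
p transform('quiet', &upcaser)
# => "QUIET"
```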

Some languages have a 'with' statement (Visual Basic and Pascal, for example). This example adds that functionality to Ruby, and also compares explicit and implicit passing:
# Define a complicated data structure
data = { :ary => %w(one two three four) }

# Define a method, 'use', for all objects
# This version uses explicit passing
class Object
  def use &prc
    prc.call self
  end
end

# This version uses implicit passing
class Object
  def use
    yield self
  end
end

# invoke 'use' on a specific member of the data structure
data[:ary][2].use do |x|
  # Do stuff with x, rather than data[:ary][2]
  p "The number is #{x}"
end
