Saturday, 22 November 2008

Ruby File Access

Reading Files
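Note that while Ruby accepts backslashes in file paths, forward slashes are the safer choice, even on Windows: inside a double-quoted string a backslash starts an escape sequence, so "C:\temp" contains a tab character.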

You can access the contents of a file like this:
file = File.open(filename, 'r')
do_stuff_with file
file.close

To test if the end of the file has been reached, use file.eof?. To test if a file has been closed, use file.closed?.
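
As a minimal sketch of those two tests, the following reads a small file line by line until eof? is true, then checks closed? before and after closing. The temporary file and the variable names are only for illustration.

```ruby
require "tempfile"

# Create a throwaway two-line file to read from (sample data for illustration).
sample = Tempfile.new("eof_demo")
sample.write("alpha\nbeta\n")
sample.close

file = File.open(sample.path, "r")
lines_read = []
# Keep reading lines until eof? reports that nothing is left.
lines_read << file.gets until file.eof?
at_end = file.eof?           # must be asked while the file is still open
open_before = file.closed?   # the file has not been closed yet
file.close
closed_after = file.closed?  # now it has

sample.unlink
```

Note that calling eof? on a file that has already been closed raises an IOError, so ask before you close.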

You can read the entire contents of a file in a block:
contents = File.open(filename, 'r') { |file| file.read }

Or just (though this form leaves the file open until the File object is garbage collected):
contents = File.open('test.txt', 'r').read

If you prefer, you can put each line of the file into an array:
lines = []
File.open(filename, 'rb') { |file|
  file.each_line { |line|
    lines << line
  }
}

Note that file here is the same kind of object (a File) as in the very first example. The big advantage of using it in a block is that Ruby will ensure the file is closed for you, even if an exception is raised inside the block. You can use each_byte to iterate through the bytes (or, of course, your custom do_stuff_with file method). You can break lines at any character, for example, at the full stops:
lines = []
File.open(filename, 'rb') { |file|
  file.each_line('.') { |line|
    lines << line
  }
}

Alternatively, use IO.foreach:
lines = []
IO.foreach(filename, '.') { |line|
  lines << line
}

Or simply use IO.readlines (why was it not named read_lines?):
lines = IO.readlines(filename, '.')
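
One detail worth seeing in action is that each piece keeps its separator. Here is a small sketch, using a temporary file invented for the purpose, that splits at full stops and also collects the bytes with each_byte:

```ruby
require "tempfile"

# Sample text for illustration only.
sample = Tempfile.new("separator_demo")
sample.write("One. Two. Three")
sample.close

# Splitting at full stops: every piece keeps its trailing "." (the last has none).
pieces = File.readlines(sample.path, ".")

# each_byte yields one Integer per byte, not one-character strings.
byte_values = []
File.open(sample.path, "rb") { |file|
  file.each_byte { |b| byte_values << b }
}

sample.unlink
```

If you do not want the separators, strip them yourself afterwards, for example with pieces.map { |s| s.chomp(".") }.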


Writing Files
To write data to a file, you can do something like this:
file = File.new("test.txt", "w")
file.syswrite("First line of text\n")   # syswrite does not add a newline for you
file.syswrite("Second line of text\n")
file.close

However, the preferred way, as with reading a file, is to do it in a block, and leave it to Ruby to close the file:
File.open("test.txt", "w") { |file|
  file.syswrite("First line of text\n")
  file.syswrite("Second line of text\n")
}

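As well as syswrite, you can also use write, print, puts (adds a newline at the end of the line unless one is already there) and printf (formatted print, just like the C function of the same name). Note that p is not an alias for puts and cannot be called on a file: it is a private Kernel method that prints the inspect form of its argument to standard output. Or you can use the append operator, <<: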
File.open("test.txt", "w") { |file|
  file << "First line of text\n"    # << does not add a newline either
  file << "Second line of text\n"
}
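
To make the differences between these methods concrete, here is a short sketch that uses each of them once and then reads the result back. The temporary file is only there so the example can be run anywhere.

```ruby
require "tempfile"

sample = Tempfile.new("write_demo")
sample.close

File.open(sample.path, "w") { |file|
  file.write("a")            # no newline added
  file.print("b")            # no newline added
  file.puts("c")             # newline added
  file << "d" << "e\n"       # << returns the file, so calls can be chained
  file.printf("%03d\n", 7)   # formatted output, C style
}

written = File.read(sample.path)
sample.unlink
```

Only puts adds a newline by itself; with everything else you have to supply your own.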


Binary Files
Binary files are just the same; you just add a b to the second parameter of the open method. This example reads the contents of a binary file to contents (which is a string, by the way), and then writes that to a new binary file:
contents = File.open(in_filename, 'rb').read

File.open(out_filename, 'wb') { |file|
  file << contents
}
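
A shorter route to the same result is File.binread and File.binwrite, which open, read or write, and close in one call. This is a sketch rather than the method used in this post; copy_binary is a hypothetical helper name and the sample bytes are made up.

```ruby
require "tempfile"

# Hypothetical helper, not part of Ruby: copies a file byte for byte.
def copy_binary(in_filename, out_filename)
  contents = File.binread(in_filename)    # a String with binary (ASCII-8BIT) encoding
  File.binwrite(out_filename, contents)   # returns the number of bytes written
end

# Build a small binary source file containing bytes that text mode might alter.
source = Tempfile.new("bin_in")
source.binmode
source.write([0, 255, 10, 13].pack("C*"))
source.close

target = Tempfile.new("bin_out")
target.close

bytes_written = copy_binary(source.path, target.path)
```

Both files are closed as soon as each call returns, so nothing is left open.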

Manipulating Files
These are all pretty self-explanatory:
File.rename(old_filename, new_filename)
File.exist?(filename) # File.exists? was deprecated and then removed in Ruby 3.2
File.file?(filename)
File.delete(filename)
File.directory?(filename)
File.executable?(filename) # Returns true for a text file!
File.readable?(filename)
File.writable?(filename)
File.zero?(filename)
File.size(filename) # Returns 0 if the size is zero
File.size?(filename) # Returns nil if the size is zero
File.ftype(filename)
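
To see what a few of these return, here is a quick sketch that runs them against one empty and one five-byte temporary file. The file names and the results hash are purely illustrative.

```ruby
require "tempfile"

# One empty file and one file holding five bytes.
empty = Tempfile.new("empty_demo")
empty.close
full = Tempfile.new("full_demo")
full.write("12345")
full.close

results = {
  exist:        File.exist?(full.path),
  file:         File.file?(full.path),
  directory:    File.directory?(full.path),
  zero_empty:   File.zero?(empty.path),
  size_empty:   File.size(empty.path),    # 0 for an empty file
  size_q_empty: File.size?(empty.path),   # nil for an empty file
  size_q_full:  File.size?(full.path),    # the size otherwise
  ftype:        File.ftype(full.path)     # "file", "directory", "link", ...
}
```

Because size? gives nil for an empty file, it is handy in conditions such as if File.size?(filename), which is true only for a file that exists and has content.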

Using Directories
You can list the contents of a directory in at least three ways:
Dir.entries(dir)  # Surprisingly, no default for current directory
Dir["#{dir}/*.txt"]
`dir`

The first and second will produce arrays of the files in the given folders. The first has just the filenames, and starts with entries for "." and "..". The second has the full path for each file, but does have the facility to filter (as in the example, only .txt files will be listed). The third uses a backquoted string to invoke a system command, and will produce a flat string with the same contents that you would see if you typed "dir" at the command prompt (presumably an error on some operating systems).
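
Here is a sketch of the first two approaches run against a temporary directory, so the difference in what they return is visible. It also uses Dir.children, which assumes Ruby 2.5 or later and leaves out the "." and ".." entries. The file names are invented.

```ruby
require "tmpdir"

entries = children = txt_files = nil
Dir.mktmpdir { |dir|
  # Populate the temporary directory with two files and a subfolder.
  File.write(File.join(dir, "a.txt"), "a")
  File.write(File.join(dir, "b.log"), "b")
  Dir.mkdir(File.join(dir, "sub"))

  entries   = Dir.entries(dir).sort     # bare names, including "." and ".."
  children  = Dir.children(dir).sort    # bare names, without "." and ".."
  # The glob form returns full paths, so reduce them to names for comparison.
  txt_files = Dir["#{dir}/*.txt"].map { |path| File.basename(path) }
}
```

The sort calls are there because the order in which the operating system returns directory entries is not something to rely on.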

Other methods:
Dir.pwd                  # Gets the current directory
Dir.chdir('c:') # Changes the current directory (to c:)
Dir.mkdir('newfolder') # Create a new directory
Dir.delete('newfolder') # Or unlink or rmdir
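
Dir.chdir also accepts a block, in which case the previous working directory is restored when the block finishes. A minimal sketch, using a temporary directory so that nothing on your disk is touched:

```ruby
require "tmpdir"

before = Dir.pwd
inside = nil
Dir.mktmpdir { |dir|
  Dir.chdir(dir) {
    # Relative paths now resolve against the temporary directory.
    Dir.mkdir("newfolder")
    inside = File.directory?("newfolder")
    Dir.rmdir("newfolder")
  }
}
after = Dir.pwd   # back where we started
```

This avoids the usual problem with a bare Dir.chdir, where the rest of the program silently runs in the new directory.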


Putting It Together
Here is an actual program. It uses the directory functionality to create an array of file names conforming to the filter (in this case, files ending in .nif, the NetImmersion format), then iterates through the array. For each filename, the file is opened in binary format, the contents are read into a string, and then each occurrence of one set of strings is replaced by another (subst is an array of hashes, giving replacement names for texture files). Finally, the file is saved, again in binary format, with a new filename.
files = Dir["#{mesh_dir}/in_r*.nif"]
print "Found #{files.length} files.\n"
Dir.chdir(mesh_dir)
files.each { |f|
  contents = File.open(f, 'rb').read
  subst.each { |h|
    contents.gsub!(h[:old], h[:new])
  }
  File.open(f.gsub(/In_R/i, 'And'), 'wb') { |file|
    file << contents
  }
  print '.'
}
print "\nDone.\n"
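
The same steps can be wrapped in a method and tried out on a temporary directory. This is a sketch under assumptions, not the program above: rename_textures is a hypothetical name, the sample file and texture names are made up, and the rename is applied to the file name only, so a directory that happens to contain "in_r" is left alone.

```ruby
require "tmpdir"

# Hypothetical helper: rewrites every in_r*.nif file in mesh_dir and
# returns the paths of the new files.
def rename_textures(mesh_dir, subst)
  Dir[File.join(mesh_dir, "in_r*.nif")].map { |path|
    contents = File.binread(path)
    subst.each { |h| contents.gsub!(h[:old], h[:new]) }
    # Change only the file name, not the directory part of the path.
    new_name = File.basename(path).sub(/\Ain_r/i, "And")
    new_path = File.join(File.dirname(path), new_name)
    File.binwrite(new_path, contents)
    new_path
  }
end

output_names = output_text = nil
Dir.mktmpdir { |dir|
  # A stand-in for a real .nif file: some text with an embedded zero byte.
  File.binwrite(File.join(dir, "in_r_wall.nif"), "tex_old.dds\x00tail")
  subst = [{ old: "tex_old", new: "tex_new" }]
  output_names = rename_textures(dir, subst).map { |path| File.basename(path) }
  output_text  = File.binread(File.join(dir, "And_wall.nif"))
}
```

Returning the new paths from the method makes it easy to report what was written, or to check the result as done here.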


File data in with code?
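If you put __END__ on a line by itself in your code, Ruby stops reading code at that point. Everything beyond that will be considered data, and can be accessed through DATA, which behaves like an open file (it is only defined for the main script, not for files loaded with require). Here is a complete program to illustrate that: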
p DATA.read

__END__

Your data goes here


Reading a CSV file
This is trivial, as Ruby provides a library to do just this.
require "csv"
values = CSV.read "C:/my-data.csv"

This gives you your data in a two-dimensional array: an array of rows, each of which is an array of strings.
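
Here is a small sketch of what comes back, using a temporary file with invented contents. It also shows the headers: true option, which treats the first row as column names. Depending on your Ruby version, the csv library may need to be installed as a gem.

```ruby
require "csv"
require "tempfile"

# Sample data for illustration; the quoted field contains a comma.
sample = Tempfile.new(["csv_demo", ".csv"])
sample.write("name,qty\nwidget,3\n\"nut, hex\",12\n")
sample.close

# Plain read: an array of rows, header row included, every cell a String.
rows = CSV.read(sample.path)

# With headers: each row can be turned into a Hash keyed by column name.
with_headers = CSV.read(sample.path, headers: true).map { |row| row.to_h }

sample.unlink
```

Numbers are not converted for you, so "3" stays a string until you call to_i on it.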


1 comment:

Mina said...

Regarding reading an entire file in one chunk, this can't be beat:
File.read "/path/to/file"