[book of ruby]Chapter 5: Loops and Iterators
Original author: http://www.verydemo.com/demo_c119_i3030.html
Much of programming is concerned with repetition. Maybe you want your program to beep ten times, to read lines from a file for as long as there are more lines to read, or to display a warning until the user presses a key. Ruby provides a number of ways of performing this kind of repetition.
FOR LOOPS
In many programming languages, when you want to run a bit of code a certain number of times you can just put it inside a for loop. In most languages, you give a for loop a variable initialized with a starting value which is incremented by 1 on each turn through the loop until it meets some specific ending value. When the ending value is met, the for loop stops running. Here's a version of this traditional type of for loop written in Pascal:
(* This is Pascal code, not Ruby! *)
for i := 1 to 3 do
  writeln( i );
You may recall from the last chapter that Ruby's for loop doesn't work like this at all! Instead of giving it a starting and ending value, we give the for loop a list of items and it iterates over them, one by one, assigning each value in turn to a loop variable until it gets to the end of the list.
For example, here is a for loop that iterates over the items in an array, displaying each in turn:
# This is Ruby code...
for i in [1,2,3] do
  puts( i )
end
The for loop is more like the 'for each' iterator provided by some other programming languages. The items over which the loop iterates don't have to be integers. This works just as well...
for s in ['one','two','three'] do
  puts( s )
end
The author of Ruby describes for as 'syntax sugar' for the each method, which is implemented by collection types such as Arrays, Sets, Hashes and Ranges. (In Ruby 1.8, Strings also had an each method; it was removed in Ruby 1.9, where each_char, each_line and each_byte are used instead.) For the sake of comparison, this is one of the for loops shown above rewritten using the each method:
[1,2,3].each do |i|
  puts( i )
end
As you can see, there isn't really all that much difference. To convert the for loop to an each iterator, all I've had to do is delete for and in and append .each to the array. Then I've put the iterator variable, i, between a pair of upright bars after do. Compare these other examples to see just how similar for loops are to each iterators:
# --- Example 1 ---
# i) for
for s in ['one','two','three'] do
  puts( s )
end

# ii) each
['one','two','three'].each do |s|
  puts( s )
end

# --- Example 2 ---
# i) for
for x in [1, "two", [3,4,5] ] do
  puts( x )
end

# ii) each
[1, "two", [3,4,5] ].each do |x|
  puts( x )
end
Note, incidentally, that the do keyword is optional in a for loop that spans multiple lines but it is obligatory when it is written on a single line:
# Here the "do" keyword can be omitted
for s in ['one','two','three']
  puts( s )
end

# But here it is required
for s in ['one','two','three'] do puts( s ) end
How to write a 'normal' for loop...
If you miss the traditional type of for loop, you can always 'fake' it in Ruby by using a for loop to iterate over the values in a range. For example, this is how to use a for loop variable to count up from 1 to 10, displaying its value at each turn through the loop:
for i in (1..10) do puts( i ) end
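Because for accepts anything that has an each method, the same 'fake' counting loop can also count in steps or count down. This is only a minimal sketch, assuming Ruby 1.9 or later (where step and downto return an enumerator when called without a block); the variable names odds and countdown are made up for illustration:

```ruby
# Count up from 1 to 10 in steps of 2, using a stepped range.
odds = []
for i in (1..10).step(2)
  odds << i
end
p odds        # prints [1, 3, 5, 7, 9]

# Count down from 3 to 1.
countdown = []
for i in 3.downto(1)
  countdown << i
end
p countdown   # prints [3, 2, 1]
```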
This example shows how both for and each can be used to iterate over the values in a range:
# for
for s in 1..3
  puts( s )
end

# each
(1..3).each do |s|
  puts(s)
end
Note, incidentally, that a range expression such as 1..3 must be enclosed between round brackets when used with the each method, otherwise Ruby assumes that you are attempting to use each as a method of the final integer (an Integer; a Fixnum in Ruby versions before 2.4) rather than of the entire expression (a Range). The brackets are optional when a range is used in a for loop.
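The effect of the missing brackets can be seen in a short sketch. The variable names are illustrative, and the exact wording of the error message varies between Ruby versions, so the code only captures it rather than relying on its text:

```ruby
collected = []

# With brackets, each is called on the whole Range.
(1..3).each { |n| collected << n }
p collected   # prints [1, 2, 3]

error_message = nil
begin
  # Without brackets this is parsed as 1..(3.each { ... }),
  # so each is sent to the integer 3, which has no such method.
  1..3.each { |n| collected << n }
rescue NoMethodError => e
  error_message = e.message
end
puts error_message
```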
MULTIPLE ITERATOR ARGUMENTS
You may recall that in the last chapter we used a for loop with more than one loop variable. We did this in order to iterate over a multi-dimensional array. On each turn through the for loop, a variable was assigned one row (that is, one 'sub-array') from the outer array:
# Here multiarr is an array containing two "rows"
# (sub-arrays) at index 0 and 1
multiarr = [ ['one','two','three','four'],
             [1,2,3,4] ]

# This for loop runs twice (once for each "row" of multiarr)
for (a,b,c,d) in multiarr
  print("a=#{a}, b=#{b}, c=#{c}, d=#{d}\n" )
end
The above loop prints this:
a=one, b=two, c=three, d=four
a=1, b=2, c=3, d=4
We could use the each method to iterate over the same array. On each iteration, the four items of the current row are passed into the block delimited by do and end, where they initialize four 'block parameters' - a, b, c, d:
multiarr.each do |a,b,c,d|
  print("a=#{a}, b=#{b}, c=#{c}, d=#{d}\n" )
end
Block Parameters
In Ruby the body of an iterator is called a 'block' and any variables declared between upright bars at the top of a block are called 'block parameters'. In a way, a block works like a function and the block parameters work like a function's argument list. The each method runs the code inside the block and passes to it the arguments supplied by a collection (such as the array, multiarr). In the example above, the each method repeatedly passes an array of four elements to the block and those elements initialize the four block parameters, a, b, c, d. Blocks can be used for other things, in addition to iterating over collections. I'll have more to say on blocks in Chapter 10.
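The way a sub-array is spread across the block parameters can be sketched as follows. The array rows and the collecting arrays are invented for illustration; the second row is deliberately short to show that parameters with no matching element are set to nil:

```ruby
rows = [ ['one', 'two', 'three', 'four'], [1, 2] ]

# Four block parameters: each sub-array is split across a, b, c, d.
seen = []
rows.each do |a, b, c, d|
  seen << [a, b, c, d]
end
p seen[0]   # prints ["one", "two", "three", "four"]
p seen[1]   # prints [1, 2, nil, nil]

# One block parameter: the whole sub-array arrives as a single value.
whole = []
rows.each { |row| whole << row }
p whole[1]  # prints [1, 2]
```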
BLOCKS
Ruby has an alternative syntax for delimiting blocks. Instead of using do..end, you can use curly braces {..} like this:
# do..end
[[1,2,3],[3,4,5],[6,7,8]].each do |a,b,c|
  puts( "#{a}, #{b}, #{c}" )
end

# curly braces {..}
[[1,2,3],[3,4,5],[6,7,8]].each{ |a,b,c|
  puts( "#{a}, #{b}, #{c}" )
}
No matter which block delimiters you use, you must ensure that the opening delimiter, '{' or 'do', is placed on the same line as the each method. Inserting a line break between each and the opening block delimiter is a syntax error.
WHILE LOOPS
Ruby has a few other loop constructs too. This is how to do a while loop:
while tired
  sleep
end
Or, to put it another way:
sleep while tired
Even though the syntax of these two examples is different they perform the same function. In the first example, the code between while and end (here a call to a method named sleep) executes just as long as the Boolean condition (which, in this case, is the value returned by a method called tired) evaluates to true. As in for loops the keyword do may optionally be placed between the test condition and the code to be executed when these appear on separate lines; the do keyword is obligatory when the test condition and the code to be executed appear on the same line.
WHILE MODIFIERS
In the second version of the loop (sleep while tired), the code to be executed (sleep) precedes the test condition (while tired). This syntax is called a 'while modifier'. When you want to execute several expressions using this syntax, you can put them between the begin and end keywords:
begin
  sleep
  snore
end while tired
This is an example showing the various alternative syntaxes:
$hours_asleep = 0

def tired
  if $hours_asleep >= 8 then
    $hours_asleep = 0
    return false
  else
    $hours_asleep += 1
    return true
  end
end

def snore
  puts('snore....')
end

def sleep
  puts("z" * $hours_asleep )
end

while tired do sleep end    # a single-line while loop

while tired                 # a multi-line while loop
  sleep
end

sleep while tired           # single-line while modifier

begin                       # multi-line while modifier
  sleep
  snore
end while tired
The last example above (the multi-line while modifier) needs close consideration as it introduces some important new behaviour. When a block of code delimited by begin and end precedes the while test, that code always executes at least once. In the other types of while loop, the code may never execute at all if the Boolean condition initially evaluates to false.
Ensuring a Loop Executes At Least Once
Usually a while loop executes 0 or more times since the Boolean test is evaluated before the loop executes; if the test returns false at the outset, the code inside the loop never runs.
However, when the while test follows a block of code enclosed between begin and end, the loop executes 1 or more times as the Boolean expression is evaluated after the code inside the loop executes.
To appreciate the differences in behaviour of these two types of while loop, run 2loops.rb.
These examples should help to clarify:
x = 100

# The code in this loop never runs
while (x < 100) do puts('x < 100') end

# The code in this loop never runs
puts('x < 100') while (x < 100)

# But the code in this loop runs once
begin
  puts('x < 100')
end while (x < 100)
UNTIL LOOPS
Ruby also has an until loop which can be thought of as a 'while not' loop. Its syntax and options are the same as those applying to while - that is, the test condition and the code to be executed can be placed on a single line (in which case the do keyword is obligatory) or they can be placed on separate lines (in which case do is optional).
There is also an until modifier which lets you put the code before the test condition and an option to enclose the code between begin and end in order to ensure that the code block is run at least once.
Here are some simple examples of until loops:
i = 10
until i == 10 do puts(i) end # never executes
until i == 10            # never executes
  puts(i)
  i += 1
end

puts(i) until i == 10    # never executes

begin                    # executes once
  puts(i)
end until i == 10
Both while and until loops can, just like a for loop, be used to iterate over arrays and other collections. For example, this is how to iterate over all the elements in an array:
arr = ['one','two','three']    # sample array (assumed for illustration)

i = 0
while i < arr.length
  puts(arr[i])
  i += 1
end

i = 0
until i == arr.length
  puts(arr[i])
  i += 1
end
LOOP
The examples in 3loops.rb should all look pretty familiar - with the exception of the last one:
loop {
  puts(arr[i])
  i+=1
  if (i == arr.length) then
    break
  end
}
This uses the loop method repeatedly to execute the block enclosed by curly braces. This is just like the iterator blocks we used earlier with the each method. Once again, we have a choice of block delimiters - either curly braces or do and end:
puts( "\nloop" )
i=0
loop do
  puts(arr[i])
  i+=1
  if (i == arr.length) then
    break
  end
end
This code iterates through the array, arr, by incrementing a counter variable, i, and breaking out of the loop when the (i == arr.length) condition evaluates to true. You have to break out of a loop in this way since, unlike while or until, the loop method does not evaluate a test condition to determine whether or not to continue looping. Without a break it would loop forever.
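The same exit test is often written more compactly with an if modifier. The following is only a sketch: the array contents are sample data assumed for illustration, and visited is a made-up variable used to record what the loop saw:

```ruby
arr = ['one', 'two', 'three']   # sample data, assumed for illustration
visited = []
i = 0

loop do
  visited << arr[i]
  i += 1
  break if i == arr.length      # same exit test, written as an if modifier
end

p visited   # prints ["one", "two", "three"]
```

Note that, because the test comes at the end of the block, the body runs at least once, much as it does in the begin..end while form described earlier.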
Digging Deeper
Hashes, Arrays, Ranges and Sets all include a Ruby module called Enumerable. A module is a sort of code library (I'll have more to say about modules in Chapter 12). In Chapter 4, I used the Comparable module to add comparison methods such as < and > to an array. You may recall that I did this by subclassing the Array class and 'including' the Comparable module into the subclass:
class Array2 < Array
  include Comparable
end
THE ENUMERABLE MODULE
The Enumerable module is already included into the Ruby Array class and it provides arrays with a number of useful methods such as include? which returns true if a specific value is found in an array, min which returns the smallest value, max which returns the largest and collect which creates a new array made up of values returned from a block:
arr = [1,2,3,4,5]
y = arr.collect{ |i| i }      #=> y = [1, 2, 3, 4, 5]
z = arr.collect{ |i| i * i }  #=> z = [1, 4, 9, 16, 25]
arr.include?( 3 )             #=> true
arr.include?( 6 )             #=> false
arr.min                       #=> 1
arr.max                       #=> 5
These same methods are available to other collection classes just as long as those classes include Enumerable. Hash is such a class. Remember, however, that the items in a Hash are not indexed numerically, so the min and max methods do not return the 'first' and 'last' items. Instead they return the items that compare lowest and highest. When the keys are strings, the comparison is made on the character codes (the ASCII values) of the characters in each key.
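A short sketch may make the default behaviour concrete. The hash below is sample data assumed for illustration; each item reaches min, max and collect as a two-element array of key and value, and string keys are compared character by character:

```ruby
h = { 'one' => 'for sorrow', 'two' => 'for joy', 'three' => 'for a girl' }

# 'one' sorts before 'three', which sorts before 'two'.
lowest  = h.min
highest = h.max
p lowest    # prints ["one", "for sorrow"]
p highest   # prints ["two", "for joy"]

# collect also comes from Enumerable; the pair is split into key and value.
upper_keys = h.collect { |key, value| key.upcase }
p upper_keys   # prints ["ONE", "TWO", "THREE"]
```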
CUSTOM COMPARISONS
But let's suppose you would prefer min and max to return items based on some other criterion (say the length of a string)? The easiest way to do this would be to define the nature of the comparison inside a block. This is done in a similar manner to the sorting blocks I defined in Chapter 4. You may recall that we sorted a Hash (here the variable h) by passing a block to the sort method like this:
h.sort{ |a,b| a.to_s <=> b.to_s }
The two parameters, a and b, represent two items from the Hash which are compared using the <=> comparison method. We can similarly pass blocks to the max and min methods:
h.min{ |a,b| a[0].length <=> b[0].length }
h.max{ |a,b| a[0].length <=> b[0].length }
When a Hash passes items into a block it does so in the form of arrays, each of which contains a key-value pair. So, if a Hash contains items like this:
{"one"=>"for sorrow", "two"=>"for joy"}
then the two block parameters, a and b, would be initialized with these two arrays:
a = ["one","for sorrow"]
b = ["two","for joy"]
This explains why the two blocks in which I have defined custom comparisons for the max and min methods specifically compare the first elements, at index 0, of the two block parameters:
a[0].length <=> b[0].length
This ensures that the comparisons are based on the keys in the Hash.
If you want to compare the values rather than the keys, just set the array indexes to 1:
p( h.min{|a,b| a[1].length <=> b[1].length } )
p( h.max{|a,b| a[1].length <=> b[1].length } )
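To see the key-based and value-based comparisons side by side, here is a minimal sketch. The hash contents are sample data assumed for illustration and chosen so that no two keys or values have the same length. The last line uses min_by, another Enumerable method, as an alternative way of expressing the same idea:

```ruby
h = {
  'one'   => 'for sorrow',
  'four'  => 'for a boy',
  'seven' => 'for a secret never to be told'
}

# Compare on the length of the key (index 0 of each pair).
shortest_key = h.min { |a, b| a[0].length <=> b[0].length }
longest_key  = h.max { |a, b| a[0].length <=> b[0].length }
p shortest_key   # prints ["one", "for sorrow"]
p longest_key    # prints ["seven", "for a secret never to be told"]

# Compare on the length of the value (index 1 of each pair).
shortest_value = h.min { |a, b| a[1].length <=> b[1].length }
p shortest_value # prints ["four", "for a boy"]

# min_by takes a block that returns the value to compare on.
also_shortest = h.min_by { |key, value| value.length }
p also_shortest  # prints ["four", "for a boy"]
```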
You could, of course, define other types of custom comparisons in your blocks. Let's suppose, for example, that you want the strings 'one', 'two', 'three' and so on, to be evaluated in the order in which we would speak them. One way of doing this would be to create an ordered array of strings:
str_arr=['one','two','three','four','five','six','seven']
Now, if a Hash, h, contains these strings as keys, a block can use str_arr as a reference in order to determine the minimum and maximum values:
h.min{|a,b| str_arr.index(a[0]) <=> str_arr.index(b[0])}
   #=> ["one", "for sorrow"]
h.max{|a,b| str_arr.index(a[0]) <=> str_arr.index(b[0])}
   #=> ["seven", "for a secret never to be told"]
All the examples above use the min and max methods of the Array and Hash classes. Remember that these methods are provided to those classes by the Enumerable module.
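A runnable sketch of this idea follows. The three-item hash is sample data assumed for illustration; it is compared once using the reference array and once using the default comparison, to show where the results differ:

```ruby
str_arr = ['one', 'two', 'three', 'four', 'five', 'six', 'seven']

h = {
  'two'   => 'for joy',
  'seven' => 'for a secret never to be told',
  'one'   => 'for sorrow'
}

# Compare keys by their position in str_arr (spoken order).
first_spoken = h.min { |a, b| str_arr.index(a[0]) <=> str_arr.index(b[0]) }
last_spoken  = h.max { |a, b| str_arr.index(a[0]) <=> str_arr.index(b[0]) }
p first_spoken   # prints ["one", "for sorrow"]
p last_spoken    # prints ["seven", "for a secret never to be told"]

# The default comparison is alphabetical, so 'two' is the highest key.
last_alphabetical = h.max
p last_alphabetical   # prints ["two", "for joy"]
```

One caveat with this approach: index returns nil for a key that does not appear in str_arr, so the comparison would fail for such a key.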
There may be occasions when it would be useful to be able to apply Enumerable methods such as max, min and collect to classes which do not descend from existing classes (such as Array) which implement those methods. You can do that by including the Enumerable module in your class and then writing an iterator method called each like this:
class MyCollection
  include Enumerable

  def initialize( someItems )
    @items = someItems
  end

  def each
    @items.each{ |i|
      yield( i )
    }
  end
end
Here you could initialize a MyCollection object with an array, which will be stored in the instance variable, @items. When you call one of the methods provided by the Enumerable module (such as min, max or collect) this will, 'behind the scenes', call the each method in order to obtain each piece of data one at a time.
Now you can use the Enumerable methods with your MyCollection objects:
things = MyCollection.new(['x','yz','defgh','ij','klmno'])

p( things.min )      #=> "defgh"
p( things.max )      #=> "yz"
p( things.collect{ |i| i.upcase } )
                     #=> ["X", "YZ", "DEFGH", "IJ", "KLMNO"]
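Other Enumerable methods become available in the same way, since they all rely on each. The sketch below repeats the class so that it can be run on its own. The line that returns an enumerator when no block is given is an addition of mine, following a common Ruby convention; it is not part of the class as described in this chapter:

```ruby
class MyCollection
  include Enumerable

  def initialize(some_items)
    @items = some_items
  end

  def each
    # Assumption: a common convention, not part of the original class.
    return to_enum(:each) unless block_given?
    @items.each { |i| yield(i) }
  end
end

things = MyCollection.new(['x', 'yz', 'defgh', 'ij', 'klmno'])

p things.sort                          # prints ["defgh", "ij", "klmno", "x", "yz"]
p things.select { |s| s.length > 2 }   # prints ["defgh", "klmno"]
p things.include?('ij')                # prints true
p things.each.next                     # prints "x"
```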
You could similarly use your MyCollection class to process arrays such as the keys or values of Hashes. Currently the min and max methods adopt the default behaviour of comparing strings character by character, so 'xy' will be considered to be 'higher' than 'abcd' on the basis of the characters' ASCII values. If you want to perform some other type of comparison - say, by string length, so that 'abcd' would be deemed to be higher than 'xy' - you can just override the min and max methods:
def min
  @items.to_a.min{|a,b| a.length <=> b.length }
end

def max
  @items.to_a.max{|a,b| a.length <=> b.length }
end
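Put together in one class, the overrides might look like the sketch below. The class name LengthCollection and the sample strings are hypothetical, chosen so that every string has a different length:

```ruby
class LengthCollection
  include Enumerable

  def initialize(some_items)
    @items = some_items
  end

  def each
    @items.each { |i| yield(i) }
  end

  # Override the Enumerable versions to compare by string length.
  def min
    @items.to_a.min { |a, b| a.length <=> b.length }
  end

  def max
    @items.to_a.max { |a, b| a.length <=> b.length }
  end
end

words = LengthCollection.new(['abcd', 'xy', 'z'])
p words.min    # prints "z"
p words.max    # prints "abcd"
p words.sort   # prints ["abcd", "xy", "z"]
```

Only min and max are affected; methods that have not been overridden, such as sort, still use the default comparison.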
Each and Yield...
def each
  @items.each{ |i|
    yield( i )
  }
end
The keyword yield is a special bit of Ruby magic which here tells the code to run the block that was passed to the each method - that is, to run the code supplied by the Enumerable module's min, max or collect methods. This means that the code of those methods can be used with all kinds of different types of collection. All you have to do is:
i) include the Enumerable module in your class, and
ii) write an each method which determines which values will be used by the Enumerable methods.
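The behaviour of yield on its own can be seen in a very small sketch. The method name twice and the variable collected are hypothetical, invented purely for illustration:

```ruby
# Each call to yield runs the block passed to the method,
# handing it the given argument.
def twice
  yield(1)
  yield(2)
end

collected = []
twice { |n| collected << n * 10 }
p collected   # prints [10, 20]
```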