Generators and Coroutines ========================= In purely functional programming, when we call a function it returns a value:: user=> (defn incr [n] (+ n 1)) #'user/incr user=> (incr 5) 6 user=> (incr 2) 3 You can call a function as many times as you like, and each time the same thing happens: the function is called, it computes a value, it returns this value, and then it ends. In purely functional programming, functions have no memory of any previous function calls. We can generalize ordinary function calling as follows. Instead of a function *returning* a value, we could instead allow it to *report* a value without returning. A function could report the result of some calculation, and then just keep going doing more calculations. This turns out to be a very practical and useful idea. A function that behaves like this is no longer really a function, but instead is generally called a **coroutine**. Generators in Python -------------------- Python_ is a language that supports coroutines, and here we'll look at a kind of coroutine called a **generator**. Generators are a convenient way to create iterators. An **iterator** is an object with, at least, a function called ``next()`` that returns a value every time you call it. A well-designed iterator only does work when ``next()`` is called, so that, for instance, if you want only, say, the first 3 values from an iterator, then you just call ``next()`` three times. Later values don't need to be calculated. For example, this generator creates an iterator that returns three different greetings:: def gen_hello(): yield 'Hi!' yield 'Hello!' yield "How's it going?" You could use it like this:: >>> greet = gen_hello() >>> greet.next() 'Hi!' >>> greet.next() 'Hello!' >>> greet.next() "How's it going?" >>> greet.next() Traceback (most recent call last): File "", line 1, in StopIteration Or with a for-loop like this:: >>> greet = gen_hello() >>> for g in greet: print g ... Hi! Hello! How's it going? Or:: >>> for g in gen_hello(): print g ... Hi! Hello! How's it going? Infinite Loops in Generators ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Infinite loops are often useful in generators, e.g:: # generates all even ints, starting at 0 def gen_evens(): e = 0 while True: # infinite loop! yield e e += 2 >>> evens = gen_evens() >>> evens.next() 0 >>> evens.next() 2 >>> evens.next() 4 >>> evens.next() 6 The iterator created by the generator only runs when we call ``next()``, so there is no infinite loop here --- unless we call ``next()`` an infinite number of times, e.g.:: for e in evens: # infinite loop! print e A nice feature of this style of programming is that it separates a loop from its stopping condition. ``gen_evens`` knows how to generate even numbers, but it doesn't know when to stop. The code that uses ``gen_evens`` is responsible for deciding when to stop calling it. If you want, say, the first 10 even numbers, then you should call ``evens.next()`` 10 times. We could do this in general like this:: # generates, at most, the first n values that iter yields def gen_first(iter, n): i = 0 while i < n: yield iter.next() i += 1 Example: Mutli-value Linear Search ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Here is a Python_ generator that returns an iterator that does linear search for ``x`` in some sequence. Each call to ``next()`` on the iterator returns the next index ``i`` such that ``seq[i] == x``:: # a home-made version of Python's standard enumerate generator # if iter returns a, b, c, ... # counted(iter) returns (0, a), (1, b), (2, c), ... def counted(iter): i = 0 for x in iter: yield i, x i += 1 def gen_multi_search(x, seq): for i, val in counted(seq): if x == val: yield i >>> for i in gen_multi_search('a', 'abba llama'): print i ... 0 3 7 9 >>> for i in gen_multi_search('x', 'abba llama'): print i ... >>> The ``yield`` statement emits the value ``i`` when the iterators ``next()`` is called, and then waits there, not executing any other code, until ``next()`` is called again. Example: A Lexical Scanner ~~~~~~~~~~~~~~~~~~~~~~~~~~ Here's a generator that creates a tokenizer (i.e. a lexical scanner) for arithmetic expressions:: def gen_token(s): i = 0 while i < len(s): if s[i] in ' \n\r\t': pass # do nothing: ignore whitespace elif s[i] in '+-*/()': yield s[i] elif s[i] in '0123456789': num = s[i] i += 1 while i < len(s) and s[i] in '0123456789': num += s[i] i += 1 i -= 1 yield num i += 1 def test(): input = ' ( 1 - 34389) + 34' print 'input = "%s"' % input gen = gen_token(input) for token in gen: print '"%s"' % token >>> test() input = " ( 1 - 34389) + 34" "(" "1" "-" "34389" ")" "+" "34" Every time ``next()`` is called on the iterator, it runs until it hits a ``yield``, and then returns the yield value. Then the iterator stops and waits: it does nothing until ``next()`` is called again. So if you never call ``next()`` again, it does no more work. Example: Python's itertools Library ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ``itertools`` is a standard collection of useful iterator-related functions. For instance, ``itertools`` can generate all permutations of a sequence like this:: >>> from itertools import * >>> for x in permutations('abc'): print x ... ('a', 'b', 'c') ('a', 'c', 'b') ('b', 'a', 'c') ('b', 'c', 'a') ('c', 'a', 'b') ('c', 'b', 'a') Or, if you would like the `Cartesian product `_ a set with itself:: >>> for x in product('abc', repeat=2): print x ... ('a', 'a') ('a', 'b') ('a', 'c') ('b', 'a') ('b', 'b') ('b', 'c') ('c', 'a') ('c', 'b') ('c', 'c') If you want all bit strings of length 3, you could generate them like this:: >>> for x in product('01', repeat=3): print x ... ('0', '0', '0') ('0', '0', '1') ('0', '1', '0') ('0', '1', '1') ('1', '0', '0') ('1', '0', '1') ('1', '1', '0') ('1', '1', '1') Change the 3 to a 4, and you'll get all bit strings of length 4:: >>> for x in product('01', repeat=4): print x ... ('0', '0', '0', '0') ('0', '0', '0', '1') ('0', '0', '1', '0') ('0', '0', '1', '1') ('0', '1', '0', '0') ('0', '1', '0', '1') ('0', '1', '1', '0') ('0', '1', '1', '1') ('1', '0', '0', '0') ('1', '0', '0', '1') ('1', '0', '1', '0') ('1', '0', '1', '1') ('1', '1', '0', '0') ('1', '1', '0', '1') ('1', '1', '1', '0') ('1', '1', '1', '1') You can even pass in other generators to these functions. For example:: >>> for x in product(gen_hello(), repeat=2): print x ... ('Hi!', 'Hi!') ('Hi!', 'Hello!') ('Hi!', "How's it going?") ('Hello!', 'Hi!') ('Hello!', 'Hello!') ('Hello!', "How's it going?") ("How's it going?", 'Hi!') ("How's it going?", 'Hello!') ("How's it going?", "How's it going?") Or even:: >>> for x in permutations(gen_hello()): print x ... ('Hi!', 'Hello!', "How's it going?") ('Hi!', "How's it going?", 'Hello!') ('Hello!', 'Hi!', "How's it going?") ('Hello!', "How's it going?", 'Hi!') ("How's it going?", 'Hi!', 'Hello!') ("How's it going?", 'Hello!', 'Hi!')