Generators and Coroutines¶
In purely functional programming, when we call a function it returns a value:
user=> (defn incr [n] (+ n 1))
#'user/incr
user=> (incr 5)
6
user=> (incr 2)
3
You can call a function as many times as you like, and each time the same thing happens: the function is called, it computes a value, it returns this value, and then it ends. In purely functional programming, functions have no memory of any previous function calls.
We can generalize ordinary function calling as follows. Instead of a function returning a value, we could instead allow it to report a value without returning. A function could report the result of some calculation, and then just keep going doing more calculations.
This turns out to be a very practical and useful idea. A function that behaves like this is no longer really a function, but instead is generally called a coroutine.
Generators in Python¶
Python is a language that supports coroutines, and here we’ll look at a kind
of coroutine called a generator. Generators are a convenient way to create
iterators. An iterator is an object with, at least, a function called
next()
that returns a value every time you call it. A well-designed
iterator only does work when next()
is called, so that, for instance, if
you want only, say, the first 3 values from an iterator, then you just call
next()
three times. Later values don’t need to be calculated.
For example, this generator creates an iterator that returns three different greetings:
def gen_hello():
yield 'Hi!'
yield 'Hello!'
yield "How's it going?"
You could use it like this:
>>> greet = gen_hello()
>>> greet.next()
'Hi!'
>>> greet.next()
'Hello!'
>>> greet.next()
"How's it going?"
>>> greet.next()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
Or with a for-loop like this:
>>> greet = gen_hello()
>>> for g in greet: print g
...
Hi!
Hello!
How's it going?
Or:
>>> for g in gen_hello(): print g
...
Hi!
Hello!
How's it going?
Infinite Loops in Generators¶
Infinite loops are often useful in generators, e.g:
# generates all even ints, starting at 0
def gen_evens():
e = 0
while True: # infinite loop!
yield e
e += 2
>>> evens = gen_evens()
>>> evens.next()
0
>>> evens.next()
2
>>> evens.next()
4
>>> evens.next()
6
The iterator created by the generator only runs when we call next()
, so
there is no infinite loop here — unless we call next()
an infinite
number of times, e.g.:
for e in evens: # infinite loop!
print e
A nice feature of this style of programming is that it separates a loop from
its stopping condition. gen_evens
knows how to generate even numbers, but
it doesn’t know when to stop. The code that uses gen_evens
is responsible
for deciding when to stop calling it.
If you want, say, the first 10 even numbers, then you should call
evens.next()
10 times. We could do this in general like this:
# generates, at most, the first n values that iter yields
def gen_first(iter, n):
i = 0
while i < n:
yield iter.next()
i += 1
Example: Mutli-value Linear Search¶
Here is a Python generator that returns an iterator that does linear search
for x
in some sequence. Each call to next()
on the iterator returns
the next index i
such that seq[i] == x
:
# a home-made version of Python's standard enumerate generator
# if iter returns a, b, c, ...
# counted(iter) returns (0, a), (1, b), (2, c), ...
def counted(iter):
i = 0
for x in iter:
yield i, x
i += 1
def gen_multi_search(x, seq):
for i, val in counted(seq):
if x == val:
yield i
>>> for i in gen_multi_search('a', 'abba llama'): print i
...
0
3
7
9
>>> for i in gen_multi_search('x', 'abba llama'): print i
...
>>>
The yield
statement emits the value i
when the iterators next()
is
called, and then waits there, not executing any other code, until next()
is called again.
Example: A Lexical Scanner¶
Here’s a generator that creates a tokenizer (i.e. a lexical scanner) for arithmetic expressions:
def gen_token(s):
i = 0
while i < len(s):
if s[i] in ' \n\r\t':
pass # do nothing: ignore whitespace
elif s[i] in '+-*/()':
yield s[i]
elif s[i] in '0123456789':
num = s[i]
i += 1
while i < len(s) and s[i] in '0123456789':
num += s[i]
i += 1
i -= 1
yield num
i += 1
def test():
input = ' ( 1 - 34389) + 34'
print 'input = "%s"' % input
gen = gen_token(input)
for token in gen:
print '"%s"' % token
>>> test()
input = " ( 1 - 34389) + 34"
"("
"1"
"-"
"34389"
")"
"+"
"34"
Every time next()
is called on the iterator, it runs until it hits a
yield
, and then returns the yield value. Then the iterator stops and
waits: it does nothing until next()
is called again. So if you never call
next()
again, it does no more work.
Example: Python’s itertools Library¶
itertools
is a standard collection of useful iterator-related functions.
For instance, itertools
can generate all permutations of a sequence like
this:
>>> from itertools import *
>>> for x in permutations('abc'): print x
...
('a', 'b', 'c')
('a', 'c', 'b')
('b', 'a', 'c')
('b', 'c', 'a')
('c', 'a', 'b')
('c', 'b', 'a')
Or, if you would like the Cartesian product a set with itself:
>>> for x in product('abc', repeat=2): print x
...
('a', 'a')
('a', 'b')
('a', 'c')
('b', 'a')
('b', 'b')
('b', 'c')
('c', 'a')
('c', 'b')
('c', 'c')
If you want all bit strings of length 3, you could generate them like this:
>>> for x in product('01', repeat=3): print x
...
('0', '0', '0')
('0', '0', '1')
('0', '1', '0')
('0', '1', '1')
('1', '0', '0')
('1', '0', '1')
('1', '1', '0')
('1', '1', '1')
Change the 3 to a 4, and you’ll get all bit strings of length 4:
>>> for x in product('01', repeat=4): print x
...
('0', '0', '0', '0')
('0', '0', '0', '1')
('0', '0', '1', '0')
('0', '0', '1', '1')
('0', '1', '0', '0')
('0', '1', '0', '1')
('0', '1', '1', '0')
('0', '1', '1', '1')
('1', '0', '0', '0')
('1', '0', '0', '1')
('1', '0', '1', '0')
('1', '0', '1', '1')
('1', '1', '0', '0')
('1', '1', '0', '1')
('1', '1', '1', '0')
('1', '1', '1', '1')
You can even pass in other generators to these functions. For example:
>>> for x in product(gen_hello(), repeat=2): print x
...
('Hi!', 'Hi!')
('Hi!', 'Hello!')
('Hi!', "How's it going?")
('Hello!', 'Hi!')
('Hello!', 'Hello!')
('Hello!', "How's it going?")
("How's it going?", 'Hi!')
("How's it going?", 'Hello!')
("How's it going?", "How's it going?")
Or even:
>>> for x in permutations(gen_hello()): print x
...
('Hi!', 'Hello!', "How's it going?")
('Hi!', "How's it going?", 'Hello!')
('Hello!', 'Hi!', "How's it going?")
('Hello!', "How's it going?", 'Hi!')
("How's it going?", 'Hi!', 'Hello!')
("How's it going?", 'Hello!', 'Hi!')