Exercise 8.2
(a) A Simple Generator
If you ever find yourself wanting to customize iteration, you should
always think of generator functions. They’re easy to write: simply make
a function that carries out the desired iteration logic and uses yield
to emit values.
For example, try this generator that searches a file for lines containing a matching substring:
>>> def filematch(filename, substr):
        f = open(filename, 'r')
        for line in f:
            if substr in line:
                yield line
        f.close()
>>> for line in open('Data/portfolio.csv'):
        print line,
name,shares,price
"AA",100,32.20
"IBM",50,91.10
"CAT",150,83.44
"MSFT",200,51.23
"GE",95,40.37
"MSFT",50,65.10
"IBM",100,70.44
>>> for line in filematch('Data/portfolio.csv', 'IBM'):
        print line,
"IBM",50,91.10
"IBM",100,70.44
>>>
This is kind of interesting: you can hide a bunch of custom processing in a function and use it to feed a simple for-loop. The next example looks at a more unusual case.
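One detail of filematch() worth noticing: the f.close() call only runs if the for-loop runs to completion. If the caller stops iterating early, the file stays open until the generator is garbage collected. A possible variant, sketched below, uses a with block so the file is closed either way. The temporary file and its three sample rows are assumptions standing in for Data/portfolio.csv so that the snippet is self-contained.

```python
import os
import tempfile

def filematch(filename, substr):
    # 'with' closes the file when the generator is exhausted or closed early
    with open(filename, 'r') as f:
        for line in f:
            if substr in line:
                yield line

# Hypothetical sample data standing in for Data/portfolio.csv
sample = '"AA",100,32.20\n"IBM",50,91.10\n"IBM",100,70.44\n'
fd, path = tempfile.mkstemp(suffix='.csv')
with os.fdopen(fd, 'w') as out:
    out.write(sample)

# list() drives the generator to completion, which also closes the file
matches = list(filematch(path, 'IBM'))
os.remove(path)
```

Here matches holds only the two IBM rows, each still ending in its newline.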
(b) Monitoring a streaming data source
Generators can be an interesting way to monitor real-time data sources such as log files or stock market feeds. In this part, we’ll explore this idea. To start, follow the next instructions carefully.
The program Data/stocksim.py simulates stock market data. As output,
the program constantly writes real-time data to a file stocklog.csv.
In a command window (not IDLE), go into the Data/ directory and run
this program:
% python stocksim.py
If you are on Windows, just locate the stocksim.py program and
double-click on it to run it. Now, forget about this program (just
let it run). Using another window, look at the file Data/stocklog.csv
being written by the simulator. You should see new lines of text being
added to the file every few seconds. Again, just let this program run
in the background. It will run for several hours (you shouldn’t need
to worry about it).
Once the above program is running, let’s write a little program to
open the file, seek to the end, and watch for new output. Create a
file follow.py and put this code in it:
# follow.py
import os
import time

f = open('Data/stocklog.csv')
f.seek(0, os.SEEK_END)   # Move file pointer 0 bytes from end of file

while True:
    line = f.readline()
    if line == '':
        time.sleep(0.1)   # Sleep briefly and retry
        continue
    fields = line.split(',')
    name = fields[0].strip('"')
    price = float(fields[1])
    change = float(fields[4])
    if change < 0:
        print '%10s %10.2f %10.2f' % (name, price, change)
If you run the program, you’ll see a real-time stock ticker. Under the covers,
this code is kind of like the Unix tail -f command that’s used to watch a log file.
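The two mechanics the program relies on can be checked in isolation: after seeking to the end, readline() returns an empty string until something new is appended, and then it returns the new line. The sketch below is only an illustration. The temporary file, the second file handle playing the role of the simulator, and the sample rows are all assumptions, not part of stocksim.py.

```python
import os
import tempfile

# Create a file that already contains one line before we start watching
fd, path = tempfile.mkstemp(suffix='.csv')
with os.fdopen(fd, 'w') as out:
    out.write('"OLD",1.00\n')

reader = open(path)
reader.seek(0, os.SEEK_END)      # skip everything already in the file
first = reader.readline()        # nothing new yet, so this is ''

# A second handle appends a line, as the simulator would
with open(path, 'a') as writer:
    writer.write('"IBM",91.10\n')

second = reader.readline()       # the appended line is now visible
reader.close()
os.remove(path)
```

This is why follow.py sleeps and retries on an empty string instead of treating it as the end of the data.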
(c) Using a generator to produce data
If you look at the code in part (b), the first part of the code is producing
lines of data whereas the statements at the end of the while loop are
consuming the data. A major feature of generator functions is that you can
move all of the data production code into a reusable function.
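As a rough illustration of that split, the sketch below puts the producing side in a generator and leaves the consuming side as a plain for-loop. The ticker_lines() name and the sample rows are hypothetical. In particular, the date and time fields are an assumed layout, since the code in part (b) only tells us that field 0 is the name, field 1 the price, and field 4 the change.

```python
def ticker_lines(rows):
    # Producer: a hypothetical stand-in for a live feed; it only emits raw lines
    for row in rows:
        yield row

# Assumed sample rows (only fields 0, 1 and 4 are used by the consumer)
sample = [
    '"IBM",91.10,"6/11/2007","09:30.00",-0.23\n',
    '"AA",32.20,"6/11/2007","09:30.01",0.15\n',
]

# Consumer: same parsing logic as the stock ticker, unaware of where lines come from
losers = []
for line in ticker_lines(sample):
    fields = line.split(',')
    name = fields[0].strip('"')
    price = float(fields[1])
    change = float(fields[4])
    if change < 0:
        losers.append((name, price, change))
```

Because the loop depends only on receiving lines, the producer can later be swapped for one that reads from a growing file without touching the parsing code.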
Modify the code in part (b) so that the file-reading is performed by
a generator function follow(filename). Make it so the following code
works:
>>> for line in follow('Data/stocklog.csv'):
        print line,
... Should see lines of output produced here ...
Modify the stock ticker code so that it looks like this:
if __name__ == '__main__':
    for line in follow('Data/stocklog.csv'):
        fields = line.split(',')
        name = fields[0].strip('"')
        price = float(fields[1])
        change = float(fields[4])
        if change < 0:
            print '%10s %10.2f %10.2f' % (name, price, change)