Exercise 7.2
(a) Adding Logging to a Module
Logging is relatively easy to add to your code. In this first part, go back to the fileparse.py library that you used to parse data files. That library contained a parse_csv() function that looked like this (note: copy this code if your previous solution was broken):
# fileparse.py
import csv

def parse_csv(filename, select=None, types=None, has_headers=True, delimiter=',', ignore_errors=False):
    '''
    Parse a CSV file into a list of records with type conversion.
    '''
    if select and not has_headers:
        raise RuntimeError('select requires column headers')

    f = open(filename)
    f_csv = csv.reader(f, delimiter=delimiter)

    # Read the file headers (if any)
    headers = next(f_csv) if has_headers else []

    # If specific columns have been selected, make indices for filtering and set output columns
    if select:
        indices = [ headers.index(colname) for colname in select ]
        output_columns = select
    else:
        indices = []
        output_columns = headers

    records = []
    for rowno, row in enumerate(f_csv, 1):
        if not row:    # Skip rows with no data
            continue

        # If specific column indices are selected, pick them out
        if indices:
            row = [ row[index] for index in indices ]

        # Apply type conversion to the row
        if types:
            try:
                row = [func(val) for func, val in zip(types, row)]
            except ValueError as e:
                if not ignore_errors:
                    print(f"Row {rowno}: Couldn't convert {row}")
                    print(f"Row {rowno}: Reason {e}")
                continue

        # Make a dictionary or a tuple
        if output_columns:
            record = dict(zip(output_columns, row))
        else:
            record = tuple(row)
        records.append(record)

    f.close()
    return records
In this implementation, ValueError exceptions are caught and warning messages are written with print(). Instead of doing that, modify the code to log bad lines of input. To do this, change your fileparse.py code so that it looks like this:
# fileparse.py
import csv

# Get a logger on which to issue diagnostics. The __name__ variable
# contains the module name--so in this case the logger will have
# the name 'fileparse'
import logging
log = logging.getLogger(__name__)

def parse_csv(filename, select=None, types=None, has_headers=True, delimiter=',', ignore_errors=False):
    '''
    Parse a CSV file into a list of records with type conversion.
    '''
    if select and not has_headers:
        raise RuntimeError('select requires column headers')

    f = open(filename)
    f_csv = csv.reader(f, delimiter=delimiter)

    # Read the file headers (if any)
    headers = next(f_csv) if has_headers else []

    # If specific columns have been selected, make indices for filtering and set output columns
    if select:
        indices = [ headers.index(colname) for colname in select ]
        output_columns = select
    else:
        indices = []
        output_columns = headers

    records = []
    for rowno, row in enumerate(f_csv, 1):
        if not row:    # Skip rows with no data
            continue

        # If specific column indices are selected, pick them out
        if indices:
            row = [ row[index] for index in indices ]

        # Apply type conversion to the row
        if types:
            try:
                row = [func(val) for func, val in zip(types, row)]
            except ValueError as e:
                if not ignore_errors:
                    log.warning("Row %d: Couldn't convert %s", rowno, row)
                    log.debug("Row %d: Reason %s", rowno, e)
                continue

        # Make a dictionary or a tuple
        if output_columns:
            record = dict(zip(output_columns, row))
        else:
            record = tuple(row)
        records.append(record)

    f.close()
    return records
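Notice that the log calls pass rowno and row as separate arguments rather than pre-formatting the message. The logging module merges the format string with its arguments lazily, only when a record is actually going to be emitted. A small self-contained sketch of the effect (the Expensive class is purely illustrative, not part of the exercise):

```python
# Demonstrates lazy %-style formatting in logging calls: the format
# string is only merged with its arguments if the record is emitted.
import logging

class Expensive:
    '''Counts how many times it gets converted to a string.'''
    def __init__(self):
        self.calls = 0
    def __str__(self):
        self.calls += 1
        return 'expensive value'

log = logging.getLogger('lazy-demo')
obj = Expensive()

log.debug('value: %s', obj)    # DEBUG is filtered out: __str__ never runs
log.warning('value: %s', obj)  # WARNING is emitted: __str__ runs once

print(obj.calls)  # 1
```

This is why `log.warning("Row %d: ...", rowno, row)` is preferred over building the string yourself: if the message is filtered out, no formatting work happens at all.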
Now that you’ve made these changes, try using your module interactively.
>>> import fileparse
>>> a = fileparse.parse_csv('Data/missing.csv', types=[str,int,float])
Row 4: Couldn't convert ['MSFT', '', '51.23']
Row 7: Couldn't convert ['IBM', '', '70.44']
>>>
The bare, unformatted messages mean that warnings were issued, but the logging module itself wasn't configured--the output came from logging's handler of last resort. Type these steps to configure logging and see properly formatted warning messages:
>>> import logging
>>> logging.basicConfig()
>>> a = fileparse.parse_csv('Data/missing.csv', types=[str,int,float])
WARNING:fileparse:Row 4: Couldn't convert ['MSFT', '', '51.23']
WARNING:fileparse:Row 7: Couldn't convert ['IBM', '', '70.44']
>>>
You will notice that you don’t see the output from the log.debug() operation. By default, logging only outputs messages that have a level of WARNING or higher. Type this to change the level:
>>> logging.getLogger('fileparse').setLevel(logging.DEBUG)
>>> a = fileparse.parse_csv('Data/missing.csv', types=[str,int,float])
WARNING:fileparse:Row 4: Couldn't convert ['MSFT', '', '51.23']
DEBUG:fileparse:Row 4: Reason invalid literal for int() with base 10: ''
WARNING:fileparse:Row 7: Couldn't convert ['IBM', '', '70.44']
DEBUG:fileparse:Row 7: Reason invalid literal for int() with base 10: ''
>>>
Turn off all but the most critical logging messages:
>>> logging.getLogger('fileparse').setLevel(logging.CRITICAL)
>>> a = fileparse.parse_csv('Data/missing.csv', types=[str,int,float])
>>>
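The level filtering you just observed can also be seen in a fully self-contained example. This sketch uses a small custom handler that collects formatted messages in a list so the effect is easy to inspect (ListHandler is illustrative, not part of the standard library):

```python
# A minimal demonstration of how logger levels filter messages.
import logging

class ListHandler(logging.Handler):
    '''Collect formatted log messages in a list (for demonstration only).'''
    def __init__(self):
        super().__init__()
        self.messages = []
    def emit(self, record):
        self.messages.append(self.format(record))

log = logging.getLogger('demo')
handler = ListHandler()
log.addHandler(handler)

log.warning('shown')      # WARNING passes the default WARNING threshold
log.debug('hidden')       # DEBUG is below the threshold, so it is dropped

log.setLevel(logging.DEBUG)
log.debug('now shown')    # after lowering the level, DEBUG gets through

print(handler.messages)   # ['shown', 'now shown']
```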
(b) Adding Logging to a Program
To add logging to an application, you need to have some mechanism to initialize the logging module in the main module. A simple way to do this is to include some setup code that looks like this:
# This file simply sets up basic configuration of the logging module.
# Change settings here to adjust logging output as needed.
import logging
logging.basicConfig(
    filename = 'report.log',       # Name of the log file (omit to use stderr)
    filemode = 'w',                # File mode (use 'a' to append)
    level    = logging.WARNING,    # Logging level (DEBUG, INFO, WARNING, ERROR, or CRITICAL)
)
Make this modification to report.py and have it write to a logfile called report.log. Experiment with this modified version. For example, you should be able to run the program like this:
bash % python report.py Data/missing.csv Data/prices.csv
Name Shares Price Change
---------- ---------- ---------- ----------
AA 100 9.22 -22.98
IBM 50 106.28 15.18
CAT 150 35.46 -47.98
GE 95 13.48 -26.89
MSFT 50 20.89 -44.21
bash % cat report.log
WARNING:fileparse:Row 4: Couldn't convert ['MSFT', '', '51.23']
WARNING:fileparse:Row 7: Couldn't convert ['IBM', '', '70.44']
bash %
Commentary:
The logging module has a large number of advanced features and configuration options. More information can be found in practical-python/Optional/Logging.pdf.
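As one example of those options, loggers can also be configured by hand with handlers and formatters instead of basicConfig(), which gives finer control over where messages go and how they look. A sketch under those standard APIs (the logger name and format string are arbitrary choices):

```python
# Manual configuration: attach a handler with its own level and format.
import logging

log = logging.getLogger('fileparse')
log.setLevel(logging.DEBUG)              # the logger lets everything through

handler = logging.StreamHandler()        # console output (stderr)
handler.setLevel(logging.WARNING)        # ...but this handler shows warnings only
handler.setFormatter(
    logging.Formatter('%(asctime)s %(name)s %(levelname)s: %(message)s')
)
log.addHandler(handler)

log.warning('shown on the console, with a timestamp')
log.debug('suppressed by the handler level')
```

Because levels exist on both loggers and handlers, you can route the same records to several destinations with different thresholds--for example, everything to a file but only warnings to the console.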