Exercise 7.2

Objectives:

  • Learn how to use the logging module to record diagnostics

Files Modified:

  • fileparse.py

  • report.py

(a) Adding logging to a module

Logging is relatively easy to add to your code. In this first part, go back to the fileparse.py library that you used to parse data files. That library contains a parse_csv() function that looks like this (note: copy this code if your previous solution was broken):

# fileparse.py
import csv

def parse_csv(filename, select=None, types=None, has_headers=True, delimiter=',', ignore_errors=False):
    '''
    Parse a CSV file into a list of records with type conversion.
    '''
    if select and not has_headers:
        raise RuntimeError('select requires column headers')

    f = open(filename)
    f_csv = csv.reader(f, delimiter=delimiter)

    # Read the file headers (if any)
    headers = next(f_csv) if has_headers else []

    # If specific columns have been selected, make indices for filtering and set output columns
    if select:
        indices = [ headers.index(colname) for colname in select ]
        output_columns = select
    else:
        indices = []
        output_columns = headers

    records = []
    for rowno, row in enumerate(f_csv, 1):
        if not row:     # Skip rows with no data
            continue

        # If specific column indices are selected, pick them out
        if indices:
            row = [ row[index] for index in indices]

        # Apply type conversion to the row
        if types:
            try:
                row = [func(val) for func, val in zip(types, row)]
            except ValueError as e:
                if not ignore_errors:
                    print "Row %d: Couldn't convert %s" % (rowno, row)
                    print "Row %d: Reason %s" % (rowno, e)
                continue

        # Make a dictionary or a tuple
        if output_columns:
            record = dict(zip(output_columns, row))
        else:
            record = tuple(row)
        records.append(record)

    f.close()
    return records

In this implementation, ValueError exceptions are caught and warning messages are printed using print(). Instead of doing that, modify the code to log bad lines of input. To do this, change your fileparse.py code so that it looks like this:

# fileparse.py
import csv

# Get a logger on which to issue diagnostics.  The __name__ variable
# contains the module name--so in this case the logger should have
# the name 'fileparse'

import logging
log = logging.getLogger(__name__)

def parse_csv(filename, select=None, types=None, has_headers=True, delimiter=',', ignore_errors=False):
    '''
    Parse a CSV file into a list of records with type conversion.
    '''
    if select and not has_headers:
        raise RuntimeError('select requires column headers')

    f = open(filename)
    f_csv = csv.reader(f, delimiter=delimiter)

    # Read the file headers (if any)
    headers = next(f_csv) if has_headers else []

    # If specific columns have been selected, make indices for filtering and set output columns
    if select:
        indices = [ headers.index(colname) for colname in select ]
        output_columns = select
    else:
        indices = []
        output_columns = headers

    records = []
    for rowno, row in enumerate(f_csv, 1):
        if not row:     # Skip rows with no data
            continue

        # If specific column indices are selected, pick them out
        if indices:
            row = [ row[index] for index in indices]

        # Apply type conversion to the row
        if types:
            try:
                row = [func(val) for func, val in zip(types, row)]
            except ValueError as e:
                if not ignore_errors:
                    log.warning("Row %d: Couldn't convert %s", rowno, row)
                    log.debug("Row %d: Reason %s", rowno, e)
                continue

        # Make a dictionary or a tuple
        if output_columns:
            record = dict(zip(output_columns, row))
        else:
            record = tuple(row)
        records.append(record)

    f.close()
    return records
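
Because the logger is created with logging.getLogger(__name__), it picks up the module name 'fileparse'. One consequence is that other code can adjust this module's logging output without touching fileparse.py at all. Here is a small sketch of that idea:

import logging
import fileparse

logging.basicConfig(level=logging.WARNING)               # global default for all loggers
logging.getLogger('fileparse').setLevel(logging.DEBUG)   # extra detail from fileparse only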

Now that you’ve made these changes, try using your module interactively.

>>> import fileparse
>>> a = fileparse.parse_csv('Data/missing.csv', types=[str,int,float])
Row 4: Couldn't convert ['MSFT', '', '51.23']
Row 7: Couldn't convert ['IBM', '', '70.44']
>>>

Notice that only the bare WARNING-level messages appear, with no logger name or level attached. That's because the logging module hasn't been configured yet, so messages of level WARNING and above fall through to a last-resort handler that writes just the message text to standard error. Type these steps to configure logging and see more informative warning messages:

>>> import logging
>>> logging.basicConfig()
>>> a = fileparse.parse_csv('Data/missing.csv', types=[str,int,float])
WARNING:fileparse:Row 4: Couldn't convert ['MSFT', '', '51.23']
WARNING:fileparse:Row 7: Couldn't convert ['IBM', '', '70.44']
>>>

You will notice that you don’t see the output from the log.debug() operation. By default, logging only outputs messages that have a level of WARNING or higher. Type this to change the level.

>>> logging.getLogger('fileparse').level = logging.DEBUG
>>> a = fileparse.parse_csv('Data/missing.csv', types=[str,int,float])
WARNING:fileparse:Row 4: Couldn't convert ['MSFT', '', '51.23']
DEBUG:fileparse:Row 4: Reason invalid literal for int() with base 10: ''
WARNING:fileparse:Row 7: Couldn't convert ['IBM', '', '70.44']
DEBUG:fileparse:Row 7: Reason invalid literal for int() with base 10: ''
>>>

Turn off all but the most critical logging messages:

>>> logging.getLogger('fileparse').level = logging.CRITICAL
>>> a = fileparse.parse_csv('Data/missing.csv', types=[str,int,float])
>>>
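
Assigning to the level attribute as shown above works, but the same thing is more often written with the logger's setLevel() method:

>>> logging.getLogger('fileparse').setLevel(logging.CRITICAL)
>>>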

(b) Adding Logging to a Program

To add logging to an application, you need to have some mechanism to initialize the logging module in the main module. A simple way to do this is to include some setup code that looks like this:

# This file simply sets up basic configuration of the logging module.
# Change settings here to adjust logging output as needed.
import logging
logging.basicConfig(
    filename = 'report.log',      # Name of the log file (omit to use stderr)
    filemode = 'w',               # File mode (use 'a' to append)
    level    = logging.WARNING,   # Logging level (DEBUG, INFO, WARNING, ERROR, or CRITICAL)
)
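
If it helps, here is one way this setup code might sit inside report.py itself. This is a sketch that assumes your report.py defines a main(argv) function as in earlier exercises:

# report.py
...

if __name__ == '__main__':
    import logging
    import sys
    logging.basicConfig(
        filename = 'report.log',      # Name of the log file
        filemode = 'w',               # Overwrite the log on each run
        level    = logging.WARNING,   # Log warnings and above
    )
    main(sys.argv)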

Make this modification to report.py and have it write to a logfile called report.log. Experiment with this modified version. For example, you should be able to run the program like this:

bash % python report.py Data/missing.csv Data/prices.csv
      Name     Shares      Price     Change
---------- ---------- ---------- ----------
        AA        100       9.22     -22.98
       IBM         50     106.28      15.18
       CAT        150      35.46     -47.98
        GE         95      13.48     -26.89
      MSFT         50      20.89     -44.21
bash % cat report.log
WARNING:fileparse:Row 4: Couldn't convert ['MSFT', '', '51.23']
WARNING:fileparse:Row 7: Couldn't convert ['IBM', '', '70.44']
bash %

Commentary:

The logging module has a large number of advanced features and configuration options. More information can be found in practical-python/Optional/Logging.pdf.
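
For example, basicConfig() accepts a format string that controls the layout of each log message. A common tweak is to add a timestamp and the logger name (this is only one of many possible settings):

import logging
logging.basicConfig(
    filename = 'report.log',
    filemode = 'w',
    level    = logging.WARNING,
    format   = '%(asctime)s %(levelname)s:%(name)s:%(message)s',
)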
