Exercise 3.3
(a) Raising exceptions
The parse_csv() function you wrote in the last section allows
user-specified columns to be selected, but that only works if the
input data file has column headers. Modify the code so that an
exception gets raised if both the select and has_headers=False
arguments are passed. For example:
>>> parse_csv('Data/prices.csv', select=['name','price'], has_headers=False)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "fileparse.py", line 9, in parse_csv
    raise RuntimeError("select argument requires column headers")
RuntimeError: select argument requires column headers
>>>
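One way to approach part (a) is to validate the argument combination at the very top of the function, before the file is opened. The sketch below is not the reference solution: the simplified signature and the body after the check are assumptions about what your earlier parse_csv() looks like, so adapt the guard to your own code.

```python
import csv

def parse_csv(filename, select=None, types=None, has_headers=True):
    """Parse a CSV file into a list of records (simplified, assumed signature)."""
    # Reject the inconsistent argument combination up front,
    # before any file is opened or any data is read.
    if select and not has_headers:
        raise RuntimeError("select argument requires column headers")

    with open(filename, newline="") as f:
        rows = csv.reader(f)
        headers = next(rows) if has_headers else []
        if select:
            # Map the selected column names to their positions.
            indices = [headers.index(name) for name in select]
            headers = select
        records = []
        for row in rows:
            if not row:          # skip blank lines
                continue
            if select:
                row = [row[i] for i in indices]
            if types:
                row = [func(val) for func, val in zip(types, row)]
            # Dicts when headers are available, tuples otherwise.
            records.append(dict(zip(headers, row)) if headers else tuple(row))
    return records
```

Because the check runs first, the error is reported even if the file name is wrong, which keeps the failure about the arguments rather than about the data.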
(b) Catching exceptions
The parse_csv() function you wrote is used to process the entire
contents of a file. However, in the real world, input files might
contain corrupted, missing, or dirty data. Try this experiment:
>>> portfolio = parse_csv('Data/missing.csv', types=[str, int, float])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "fileparse.py", line 36, in parse_csv
    row = [func(val) for func, val in zip(types, row)]
ValueError: invalid literal for int() with base 10: ''
>>>
Modify the parse_csv() function to catch all ValueError exceptions
generated during record creation and print a warning message for rows
that can't be converted. The message should include the row number and
the reason the conversion failed.
To test your function, try reading the file Data/missing.csv again.
For example:
>>> portfolio = parse_csv('Data/missing.csv', types=[str, int, float])
Row 4: Couldn't convert ['MSFT', '', '51.23']
Row 4: Reason invalid literal for int() with base 10: ''
Row 7: Couldn't convert ['IBM', '', '70.44']
Row 7: Reason invalid literal for int() with base 10: ''
>>>
>>> portfolio
[{'price': 32.2, 'name': 'AA', 'shares': 100}, {'price': 91.1, 'name': 'IBM', 'shares': 50}, {'price': 83.44, 'name': 'CAT', 'shares': 150}, {'price': 40.37, 'name': 'GE', 'shares': 95}, {'price': 65.1, 'name': 'MSFT', 'shares': 50}]
>>>
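The core of part (b) is a try/except around the per-row type conversion, combined with enumerate() to track row numbers. The sketch below isolates that step in a helper called convert_rows(); the helper name and its parameters are hypothetical and not part of fileparse.py. It assumes data rows are numbered from 1 with the header row not counted.

```python
def convert_rows(rows, headers, types):
    """Convert raw string rows into dicts, warning about rows that fail.

    convert_rows is a hypothetical helper used only for illustration.
    """
    records = []
    # Number data rows from 1 (assumption: the header row is not counted).
    for rowno, row in enumerate(rows, start=1):
        try:
            converted = [func(val) for func, val in zip(types, row)]
        except ValueError as e:
            # Report the offending row and the reason, then move on.
            print(f"Row {rowno}: Couldn't convert {row}")
            print(f"Row {rowno}: Reason {e}")
            continue
        records.append(dict(zip(headers, converted)))
    return records
```

Catching only ValueError, rather than every exception, means unrelated bugs (a misspelled name, a wrong argument type) still surface as tracebacks instead of being silently reported as bad rows.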
(c) Silencing errors
Modify the parse_csv() function so that parsing error messages can
be silenced if desired by the user. For example:
>>> portfolio = parse_csv('Data/missing.csv', types=[str,int,float], silence_errors=True)
>>> portfolio
[{'price': 32.2, 'name': 'AA', 'shares': 100}, {'price': 91.1, 'name': 'IBM', 'shares': 50}, {'price': 83.44, 'name': 'CAT', 'shares': 150}, {'price': 40.37, 'name': 'GE', 'shares': 95}, {'price': 65.1, 'name': 'MSFT', 'shares': 50}]
>>>
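For part (c), one option is a keyword argument that defaults to False and gates only the print statements, so that bad rows are skipped either way. The sketch below is a simplified version under stated assumptions: the select handling is omitted for brevity, and the signature is a guess at what your own parse_csv() accepts.

```python
import csv

def parse_csv(filename, types=None, has_headers=True, silence_errors=False):
    """Simplified sketch of parse_csv(); column selection is omitted."""
    with open(filename, newline="") as f:
        rows = csv.reader(f)
        headers = next(rows) if has_headers else []
        records = []
        # Number data rows from 1 (assumption: the header is not counted).
        for rowno, row in enumerate(rows, start=1):
            if not row:          # skip blank lines
                continue
            if types:
                try:
                    row = [func(val) for func, val in zip(types, row)]
                except ValueError as e:
                    # Bad rows are always skipped; the flag only
                    # controls whether the problem is reported.
                    if not silence_errors:
                        print(f"Row {rowno}: Couldn't convert {row}")
                        print(f"Row {rowno}: Reason {e}")
                    continue
            records.append(dict(zip(headers, row)) if headers else tuple(row))
    return records
```

Defaulting silence_errors to False keeps the warnings visible unless the caller explicitly opts out, which is usually the safer choice for data-cleaning code.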