Exercise 2.2

Objectives:

  • How to read data from a file into memory.

  • Using lists and tuples to represent the contents of a column-oriented data file as a matrix (or 2-D array) of values, similar to how data is organized in a spreadsheet.

  • Using lists and dictionaries to represent the contents of a column-oriented data file as a list of rows with named fields—similar to how you might work with data stored in a database.

  • Using a dictionary to hold data where you want to perform fast random lookups.

Files Created: report.py

(a) A List of Tuples

The file Data/portfolio.csv contains a list of stocks in a portfolio. In Exercise 1.7, you wrote a function portfolio_cost(filename) that read this file and performed a simple calculation. Your code should have looked something like this:

# pcost.py

import csv
def portfolio_cost(filename):
    '''Computes the total cost (shares*price) of a portfolio file'''
    total_cost = 0.0

    f = open(filename)
    f_csv = csv.reader(f)
    headers = next(f_csv)
    for row in f_csv:
        nshares = int(row[1])
        price = float(row[2])
        total_cost += nshares * price
    f.close()
    return total_cost

Using this code as a rough guide, create a new file report.py. In that file, define a function read_portfolio(filename) that opens a given portfolio file and reads it into a list of tuples.

To do this, you’re going to make a few minor modifications to the above code. First, instead of defining total_cost = 0.0, you’ll make a variable that’s initially set to an empty list. For example:

portfolio = []

Next, instead of totaling up the cost, you’ll simply turn each row into a tuple exactly as you just did in Exercise 2.1 and append it to this list. For example:

for row in f_csv:
    holding = (row[0], int(row[1]), float(row[2]))
    portfolio.append(holding)

Finally, you’ll return the resulting portfolio list.
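Put together, the steps above might look like the following sketch. Treat it as one possible solution rather than the definitive one. It assumes the file starts with a header row followed by name, shares, and price columns. The with statement (used here in place of explicit open()/close() calls) and the small temporary sample file are additions for illustration only; the sample is not the real Data/portfolio.csv.

```python
# report.py (sketch)
import csv
import os
import tempfile

def read_portfolio(filename):
    '''Reads a portfolio file into a list of (name, shares, price) tuples'''
    portfolio = []
    with open(filename) as f:
        f_csv = csv.reader(f)
        headers = next(f_csv)              # Skip the header row
        for row in f_csv:
            holding = (row[0], int(row[1]), float(row[2]))
            portfolio.append(holding)
    return portfolio

# Illustration only: a made-up two-row file holding the rows shown in this exercise
sample = 'name,shares,price\n"AA",100,32.20\n"IBM",50,91.10\n'
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False) as tmp:
    tmp.write(sample)

sample_portfolio = read_portfolio(tmp.name)
os.remove(tmp.name)                        # Clean up the temporary file
print(sample_portfolio)
```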

Experiment with your function interactively (just a reminder that in order to do this, you first have to run the report.py program in the interpreter):

>>> portfolio = read_portfolio('Data/portfolio.csv')
>>> portfolio
... look at the output ...
>>>
>>> portfolio[0]
('AA', 100, 32.2)
>>> portfolio[1]
('IBM', 50, 91.1)
>>> portfolio[1][1]
50
>>> total = 0.0
>>> for s in portfolio:
        total += s[1] * s[2]

>>> print(total)
44671.15
>>>

This list of tuples that you have created is very similar to a 2-D array. For example, you can access a specific column and row using a lookup such as portfolio[row][column] where row and column are integers.

That said, you can also rewrite the last for-loop using a statement like this:

>>> total = 0.0
>>> for name, shares, price in portfolio:
        total += shares*price

>>> print(total)
44671.15
>>>
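The two access styles discussed in this part, indexing with portfolio[row][column] and unpacking each tuple into named variables, can be compared side by side. The short sketch below uses only the two rows shown in the session above; the variable names ibm_price, shares_column, and names are made up for illustration.

```python
# Two rows copied from the interactive session in this exercise
portfolio = [('AA', 100, 32.2), ('IBM', 50, 91.1)]

# Index style: portfolio[row][column], as with a 2-D array
row, column = 1, 2
ibm_price = portfolio[row][column]

# A "column" is not stored anywhere; it has to be collected row by row
shares_column = [holding[1] for holding in portfolio]

# Unpacking style: each tuple is split into named variables
names = []
for name, shares, price in portfolio:
    names.append(name)

print(ibm_price)
print(shares_column)
print(names)
```

The unpacking form only works when every tuple has exactly three items, which is the case for the list built in this part.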

(b) A List of Dictionaries

Take the function you wrote in part (a) and modify it to represent each stock in the portfolio with a dictionary instead of a tuple. In this dictionary, use the field names "name", "shares", and "price" to represent the different columns in the input file.

Experiment with this new function in the same manner as you did in part (a).

>>> portfolio = read_portfolio('Data/portfolio.csv')
>>> portfolio
... look at the output ...
>>> portfolio[0]
{'price': 32.2, 'name': 'AA', 'shares': 100}
>>> portfolio[1]
{'price': 91.1, 'name': 'IBM', 'shares': 50}
>>> portfolio[1]['shares']
50
>>> total = 0.0
>>> for s in portfolio:
        total += s['shares']*s['price']

>>> print(total)
44671.15
>>>

Here, you will notice that the different fields for each entry are accessed by key name instead of by numeric column index. This is often preferred because the resulting code is easier to read later.
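A minimal sketch of the dictionary-based read_portfolio() described in this part is shown below. It is one way to write it, not the only one, and it makes the same assumptions as before: a header row followed by name, shares, and price columns. The temporary sample file is made up for illustration and is not the real Data/portfolio.csv.

```python
# report.py (sketch, dictionary version)
import csv
import os
import tempfile

def read_portfolio(filename):
    '''Reads a portfolio file into a list of dictionaries'''
    portfolio = []
    with open(filename) as f:
        f_csv = csv.reader(f)
        headers = next(f_csv)              # Skip the header row
        for row in f_csv:
            holding = {
                'name': row[0],
                'shares': int(row[1]),
                'price': float(row[2]),
            }
            portfolio.append(holding)
    return portfolio

# Illustration only: a made-up two-row file holding the rows shown in this exercise
sample = 'name,shares,price\n"AA",100,32.20\n"IBM",50,91.10\n'
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False) as tmp:
    tmp.write(sample)

sample_portfolio = read_portfolio(tmp.name)
os.remove(tmp.name)                        # Clean up the temporary file
print(sample_portfolio[1]['shares'])
```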

Note

Viewing large dictionaries and lists can be messy. To clean up the output for debugging, consider using the pprint function. For example:

>>> from pprint import pprint
>>> pprint(portfolio)
... look at the output ...
>>>

(c) Dictionaries as a container

A dictionary is a useful way to keep track of items where you want to look up items using an index other than an integer. In the Python shell, try playing with a dictionary:

>>> prices = { }
>>> prices['IBM'] = 92.45
>>> prices['MSFT'] = 45.12
>>> prices
... look at the result ...
>>> prices['IBM']
92.45
>>> prices['AAPL']
... look at the result ...
>>> 'AAPL' in prices
False
>>>
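When a key might be missing, there are a few common ways to handle the lookup. The sketch below shows three of them using the same prices dictionary as the session above; the 0.0 default passed to get() is an arbitrary choice made for illustration.

```python
prices = {}
prices['IBM'] = 92.45
prices['MSFT'] = 45.12

# 1. Indexing a missing key raises KeyError, which can be caught
try:
    prices['AAPL']
    found = True
except KeyError:
    found = False

# 2. Testing membership first avoids the exception entirely
if 'AAPL' in prices:
    aapl = prices['AAPL']
else:
    aapl = None

# 3. get() returns a default value instead of raising
aapl_or_default = prices.get('AAPL', 0.0)

print(found, aapl, aapl_or_default)
```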

The file Data/prices.csv contains a series of lines with stock prices. The file might look something like this:

"AA",9.22
"AXP",24.85
"BA",44.85
"BAC",11.27
"C",3.72
...

Write a function read_prices(filename) that reads a set of prices such as this into a dictionary where the keys of the dictionary are the stock names and the values in the dictionary are the stock prices. To do this, start with an empty dictionary and start inserting values into it just as you did above (you’ll be reading the values from a file, however). We’ll use this data structure to quickly look up the price of a given stock name.

You’ll need a few tips for this part. First, make sure you use the csv module just as you did before; there’s no need to reinvent the wheel here. For example:

>>> import csv
>>> f = open('Data/prices.csv', 'r')
>>> f_csv = csv.reader(f)
>>> for row in f_csv:
        print(row)


['AA', '9.22']
['AXP', '24.85']
...
[]
>>>

The other little complication is that the Data/prices.csv file may have some blank lines in it (notice how the last row of data above is simply an empty list—meaning no data was present on that line). There’s a possibility that this could cause your program to die with an exception. Use the try and except statements to catch this as appropriate (or add some other kind of code to avoid errors).
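One way to put these tips together is sketched below. It is a possible solution under stated assumptions, not the required one: each non-blank line is assumed to hold a name and a price, and a blank line is assumed to show up as an empty list, so indexing it raises IndexError. Testing the row with "if not row" before indexing would work as well. The temporary sample file is made up from the first two lines shown above plus a blank line.

```python
# report.py (sketch)
import csv
import os
import tempfile

def read_prices(filename):
    '''Reads a file of name,price lines into a dictionary'''
    prices = {}
    with open(filename) as f:
        f_csv = csv.reader(f)
        for row in f_csv:
            try:
                prices[row[0]] = float(row[1])
            except IndexError:
                pass                       # Blank line: row is an empty list
    return prices

# Illustration only: two sample lines followed by a blank line
sample = '"AA",9.22\n"AXP",24.85\n\n'
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False) as tmp:
    tmp.write(sample)

sample_prices = read_prices(tmp.name)
os.remove(tmp.name)                        # Clean up the temporary file
print(sample_prices)
```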

Once you have written your read_prices() function, test it interactively to make sure it works:

>>> prices = read_prices('Data/prices.csv')
>>> prices['IBM']
106.28
>>> prices['MSFT']
20.89
>>>

(d) Finding out if you can retire

Tie all of this work together by adding statements to your report.py program that take the list of stocks in part (b) and the dictionary of prices and compute the current value of the portfolio along with the gain/loss.
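The calculation might be sketched as follows. To keep the example self-contained, it uses a hand-written two-stock portfolio and price dictionary built from values that appear earlier in this exercise, rather than calling read_portfolio() and read_prices(); in report.py you would use the results of those functions instead. The variable names total_cost, current_value, and gain are made up for illustration.

```python
# Stand-ins for read_portfolio() and read_prices() results (illustration only)
portfolio = [
    {'name': 'AA', 'shares': 100, 'price': 32.2},
    {'name': 'IBM', 'shares': 50, 'price': 91.1},
]
prices = {'AA': 9.22, 'IBM': 106.28}

total_cost = 0.0       # What was originally paid
current_value = 0.0    # What the holdings are worth at current prices
for s in portfolio:
    total_cost += s['shares'] * s['price']
    current_value += s['shares'] * prices[s['name']]

gain = current_value - total_cost          # Negative means a loss

print('Total cost    %10.2f' % total_cost)
print('Current value %10.2f' % current_value)
print('Gain/loss     %10.2f' % gain)
```

This sketch assumes every stock in the portfolio has an entry in prices; a missing name would raise a KeyError.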
