Exercise 2.1
In the last few exercises, you wrote a program that read a data file, Data/portfolio.csv. Using the csv module, it is easy to read the file row by row. For example:
>>> import csv
>>> f = open('Data/portfolio.csv')
>>> f_csv = csv.reader(f)
>>> next(f_csv)
['name', 'shares', 'price']
>>> row = next(f_csv)
>>> row
['AA', '100', '32.20']
>>>
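To go beyond a single row, the same reader can be driven by a loop or turned into a list. The following is a minimal sketch, not part of the original exercise: it reads from an in-memory string instead of Data/portfolio.csv so that it is self-contained, and the second data row is made-up sample data.

```python
import csv
import io

# Sample text standing in for Data/portfolio.csv (the IBM row is an assumption).
sample = 'name,shares,price\nAA,100,32.20\nIBM,50,91.10\n'

f = io.StringIO(sample)
f_csv = csv.reader(f)
headers = next(f_csv)        # the first row holds the column names
rows = list(f_csv)           # the remaining rows, each a list of strings

print(headers)
for row in rows:
    print(row)
```

Every field comes back as a string, including the numeric columns.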
Although reading the file is easy, you often want to do more with the data than simply read it. For instance, perhaps you want to store it and start performing some calculations on it. Unfortunately, a raw "row" of data doesn’t give you enough to work with, because every field is still a string. For example, even a simple math calculation doesn’t work:
>>> row = ['AA', '100', '32.20']
>>> cost = row[1] * row[2]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't multiply sequence by non-int of type 'str'
>>>
To do more, you typically want to interpret the raw data in some way and turn it into a more useful kind of object so that you can work with it later. Two simple options are tuples or dictionaries.
(a) Tuples
At the interactive prompt, create the following tuple that represents the above row, but with the numeric columns converted to proper numbers:
>>> t = (row[0], int(row[1]), float(row[2]))
>>> t
('AA', 100, 32.2)
>>>
Using this, you can now calculate the total cost by multiplying the shares and the price:
>>> cost = t[1] * t[2]
>>> cost
3220.0000000000005
>>>
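The trailing digits in 3220.0000000000005 come from binary floating point, which cannot represent 32.2 exactly. The sketch below shows two common ways of dealing with this, rounding or formatting for display, and using decimal.Decimal on the original string. Neither is required for this exercise.

```python
from decimal import Decimal

row = ['AA', '100', '32.20']

cost = int(row[1]) * float(row[2])
print(cost)                 # float result carries a small representation error
print(round(cost, 2))       # rounded to two decimal places
print(f'{cost:0.2f}')       # formatted as text with two decimal places

exact = int(row[1]) * Decimal(row[2])   # Decimal keeps the digits as written
print(exact)
```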
Tuples are read-only. Verify this by trying to change the number of shares to 75.
>>> t[1] = 75
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>>
Although you can’t change tuple contents, you can always create a completely new tuple that replaces the old one. Try this:
>>> t = (t[0], 75, t[2])
>>> t
('AA', 75, 32.2)
>>>
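Replacing a tuple this way rebinds the name t to a new object; it does not alter the original tuple. A minimal sketch, where the extra name old is introduced only to keep a reference to the first tuple:

```python
t = ('AA', 100, 32.2)
old = t                      # a second name for the same tuple object

t = (t[0], 75, t[2])         # builds a new tuple and rebinds the name t

print(old)                   # the original tuple is untouched
print(t)                     # the new tuple with 75 shares
print(old is t)              # the two names now refer to distinct objects
```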
Tuples are often used to pack and unpack values into variables. Try the following:
>>> name, shares, price = t
>>> name
'AA'
>>> shares
75
>>> price
32.2
>>>
Take the above variables and pack them back into a tuple:
>>> t = (name, 2*shares, price)
>>> t
('AA', 150, 32.2)
>>>
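Unpacking is especially handy when looping over a list of tuples. The following sketch assumes a small list named portfolio; the second holding is invented sample data.

```python
portfolio = [
    ('AA', 100, 32.2),
    ('IBM', 50, 91.1),   # hypothetical second holding
]

total = 0.0
for name, shares, price in portfolio:   # each tuple is unpacked into three names
    total += shares * price

print(round(total, 2))
```

Naming the fields in the for statement avoids index expressions such as t[1] * t[2].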
(b) Dictionaries as a data structure
An alternative to a tuple is to create a dictionary instead. For example:
>>> d = {
        'name': row[0],
        'shares': int(row[1]),
        'price': float(row[2])
    }
>>> d
{'name': 'AA', 'shares': 100, 'price': 32.2}
>>>
Calculate the total cost of this holding:
>>> cost = d['shares'] * d['price']
>>> cost
3220.0000000000005
>>>
Compare this example with the same calculation involving tuples above. Change the number of shares to 75.
>>> d['shares'] = 75
>>> d
{'name': 'AA', 'shares': 75, 'price': 32.2}
>>>
Unlike tuples, dictionaries can be freely modified. Add some attributes:
>>> d['date'] = (6, 11, 2007)
>>> d['account'] = 12345
>>> d
{'name': 'AA', 'shares': 75, 'price': 32.2, 'date': (6, 11, 2007), 'account': 12345}
>>>
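The conversion from a raw row to a dictionary can be wrapped in a small helper. The function name row_to_holding is made up for this sketch and is not part of the exercise.

```python
def row_to_holding(row):
    """Convert a raw [name, shares, price] row of strings into a dictionary."""
    return {
        'name': row[0],
        'shares': int(row[1]),
        'price': float(row[2]),
    }

d = row_to_holding(['AA', '100', '32.20'])
d['shares'] = 75                 # dictionaries can be modified in place
d['account'] = 12345             # and extended with new keys

print(d)
print(d['shares'] * d['price'])
```

Compared with a tuple, the field names make each lookup self-describing, at the cost of a little more typing.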
(c) Some additional dictionary operations
If you turn a dictionary into a list, you’ll get all of its keys:
>>> list(d)
['name', 'shares', 'price', 'date', 'account']
>>>
This operation is also sometimes performed using the keys() method, which in Python 3 returns a view of the keys rather than a list:
>>> d.keys()
dict_keys(['name', 'shares', 'price', 'date', 'account'])
>>>
Similarly, if you use the for statement to iterate on a dictionary, you will get the keys:
>>> for k in d:
        print('k =', k)

k = name
k = shares
k = price
k = date
k = account
>>>
Try this variant that performs a lookup at the same time:
>>> for k in d:
        print(k, '=', d[k])

name = AA
shares = 75
price = 32.2
date = (6, 11, 2007)
account = 12345
>>>
A more elegant way to work with keys and values together is to use the items() method. This gives you the contents of the dictionary as (key, value) tuples:
>>> items = d.items()
>>> items
dict_items([('name', 'AA'), ('shares', 75), ('price', 32.2), ('date', (6, 11, 2007)), ('account', 12345)])
>>> for k, v in d.items():
        print(k, '=', v)

name = AA
shares = 75
price = 32.2
date = (6, 11, 2007)
account = 12345
>>>
If you have a sequence of (key, value) tuples such as items, you can create a dictionary using the dict() function. Try it:
>>> items
dict_items([('name', 'AA'), ('shares', 75), ('price', 32.2), ('date', (6, 11, 2007)), ('account', 12345)])
>>> d = dict(items)
>>> d
{'name': 'AA', 'shares': 75, 'price': 32.2, 'date': (6, 11, 2007), 'account': 12345}
>>>
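One practical use of dict() with (key, value) pairs is to combine the header row of a CSV file with a data row. The sketch below uses the built-in zip() function, which is not covered in this exercise, to produce the pairs. The names pairs and record are chosen for illustration, and the values are still strings that would need the same int() and float() conversions shown earlier.

```python
headers = ['name', 'shares', 'price']
row = ['AA', '100', '32.20']

pairs = list(zip(headers, row))   # pair each header with the matching field
record = dict(pairs)              # build a dictionary from the pairs

for key, value in record.items():
    print(key, '=', value)
```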