Exercise 2.1

Objectives:

  • Learn how to use a tuple or dictionary to represent data structures.

Files Created: None.

In the last few exercises, you wrote a program that read a datafile Data/portfolio.csv. Using the csv module, it is easy to read the file row-by-row. For example:

>>> import csv
>>> f = open('Data/portfolio.csv')
>>> f_csv = csv.reader(f)
>>> next(f_csv)
['name', 'shares', 'price']
>>> row = next(f_csv)
>>> row
['AA', '100', '32.20']
>>>

Although reading the file is easy, you often want to do more with the data than simply read it. For instance, perhaps you want to store it and start performing some calculations on it. Unfortunately, a raw "row" of data doesn’t give you enough to work with: the csv module returns every field as a string, so even a simple math calculation doesn’t work:

>>> row = ['AA', '100', '32.20']
>>> cost = row[1] * row[2]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't multiply sequence by non-int of type 'str'
>>>
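
The wording of the error makes more sense once you see that multiplying a string by an integer is legal (it repeats the string), while multiplying a string by another string is not. A minimal sketch, using the same row values as above:

```python
row = ['AA', '100', '32.20']

# A string times an integer repeats the string; no arithmetic happens.
repeated = row[1] * 2
print(repeated)            # 100100

# A string times a string is undefined, hence the TypeError shown above.
try:
    row[1] * row[2]
except TypeError as err:
    print('TypeError: %s' % err)
```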

To do more, you typically want to interpret the raw data in some way and turn it into a more useful kind of object so that you can work with it later. Two simple options are tuples or dictionaries.

(a) Tuples

At the interactive prompt, create the following tuple that represents the above row, but with the numeric columns converted to proper numbers:

>>> t = (row[0], int(row[1]), float(row[2]))
>>> t
('AA', 100, 32.2)
>>>

Using this, you can now calculate the total cost by multiplying the shares and the price:

>>> cost = t[1] * t[2]
>>> cost
3220.0000000000005
>>>

Note

Is math broken in Python? What’s the deal with the answer of 3220.0000000000005? The floating point hardware on your computer stores numbers in base 2, and most base-10 fractions, such as 32.2, have no exact base-2 representation. As a result, even simple calculations involving base-10 decimals introduce small errors. This is normal, although perhaps a bit surprising if you haven’t seen it before. It happens in all programming languages that use floating point decimals, but it often gets hidden when printing. Even Python rounds the result if you use the print statement:

>>> print cost
3220.0
>>>
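
If exact base-10 arithmetic matters (with money, for example), one option is the standard library's decimal module, which this exercise does not otherwise use. The sketch below builds the Decimal value from the original string rather than from a float, so no base-2 rounding is involved. The names float_cost and decimal_cost are chosen only for illustration:

```python
from decimal import Decimal

row = ['AA', '100', '32.20']

# Binary floating point: the product picks up a tiny representation error.
float_cost = int(row[1]) * float(row[2])

# A Decimal built from the string keeps the base-10 digits exactly.
decimal_cost = int(row[1]) * Decimal(row[2])

print(float_cost == 3220.0)              # False
print(decimal_cost == Decimal('3220'))   # True

# Rounding is a simpler fix when only the displayed value matters.
print(round(float_cost, 2) == 3220.0)    # True
```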

Tuples are read-only. Verify this by trying to change the number of shares to 75.

>>> t[1] = 75
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>>

Although you can’t change tuple contents, you can always create a completely new tuple that replaces the old one. Try this:

>>> t = (t[0], 75, t[2])
>>> t
('AA', 75, 32.2)
>>>

Note

Whenever you reassign an existing variable name like this, the old value is discarded. Although the above assignment might look like you are modifying the tuple, you are actually creating a new tuple and throwing the old one away.
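
One way to convince yourself of this is to bind a second name to the original tuple before reassigning. The name old below is introduced only for illustration:

```python
t = ('AA', 100, 32.2)
old = t                      # second name for the same tuple object

t = (t[0], 75, t[2])         # builds a new tuple and rebinds the name t

print(old)                   # ('AA', 100, 32.2)
print(t)                     # ('AA', 75, 32.2)
print(old is t)              # False
```

The original tuple is unchanged, and it survives here only because old still refers to it. Without that second name, nothing would refer to it any longer and Python would be free to reclaim it.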

Tuples are often used to pack and unpack values into variables. Try the following:

>>> name, shares, price = t
>>> name
'AA'
>>> shares
75
>>> price
32.2
>>>

Take the above variables and pack them back into a tuple:

>>> t = (name, 2*shares, price)
>>> t
('AA', 150, 32.2)
>>>
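
Unpacking is especially handy in loops. The sketch below assumes a small hand-written list of rows (the second and third rows are made-up sample values, not taken from the data file) and converts each one to a tuple before totaling the cost:

```python
rows = [
    ['AA', '100', '32.20'],   # the row used in this exercise
    ['XX', '50', '10.00'],    # hypothetical sample row
    ['YY', '10', '2.50'],     # hypothetical sample row
]

# Convert each raw row into a (name, shares, price) tuple.
holdings = [(r[0], int(r[1]), float(r[2])) for r in rows]

total = 0.0
for name, shares, price in holdings:   # unpack each tuple in the loop header
    total += shares * price

print('%0.2f' % total)    # 3745.00
```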

(b) Dictionaries as a data structure

An alternative to a tuple is a dictionary. For example:

>>> d = {
      'name' : row[0],
      'shares' : int(row[1]),
      'price'  : float(row[2])
    }
>>> d
{'price': 32.2, 'name': 'AA', 'shares': 100}
>>>

Calculate the total cost of this holding:

>>> cost = d['shares'] * d['price']
>>> cost
3220.0000000000005
>>>

Compare this example with the same calculation involving tuples above: the fields are now selected by name rather than by numeric position. Next, change the number of shares to 75:

>>> d['shares'] = 75
>>> d
{'price': 32.2, 'name': 'AA', 'shares': 75}
>>>

Unlike tuples, dictionaries can be freely modified. Add some attributes:

>>> d['date'] = (6, 11, 2007)
>>> d['account'] = 12345
>>> d
{'date': (6, 11, 2007), 'price': 32.2, 'account': 12345, 'name': 'AA', 'shares': 75}
>>>

Note

In the version of Python shown here, dictionaries don’t store their data in any kind of predictable order (Python 3.7 and later keep keys in insertion order instead). Thus, the order in which the keys are listed in the above example is arbitrary (and for all practical purposes random). In most cases, you don’t care about the order. You simply want to store and retrieve the data.

(c) Some additional dictionary operations

If you turn a dictionary into a list, you’ll get all of its keys:

>>> list(d)
['date', 'price', 'account', 'name', 'shares']
>>>

This operation is also sometimes performed using the keys() method:

>>> d.keys()
['date', 'price', 'account', 'name', 'shares']
>>>

Similarly, if you use the for statement to iterate over a dictionary, you will get the keys:

>>> for k in d:
        print 'k =', k

k = date
k = price
k = account
k = name
k = shares
>>>

Try this variant that performs a lookup at the same time:

>>> for k in d:
        print k, '=', d[k]

date = (6, 11, 2007)
price = 32.2
account = 12345
name = AA
shares = 75
>>>
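
The order of the lines printed by these loops is whatever order the dictionary happens to produce. When a predictable order matters, for instance in a printed report, a common approach is to sort the keys explicitly. A minimal sketch, rebuilding the same dictionary so that it stands alone (the name ordered is illustrative):

```python
d = {
    'name': 'AA',
    'shares': 75,
    'price': 32.2,
    'date': (6, 11, 2007),
    'account': 12345,
}

# sorted() returns the keys as a new list in alphabetical order,
# regardless of how the dictionary stores them internally.
ordered = sorted(d)
print(ordered)    # ['account', 'date', 'name', 'price', 'shares']

for key in ordered:
    print('%s = %s' % (key, d[key]))
```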

A more elegant way to work with keys and values together is to use the items() method. This turns a dictionary into a list of (key, value) tuples:

>>> items = d.items()
>>> items
[('date', (6, 11, 2007)), ('price', 32.2), ('account', 12345), ('name', 'AA'), ('shares', 75)]
>>> for k, v in d.items():
        print k, '=', v

date = (6, 11, 2007)
price = 32.2
account = 12345
name = AA
shares = 75
>>>

If you have a list of tuples such as items, you can create a dictionary using the dict() function. Try it:

>>> items
[('date', (6, 11, 2007)), ('price', 32.2), ('account', 12345), ('name', 'AA'), ('shares', 75)]
>>> d = dict(items)
>>> d
{'date': (6, 11, 2007), 'price': 32.2, 'account': 12345, 'shares': 75, 'name': 'AA'}
>>>
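
Since dict() accepts any sequence of (key, value) pairs, the header row and a data row from the CSV file can be paired up with zip() and turned into a dictionary directly. This sketch goes slightly beyond the exercise, and the names headers, pairs, and record are assumptions made for illustration:

```python
headers = ['name', 'shares', 'price']
row = ['AA', '100', '32.20']

# zip() pairs each column name with the value in the same position.
pairs = list(zip(headers, row))
print(pairs)    # [('name', 'AA'), ('shares', '100'), ('price', '32.20')]

record = dict(pairs)

# The values are still strings, so convert the numeric fields.
record['shares'] = int(record['shares'])
record['price'] = float(record['price'])

print('%0.2f' % (record['shares'] * record['price']))    # 3220.00
```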