Exercise 2.5
As a preliminary step, run your report.py program so that the portfolio and prices data are available. Then, at
the Python interactive prompt, type statements to perform the
operations described below. These operations perform various kinds of
data reductions, transforms, and queries on the portfolio data.
(a) List comprehensions
Try a few simple list comprehensions just to become familiar with the syntax.
>>> nums = [1,2,3,4]
>>> squares = [x*x for x in nums]
>>> squares
[1, 4, 9, 16]
>>> twice = [2*x for x in nums if x > 2]
>>> twice
[6, 8]
>>>
Notice how the list comprehensions are creating a new list with the data suitably transformed or filtered.
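To make the connection to ordinary loops explicit, here is a minimal sketch showing that a filtered list comprehension produces the same result as a for loop that appends to a new list. The name twice_loop is only for illustration.

```python
nums = [1, 2, 3, 4]

# Comprehension form: transform (2*x) and filter (x > 2) in one expression
twice = [2*x for x in nums if x > 2]

# Equivalent loop form: the same result, built step by step
twice_loop = []
for x in nums:
    if x > 2:
        twice_loop.append(2*x)

print(twice == twice_loop)   # True
print(nums)                  # the original list is left unchanged
```

In both forms a new list is created and nums itself is not modified.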
(b) Sequence Reductions
Compute the total cost of the portfolio using a single Python statement.
>>> cost = sum([s['shares']*s['price'] for s in portfolio])
>>> cost
44671.15
>>>
After you have done that, show how you can compute the current value of the portfolio using a single statement.
>>> value = sum([s['shares']*prices[s['name']] for s in portfolio])
>>> value
28686.1
>>>
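As a variation on the total cost computation above, sum() also accepts a generator expression, which avoids building an intermediate list. The sketch below is self-contained; it assumes a portfolio list whose values match the holdings shown elsewhere in this exercise.

```python
# Sample data matching the holdings shown in this exercise
portfolio = [
    {'name': 'AA',   'shares': 100, 'price': 32.20},
    {'name': 'IBM',  'shares': 50,  'price': 91.10},
    {'name': 'CAT',  'shares': 150, 'price': 83.44},
    {'name': 'MSFT', 'shares': 200, 'price': 51.23},
    {'name': 'GE',   'shares': 95,  'price': 40.37},
    {'name': 'MSFT', 'shares': 50,  'price': 65.10},
    {'name': 'IBM',  'shares': 100, 'price': 70.44},
]

# No square brackets: sum() consumes the values one at a time
cost = sum(s['shares'] * s['price'] for s in portfolio)
print(round(cost, 2))
```

For a portfolio this small the difference is negligible; the generator form mainly matters when the input is large.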
(c) Data Queries
Try the following examples of various data queries. First, a list of all portfolio holdings with more than 100 shares.
>>> more100 = [s for s in portfolio if s['shares'] > 100]
>>> more100
[{'price': 83.44, 'name': 'CAT', 'shares': 150}, {'price': 51.23, 'name': 'MSFT', 'shares': 200}]
>>>
All portfolio holdings for MSFT and IBM stocks.
>>> msftibm = [s for s in portfolio if s['name'] in ['MSFT','IBM']]
>>> msftibm
[{'price': 91.1, 'name': 'IBM', 'shares': 50}, {'price': 51.23, 'name': 'MSFT', 'shares': 200}, {'price': 65.1, 'name': 'MSFT', 'shares': 50}, {'price': 70.44, 'name': 'IBM', 'shares': 100}]
>>>
A list of all portfolio holdings that cost more than $10000.
>>> cost10k = [s for s in portfolio if s['shares']*s['price'] > 10000]
>>> cost10k
[{'price': 83.44, 'name': 'CAT', 'shares': 150}, {'price': 51.23, 'name': 'MSFT', 'shares': 200}]
>>>
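The queries above return whole records, but a comprehension can filter and extract in the same expression. The following sketch assumes the same sample holdings shown in this exercise; the names wanted, picks, and big_names are illustrative only.

```python
# Sample data matching the holdings shown in this exercise
portfolio = [
    {'name': 'AA',   'shares': 100, 'price': 32.20},
    {'name': 'IBM',  'shares': 50,  'price': 91.10},
    {'name': 'CAT',  'shares': 150, 'price': 83.44},
    {'name': 'MSFT', 'shares': 200, 'price': 51.23},
    {'name': 'GE',   'shares': 95,  'price': 40.37},
    {'name': 'MSFT', 'shares': 50,  'price': 65.10},
    {'name': 'IBM',  'shares': 100, 'price': 70.44},
]

# Filter on membership in a set, then keep only (name, shares)
wanted = {'MSFT', 'IBM'}
picks = [(s['name'], s['shares']) for s in portfolio if s['name'] in wanted]
print(picks)

# Filter on a computed value, then keep only the name
big_names = [s['name'] for s in portfolio if s['shares'] * s['price'] > 10000]
print(big_names)   # ['CAT', 'MSFT']
```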
(d) Data Extraction
Show how you could build a list of tuples (name, shares) where name and shares are taken from portfolio.
>>> name_shares = [(s['name'], s['shares']) for s in portfolio]
>>> name_shares
[('AA', 100), ('IBM', 50), ('CAT', 150), ('MSFT', 200), ('GE', 95), ('MSFT', 50), ('IBM', 100)]
>>>
Show how you could create a set of all unique stock symbols in portfolio.
>>> names = set([s['name'] for s in portfolio])
>>> names
set(['AA', 'GE', 'IBM', 'MSFT', 'CAT'])
>>>
This last step can be expressed more compactly with a feature known as a "set comprehension". Simply write
a list comprehension, but change the square brackets ([, ]) to curly braces ({, }).
>>> names = { s['name'] for s in portfolio }
>>> names
set(['AA', 'GE', 'IBM', 'MSFT', 'CAT'])
>>>
Build a dictionary that maps the name of a stock to the total number of shares held.
>>> holdings = dict.fromkeys(names, 0)
>>> holdings
{'AA': 0, 'GE': 0, 'IBM': 0, 'MSFT': 0, 'CAT': 0}
>>> for name, shares in name_shares:
...     holdings[name] += shares
...
>>> holdings
{'AA': 100, 'GE': 95, 'IBM': 150, 'MSFT': 250, 'CAT': 150}
>>>
The dict.fromkeys() method creates a dictionary from a set of keys,
initializing all of the values to a value you provide. This was done
to set up initial counts for tabulating the total number of shares in
the for loop that follows. This initialization could also be performed
using a dictionary comprehension:
>>> holdings = { name:0 for name in names }
>>> holdings
{'AA': 0, 'GE': 0, 'IBM': 0, 'MSFT': 0, 'CAT': 0}
>>>
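As an alternative to initializing the counts by hand, the standard library's collections.Counter treats missing keys as 0, so the initialization step can be dropped. This is a different technique from the one used above, shown here only as a sketch; name_shares is copied from the output earlier in this part.

```python
from collections import Counter

# (name, shares) pairs as shown earlier in this exercise
name_shares = [('AA', 100), ('IBM', 50), ('CAT', 150), ('MSFT', 200),
               ('GE', 95), ('MSFT', 50), ('IBM', 100)]

holdings = Counter()
for name, shares in name_shares:
    holdings[name] += shares   # a missing key is treated as 0

print(holdings['MSFT'])   # 250
print(holdings['IBM'])    # 150
```

A Counter is a dict subclass, so the result can be used anywhere the hand-built holdings dictionary would be.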
(e) Extracting Data From CSV Files (Advanced)
Knowing how to use various combinations of list, set, and dictionary comprehensions can be useful in various forms of data processing. Here’s an example that shows how to extract selected columns from a CSV file.
First, read a row of header information from a CSV file:
>>> import csv
>>> f = open('Data/portfoliodate.csv')
>>> f_csv = csv.reader(f)
>>> headers = next(f_csv)
>>> headers
['name', 'date', 'time', 'shares', 'price']
>>>
Next, define a variable that lists the columns that you actually care about:
>>> columns = ['name', 'shares', 'price']
>>>
Now, locate the indices of the above columns in the source CSV file:
>>> indices = [ (colname, headers.index(colname)) for colname in columns ]
>>> indices
[('name', 0), ('shares', 3), ('price', 4)]
>>>
Finally, read a row of data and turn it into a dictionary using a dictionary comprehension:
>>> row = next(f_csv)
>>> record = { colname: row[index] for colname, index in indices } # dict-comprehension
>>> record
{'price': '32.20', 'name': 'AA', 'shares': '100'}
>>>
If you’re feeling comfortable with what just happened, read the rest of the file:
>>> portfolio = [ {colname: row[index] for colname, index in indices} for row in f_csv ]
>>> portfolio
[{'price': '91.10', 'name': 'IBM', 'shares': '50'}, {'price': '83.44', 'name': 'CAT', 'shares': '150'}, {'price': '51.23', 'name': 'MSFT', 'shares': '200'}, {'price': '40.37', 'name': 'GE', 'shares': '95'}, {'price': '65.10', 'name': 'MSFT', 'shares': '50'}, {'price': '70.44', 'name': 'IBM', 'shares': '100'}]
>>>
Oh my, you just reduced much of the read_portfolio() function to a single statement.
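The steps in this part can be gathered into one small function. The sketch below is not the read_portfolio() function from the course; read_columns is a hypothetical helper, and the in-memory file stands in for Data/portfoliodate.csv with placeholder date and time values. It also converts the extracted strings to numbers, since csv.reader returns every field as text.

```python
import csv
import io

# Stand-in for Data/portfoliodate.csv; date and time values are placeholders
sample = io.StringIO(
    'name,date,time,shares,price\n'
    'AA,2007-06-11,09:50,100,32.20\n'
    'IBM,2007-05-13,16:20,50,91.10\n'
)

def read_columns(f, columns):
    """Hypothetical helper: return one dict per row, keeping only `columns`."""
    rows = csv.reader(f)
    headers = next(rows)
    # Locate each wanted column in the header row
    indices = [(colname, headers.index(colname)) for colname in columns]
    return [{colname: row[index] for colname, index in indices} for row in rows]

portfolio = read_columns(sample, ['name', 'shares', 'price'])

# csv.reader yields strings, so convert the numeric fields explicitly
for record in portfolio:
    record['shares'] = int(record['shares'])
    record['price'] = float(record['price'])

print(portfolio)
```

With a real file, the same function could be called inside a with open(...) block so the file is closed when reading is done.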