Exercise 1.4

Objectives:

  • How to create and manipulate text strings.

  • Further use of Python’s interactive mode as a tool for experimentation.

Files Created: None

Note

In exercises where you are supposed to interact with the interpreter, >>> is the interpreter prompt that you get when Python wants you to type a new statement. Some statements in the exercise span multiple lines—to get these statements to run, you may have to hit return a few times. Just a reminder that you DO NOT type the >>> when working these examples.

In this exercise, we experiment with operations on Python’s string type. You may want to do most of this exercise at the Python interactive prompt where you can easily see the results.

Define a string containing a series of stock ticker symbols like this:

>>> symbols = 'AAPL,IBM,MSFT,YHOO,SCO'
>>>

Now, let’s experiment with different string operations:

(a) Extracting individual characters and substrings

Strings are sequences of characters, indexed starting at 0. Try extracting a few characters:

>>> symbols[0]
'A'
>>> symbols[1]
'A'
>>> symbols[2]
'P'
>>> symbols[-1]        # Last character
'O'
>>> symbols[-2]        # Negative indices are from end of string
'C'
>>>
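
Part (a) also mentions substrings. As a minimal sketch of how a range of characters can be pulled out with slicing, consider the following. The names first_symbol, second_symbol, and last_symbol are illustrative only and are not part of the exercise. The code is written as a script, so it has no >>> prompts.

```python
symbols = 'AAPL,IBM,MSFT,YHOO,SCO'

# A slice s[start:stop] includes index start and excludes index stop.
first_symbol = symbols[0:4]      # characters 0, 1, 2, 3
second_symbol = symbols[5:8]     # characters 5, 6, 7
last_symbol = symbols[-3:]       # the last three characters

print(first_symbol)              # AAPL
print(second_symbol)             # IBM
print(last_symbol)               # SCO
```

Omitting the start or stop, as in symbols[-3:], means "from the beginning" or "to the end" respectively.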

(b) Strings as read-only objects

In Python, strings are read-only. Verify this by trying to change the first character of symbols to a lower-case a.

>>> symbols[0] = 'a'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment
>>>
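
If you want the effect of changing a character, one common approach is to build a new string from pieces of the old one. The sketch below is only an illustration. The variable name changed is hypothetical, and the try/except is there just so the script keeps running after the error.

```python
symbols = 'AAPL,IBM,MSFT,YHOO,SCO'

# Item assignment on a string raises TypeError.
try:
    symbols[0] = 'a'
except TypeError as err:
    print(err)

# Build a new string instead: a lower-case 'a' plus everything after index 0.
changed = 'a' + symbols[1:]

print(changed)      # aAPL,IBM,MSFT,YHOO,SCO
print(symbols)      # AAPL,IBM,MSFT,YHOO,SCO  (the original is untouched)
```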

(c) String concatenation

Although string data is read-only, you can always reassign a variable to a newly created string. Try the following statement, which concatenates a new symbol "GOOG" to the end of symbols:

>>> symbols = symbols + 'GOOG'
>>> symbols
'AAPL,IBM,MSFT,YHOO,SCOGOOG'
>>>

Oops! That’s not what we wanted. Let’s fix it:

>>> symbols = symbols[:-4]        # All but last 4 chars
>>> symbols
'AAPL,IBM,MSFT,YHOO,SCO'
>>> symbols = symbols + ',GOOG'   # Note the leading comma
>>> symbols
'AAPL,IBM,MSFT,YHOO,SCO,GOOG'
>>>

Now, try adding "HPQ" to the beginning of symbols like this:

>>> symbols = 'HPQ,' + symbols
>>> symbols
'HPQ,AAPL,IBM,MSFT,YHOO,SCO,GOOG'
>>>

Note that in both of these examples, the original string symbols is NOT modified "in place" (i.e., the modifications don’t overwrite the memory currently being used to store the string contents). Instead, a completely new string is created, and the variable name symbols is simply reassigned to the new value. Afterwards, the old string is destroyed because nothing refers to it anymore.
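
One way to convince yourself of this is to keep a second name attached to the old string before concatenating. This is a small sketch, and the name original is introduced here purely for illustration.

```python
symbols = 'AAPL,IBM,MSFT,YHOO,SCO'
original = symbols               # a second name for the same string object

symbols = symbols + ',GOOG'      # creates a new string and rebinds symbols

print(original)                  # AAPL,IBM,MSFT,YHOO,SCO
print(symbols)                   # AAPL,IBM,MSFT,YHOO,SCO,GOOG
print(symbols is original)       # False: two different objects
```

In this sketch the old string is not destroyed, because original still refers to it. It would only be discarded once no name refers to it.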

(d) Membership testing (substring testing)

Experiment with the in operator to check for substrings. At the interactive prompt, try these operations:

>>> 'IBM' in symbols
True
>>> 'AA' in symbols
True
>>> 'CAT' in symbols
False
>>>

Make sure you understand why the check for "AA" returned True.
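
The in operator on a string looks for a run of characters anywhere in the text, not for a whole comma-separated field. If you wanted to test for a complete symbol, one possible approach (an assumption about what you might want, not part of the exercise) is to split the string into a list first. The name symbol_list is illustrative.

```python
symbols = 'HPQ,AAPL,IBM,MSFT,YHOO,SCO,GOOG'

# Substring test: 'AA' occurs inside 'AAPL'.
print('AA' in symbols)           # True

# Split on commas to get a list of whole symbols.
symbol_list = symbols.split(',')

# Membership in a list compares against complete items.
print('AA' in symbol_list)       # False
print('AAPL' in symbol_list)     # True
```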

(e) String Methods

At the Python interactive prompt, try experimenting with some of the string methods.

>>> symbols.lower()
'hpq,aapl,ibm,msft,yhoo,sco,goog'
>>> symbols
'HPQ,AAPL,IBM,MSFT,YHOO,SCO,GOOG'
>>>

Remember, strings are always read-only. If you want to save the result of an operation, you need to place it in a variable:

>>> lowersyms = symbols.lower()
>>> lowersyms
'hpq,aapl,ibm,msft,yhoo,sco,goog'
>>>

Try some more operations:

>>> symbols.find('MSFT')
13
>>> symbols[13:17]
'MSFT'
>>> symbols = symbols.replace('SCO','DOA')
>>> symbols
'HPQ,AAPL,IBM,MSFT,YHOO,DOA,GOOG'
>>> for s in symbols:
        print 's=', s

... see what happens

By the way, the for statement is what Python uses to iterate over the contents of something. In the case of a string, it iterates over the individual letters—one at a time.
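
As a small illustration of character-by-character iteration, the sketch below counts the commas in the string by hand and compares the result with the built-in count() method. The names comma_count and ch are illustrative only.

```python
symbols = 'HPQ,AAPL,IBM,MSFT,YHOO,DOA,GOOG'

# The loop variable takes on one character of the string per iteration.
comma_count = 0
for ch in symbols:
    if ch == ',':
        comma_count += 1

print(comma_count)               # 6
print(symbols.count(','))        # 6, the same answer from a string method
```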

Discussion

As you start to experiment with the interpreter, you often want to know more about the operations supported by different objects. For example, how do you find out what operations are available on a string?

Depending on your Python environment, you might be able to see a list of available methods via tab-completion. For example, try typing this:

>>> s = 'hello world'
>>> s.<tab key>
>>>

If hitting tab doesn’t do anything, you can fall back to the built-in dir() function. For example:

>>> s = 'hello'
>>> dir(s)
['__add__', '__class__', '__contains__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__getslice__', '__gt__', '__hash__', '__init__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '_formatter_field_name_split', '_formatter_parser', 'capitalize', 'center', 'count', 'decode', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'index', 'isalnum', 'isalpha', 'isdigit', 'islower', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']
>>>

dir() produces a list of all the names that can appear after the dot (.). For example:

>>> s.upper()
'HELLO'
>>>
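
The output of dir() includes many names that begin with an underscore, which are mostly for internal use. If you only want to see the ordinary methods, one option is to filter the list. This is just a sketch, and the name public_names is made up for the example.

```python
s = 'hello'

# Keep only the names that do not start with an underscore.
public_names = [name for name in dir(s) if not name.startswith('_')]

print(public_names)
print('upper' in public_names)       # True
print('__add__' in public_names)     # False
```

The exact contents of the list depend on your Python version, so no full output is shown here.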

Use the help() command to get more information about a specific operation:

>>> help(s.upper)
Help on built-in function upper:

upper(...)
    S.upper() -> string

    Return a copy of the string S converted to uppercase.
>>>

IDEs and alternative interactive shells often give you more help here. For example, a popular alternative to Python’s normal interactive mode is IPython (http://ipython.org). IPython provides some nice features such as tab-completion of method names and better-integrated help.

Links