3 Python Data Structures

Python has a few very useful built-in data structures that you will see used everywhere.

 
3.1 Strings

The simplest and most familiar data structure is a string. String constants can be written with either single or double quotes. Python has a number of built-in string operations. Strings can be indexed and slices extracted using a simple subscripting notation. For example,

>>> x = 'test string'
>>> x[2]
's'
>>> x[1:3]
'es'
>>> x[-2]
'n'
>>> x[2:]
'st string'
>>> x[:2]
'te'

These few examples illustrate some unusual features. First, indexing is zero-based. The first character of an N-character string is number 0, and the last character is number N-1. (This is familiar to C and IDL users but differs from IRAF and Fortran.)

When a range (or "slice" in Python terminology) is specified, the second number represents an index that is not included as part of the range. One should view indexing as working like this:

 0   1   2   3   4   5   6   7   8   9  10  11 positive indices
   t   e   s   t       s   t   r   i   n   g
-11 -10 -9  -8  -7  -6  -5  -4  -3  -2  -1     negative indices

The indices effectively mark the gaps between the items in the string or list. Specifying 2:4 means everything between 2 and 4. Python sequences (including strings) can be indexed in reverse order as well. The index -1 represents the last element, -2 the penultimate element, and so forth. If the first index in a slice is omitted, it defaults to the beginning of the string; if the last is omitted, it defaults to the end of the string. So x[-4:] contains the last 4 elements of the string ('ring' in the above example).
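
The gap picture above can be checked directly. The following is a minimal sketch that reuses the example string; the variable names other than x are only illustrative.

```python
x = 'test string'
n = len(x)                      # 11 characters

# A slice x[i:j] holds the characters between gap i and gap j,
# so its length is j - i (for 0 <= i <= j <= n).
middle = x[2:4]                 # 'st'

# A negative index k refers to the same position as n + k.
last_four = x[-4:]              # 'ring'
same_last_four = x[n - 4:]      # 'ring'

# Omitted bounds default to the start and the end of the string,
# so splitting at any point and rejoining gives back the original.
rejoined = x[:5] + x[5:]        # 'test string'
```

Because the end index is excluded, x[:k] + x[k:] reproduces x for any k, with no character duplicated or lost at the split point.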

Strings can be concatenated using the addition operator

>>> "hello" + " world"
'hello world'

And they can be replicated using the multiplication operator

>>> "hello"*5
'hellohellohellohellohello'

There is also a string module (see below for more on modules) in the standard Python library. It provides many additional operations on strings. As of Python 2.0, strings also have methods that can be used for further manipulations. For example, s.find('abc') returns the (zero-based) index of the first occurrence of the substring 'abc' in string s, or -1 if it is not found. The string module equivalent is string.find(s,'abc').
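
As a small illustration of string methods, the sketch below applies find and two other standard methods (upper and split, which the text does not mention by name) to the example string used earlier. The variable names are only illustrative.

```python
s = 'test string'

first_st = s.find('st')         # 2: the first 'st' starts at index 2
missing = s.find('abc')         # -1: 'abc' does not occur in s

# find() accepts an optional start index, which allows a search
# to resume just past an earlier match.
second_st = s.find('st', first_st + 1)   # 5

# Methods return new objects; s itself is left unchanged.
shouted = s.upper()             # 'TEST STRING'
words = s.split()               # ['test', 'string']
```

Checking the result of find against -1 is the usual way to test whether a substring was found at all, since 0 is a valid match position.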

 
3.2 Lists

One can view lists as generalized, dynamically sized arrays. A list may contain a sequence of any legitimate Python objects: numbers, strings, functions, objects, and even other lists. The objects in a list do not have to be the same type. A list can be explicitly constructed using square brackets and commas:

x = [1,4,9,"first three integer squares"]

Elements of lists can be accessed with subscripts and slices just like strings. E.g.,

>>> x[2]
9
>>> x[2:]
[9, 'first three integer squares']

They can have items appended to them:

>>> x.append('new end of list')
  or (as an alternative to append, not in addition to it)
>>> x = x + ['new end of list'] # i.e., '+' concatenates lists
>>> x
[1, 4, 9, 'first three integer squares', 'new end of list']

(Here # is the Python comment delimiter.) Elements can be deleted and inserted:

>>> del x[3]
>>> x
[1, 4, 9, 'new end of list']
>>> x.insert(1,'two')
>>> x
[1, 'two', 4, 9, 'new end of list']

The product of a list and a constant is the result of repeatedly concatenating the list to itself:

>>> 2*x[0:3]
[1, 'two', 4, 1, 'two', 4]

There are a number of other operations possible including sorting and reversing:

>>> x = [5, 3, 1, 2, 4, 6]
>>> x.sort()
>>> print x
[1, 2, 3, 4, 5, 6]
>>> x.reverse()
>>> print x
[6, 5, 4, 3, 2, 1]

A list can have any number of elements, including zero: [] represents the empty list.
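
The list operations in this section differ in whether they change an existing list or build a new one. The sketch below, with illustrative variable names, shows the difference between append and '+', and also that sort works in place rather than returning the sorted list.

```python
x = [1, 4, 9]
alias = x                 # a second name for the same list object

x.append(16)              # modifies the list in place
# alias sees the change: it is also [1, 4, 9, 16]

y = x + [25]              # builds a new list; x is left alone
# x is still [1, 4, 9, 16], y is [1, 4, 9, 16, 25]

# sort() and reverse() also work in place and return None,
# so the result is read from the list itself.
z = [5, 3, 1, 2, 4, 6]
result = z.sort()         # result is None, z is [1, 2, 3, 4, 5, 6]

empty = []                # the empty list has length 0
```

A common slip is to write z = z.sort(), which replaces the list with None.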

 
3.3 Mutability

Python data structures are either mutable (changeable) or not. Strings are not mutable - once created they cannot be modified in situ (though it is easy to create new, modified versions). Lists are mutable, which means lists can be changed after being created; a particular element of a list may be modified. Either single elements or slices of lists can be modified by assignment:

>>> x = [1, 4, 9, 'new end of list']
>>> x
[1, 4, 9, 'new end of list']
>>> x[2] = 16
>>> x
[1, 4, 16, 'new end of list']
>>> x[2:3] = [9,16,25]
>>> x
[1, 4, 9, 16, 25, 'new end of list']
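
To make the contrast concrete, here is a minimal sketch of what happens when the same kind of assignment is tried on a string, followed by two slice assignments on a list. The variable names are illustrative.

```python
s = 'test string'
try:
    s[0] = 'b'                    # strings do not support item assignment
    string_changed = True
except TypeError:
    string_changed = False        # this branch runs

# A modified version is built as a new string from slices of the old one.
t = 'b' + s[1:]                   # 'best string'; s is still 'test string'

# Slice assignment on a list may change its length.
x = [1, 4, 16, 'new end of list']
x[2:3] = [9, 16, 25]              # replace one element with three
after_grow = list(x)              # [1, 4, 9, 16, 25, 'new end of list']
x[1:3] = []                       # assigning an empty list deletes a slice
after_shrink = list(x)            # [1, 16, 25, 'new end of list']
```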

 
3.4 Tuples

Tuples are essentially immutable lists. There is a different notation used to construct tuples (parentheses instead of brackets), but otherwise the same operations apply to them as to lists as long as the operation does not change the tuple. (For example, sort, reverse and del cannot be used.)

x = (1,"string")
x = ()   # empty tuple
x = (2,) # a one-element tuple requires a trailing comma (which is also
         # legal for a tuple of any length) to distinguish it from a
         # parenthesized expression
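
The following sketch, with illustrative names, shows that reading operations behave as they do for lists, that the trailing comma matters, and that an attempt to modify a tuple fails.

```python
point = (3, 4)
first = point[0]                 # indexing works as for lists: 3
longer = point + (5,)            # concatenation builds a new tuple: (3, 4, 5)

not_a_tuple = (2)                # just the integer 2 in parentheses
one_tuple = (2,)                 # a tuple of length 1

try:
    point[0] = 10                # tuples do not support item assignment
    tuple_changed = True
except TypeError:
    tuple_changed = False        # this branch runs
```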

 
3.5 Dictionaries

Dictionaries are hash tables that allow elements to be retrieved by a key. The key can be any immutable Python object, such as an integer, a string, or a tuple (provided the tuple itself contains only immutable items). An example would be:

>>> employee_id = {"ricky":11,"fred":12,"ethel":15}
>>> print employee_id["fred"]
12

Note that braces are used to enclose dictionary definitions. New items are easily added:

>>> employee_id["lucy"] = 16

will create a new entry.

There are many operations available for dictionaries; for example, you can get a list of all the keys:

>>> print employee_id.keys()
['ricky', 'lucy', 'ethel', 'fred']

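The order of the keys is arbitrary in the Python versions this tutorial describes (from Python 3.7 on, dictionaries preserve insertion order), but you can sort the list if you like. You can delete entries: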

>>> del employee_id['fred']
>>> print employee_id.keys()
['ricky', 'lucy', 'ethel']
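
A few of the other dictionary operations can be sketched as follows. Note the assumptions: the built-in sorted() and the "in" test on dictionaries were added to Python after the 2.0 release mentioned earlier, so this sketch presumes a newer interpreter, and the name "desi" is a made-up key used only to show a failed lookup.

```python
employee_id = {"ricky": 11, "fred": 12, "ethel": 15}
employee_id["lucy"] = 16

# sorted() returns a new sorted list of the keys.
names = sorted(employee_id.keys())        # ['ethel', 'fred', 'lucy', 'ricky']

# get() returns a fallback value instead of raising KeyError
# when the key is absent ("desi" is a hypothetical missing key).
known = employee_id.get("fred", -1)       # 12
unknown = employee_id.get("desi", -1)     # -1

del employee_id["fred"]
still_there = "fred" in employee_id       # False
```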

It is a rare Python program that doesn't use these basic data structures (strings, lists, and dictionaries) routinely and heavily. They can be mixed and matched in any way you wish. Dictionaries can contain lists, tuples, and other dictionaries (and different entries can contain different types of objects), and the same is true of lists and tuples.
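
As an illustration of mixing these structures, the sketch below nests lists and tuples inside dictionaries. All of the record contents (roles, desk positions) are hypothetical values invented for the example.

```python
# Hypothetical records: each name maps to a dictionary holding an
# id number, a list of roles, and a tuple giving a desk position.
staff = {
    "ricky": {"id": 11, "roles": ["band leader"], "desk": (1, 2)},
    "ethel": {"id": 15, "roles": [], "desk": (3, 4)},
}

# Subscripts chain from the outside in.
ricky_id = staff["ricky"]["id"]            # 11
desk_row = staff["ethel"]["desk"][0]       # 3

# The inner list is mutable, so it can be changed through the dictionary.
staff["ethel"]["roles"].append("landlady")

# A tuple can serve as a dictionary key.
seating = {(1, 2): "ricky", (3, 4): "ethel"}
who = seating[(3, 4)]                      # 'ethel'
```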

Questions or comments? Contact help@stsci.edu
Document updated on 2004 Jun 1