MTHSC 360 Python and Data Structures

3657 days ago by warner


Introduction to Python and Data Structures

At this point we need to pause and look a little more carefully at how Sage works with Python.  Python is a modern off-the-shelf scripting language that is very friendly to Scientific Computing and to Object Oriented Programming (OOP).  Python's support for OOP is absolutely essential for Sage's goal of integrating different Open Source Math packages.


This first example illustrates Pyhton's basic control flow.  The example contains a while loop with an if-then-else case statement.  For at least 3 decades good programming practice has strongly suggested that you indent your code to reflect its logical block structure.  With Python this is no longer a suggestion.  There are no begin-ends or other bracketing notation.  As a reward for proper indenting, you get shorter programs.  Also note that = is the assignment operator, == is the relational operator for equality, % is the mod operator.

k = 0 while k<20: if k%3 == 0: print "orange" elif k%5 == 0: print "white" else: print k k += 1 

Python has a for clause that walks through a list.  It also has a function for generating lists of integers, called range.

Python also indexes the world starting at zero.

for k in range(20): if k%3 == 0: print "orange" elif k%5 == 0: print "white" else: print k 

Python defines functions with the def command.

def cheer(N): for k in range(N): if k%3 == 0: print "orange" elif k%5 == 0: print "white" else: print k 

Python has three built-in data structures: list, tuple, and dictionary


A list in Python is entered and displayed with square brackets.  When Python indexes into a list it starts at 0, and the index is specified with square brackets.  When indexing into a 2-dimensional list, Python uses two separate indices.  The first index indicates which list in the list of lists, and the second index indicates which element in that list.

A = [1, 2, 3, 4] print A A[0] = 5 print A B = [[1,2,3],[4,5,6]] print B B[0][2] = 13 print B 

Python provides a number of "slicing" operations

print A[1:3] print A[:3] print A[2:] print A[:] print A[-2:] 

Python does not deal directly with matrices, although Sage definitely does.  In Python adding two lists is just concatenation.

print A print B C = [4, 3, 2, 1] print C print A+C print A+B 

In the last example  A+B  is a list consisting of 6 elements the first 4 elements are numbers and the last two elements are lists.

Lists in Python are very general collections.  In Python we can walk through a list in 3 different ways. 


# The following list contains a number, a string, and a list a = [1.25, "name", A] print a 
for k in range(len(a)): print a[k] 
for item in a: print item 
for k,item in enumerate(a): print k,item 

We can change the items in a list.

a[1] = "Robert" print a 

Python has a very flexible initialization construction, called list comprehension.  It mirrors the way we typically define a set in mathematics.

X = [0 for k in range(5)] print X Y = [1 for k in range(5)] print Y Z = [(k+1)^2 for k in range(5)] print Z 

Even trickier constructions are possible by adding a conditional phrase.

small_primes = [ p for p in range(100) if is_prime(p)] print small_primes 

Suppose that we want to construct the list of the Fibonacci numbers that are less than 100.  

The following Python code will do the job.

fibo = [0, 1] while fibo[-1]+fibo[-2] < 100: fibo = fibo + [fibo[-1]+fibo[-2]] print fibo 


The second built-in Python data structure is the tuple.  It is just like a list except that it gets set up with parentheses instead of square brackets.  Tuples are different from lists in that they are immutable - they can not be changed.  Think of them as read only data structures.

b = (1.25, "name", A) print b print b[1] 
b[1] = "Bob" 

Beware!  Although the tuple can not be changed it is possible that some of the elements of the tuple can be changed.

print A print b A[0:2] = [-3, -2] print b b[2][0] = 1 print b print A 


While the list is enclosed with square brackets and the tuple is enclosed with parentheses, the dictionary is enclosed with curly braces. The dictionary also provides arbitrary keys in lieu of indexing, although the notation is similar.  A dictionary is a list of pairs.  The first member of each pair is the key, and a key can only occur once in a dictionary.  The key is followed by a colon and the value, the second item of the pair.  Keys must be unique and immutable, but otherwise they are completely arbitrary.  The values are totally arbitrary.

p1 = {'first_name': 'John', 'last_name': 'Doe', 'age': 25, 'weight': 145, 'children':['Jane','Joe']} for key in p1: print key,":",p1[key] 
p1['age'] = 37 for key in p1: print key,":",p1[key] 
c = {1:'name', 'b':[1,2,'abe'], (1,2,'abe'): 3.14159} for key in c: print key,":",c[key] print c[(1,2,'abe')] = sqrt(2.0) c['b'][1] = 5 for key in c: print key,":",c[key] 

Sage makes extensive use of Python classes.  The type command tells us the name of the class of an object. 

Integers are a particularly interesting case.

for k in range(3): print k,type(k) print a = [0,1,2] for k in a: print k,type(k) print for k in srange(3): print k,type(k) 
# What is pi? type(pi); type(3.14159) 

# How do we tell Sage to compute pi to 1000 digits if we don't remember the command? # The tab key after "." of "(" generates a help display which can be used to # autocomplete the statement. pi.n(digits=1000) 
print type(x), type(b), type(c) 
# Sage can solve equations for one variable in terms of others. var('x b c') print type(x), type(b), type(c) solve([x^2 + b*x + c == 0],x) 
# Sage can solve linear systems of algebraic equations # Note the introduction of the symbolic variable y var('y') eq4 = x-1 == y+1 eq5 = x+1 == 2*(y-1) eq4;eq5 

This example underscored the fact that the single "=" is assignment and the double "==" is equality.

It also showed that multiple statements can be on the same line when separated by ";"

sol2 = solve([eq4,eq5],x,y) sol2 

In this example, the solution is returned in a list.  Lists are the primary data structure workhorse. However, we can also have the solution(s) returned as a list of dictionaries which provides access by name.

sol2 = solve([eq4,eq5],x,y,solution_dict=True) sd = sol2[0] print sd print (sd[x].n(digits=2),sd[y].n(digits=5)) 

Object Oriented Programming

Python's support of Object Oriented Programming provides Python with the possibility of accomplishing many more tasks than Matlab.  As an example, we can look at Sage's use of matrices.  We will start by generating some random matrices of integers.

A = random_matrix(ZZ,5) print A 
# How about some really big integers? A = random_matrix(ZZ, 5, x=2^64) print type(A) print A 

Use tab completion to see what A knows how to do.  In OOP the functions that belong to an object are called methods.  Experiment with different methods.  In Sage you can create matrices over any ring.  Of course, some of the methods will vary depending on the ring.


To create your own objects, you define the class that those objects would belong to and the methods that it allows.  Here are examples of three extremely important Data Structures - Stacks, Queues, and Priority Queues.

A Stack Class

The following block defines a stack class with five methods - push, pop, peek, clear, and display.  There is also an initialization method, __init__, which is invoked whenever a Stack object is created.  This method insures that the stack owns an empty list, _s.  Using an underscore as the first letter in the name of a variable is a Python technique for indicating that it is a private variable.

''' This block defines a Stack class Each instance is initialized with an empty list. The class provides 5 methods: push, pop, peek, clear, and display. Daniel D. Warner October 4, 2011 ''' class Stack: def __init__(self): self._s = [] def push(self,x): self._s.append(x) def pop(self): if 0 < len(self._s): return self._s.pop() else: return None def peek(self): if 0 < len(self._s): return self._s[-1] else: return None def clear(self): self._s = [] def display(self): print self._s 
S0 = Stack() print S0 print type(S0) # Check out the tab completion S0.display() 
S1 = Stack() S1.push(5) S1.push(3) S1.display() print S1.peek() S1.display() print S1.pop() S1.display() print S1.pop() S1.display() print S1.pop() 
S0.push(3) print S0._s S1 = Stack() S1.push(5) print S0._s print S1._s 

A Queue Class

The following block defines a queue class with two methods - enqueue and dequeue. However, instead of following the Stack class approach and having this own a private deque object, this queue is a subclass of the deque class. So this queue object is a deque. One consequence is that this queue object will inherit the methods of the deque class. This can be confirmed with the tab completion in the block following the definition of the Queue class.

import collections ''' This block defines a Queue class as a subclass of the collections.deque class This class inherits all the methods of the deque class and adds the methods enqueue and dequeue. Daniel D. Warner September 8, 2010 ''' class Queue(collections.deque): def enqueue(self,x): self.appendleft(x) def dequeue(self): if 0 < len(self): return self.pop() else: return None 
q = Queue() q 

Priority Queue

A priority queue is a container where each item is inserted with a certain priority.  When items are removed, the item with the highest priority is the item that is removed.  Priority queues can be implemented using a heap.  A heap is a Complete Binary Tree with a partial ordering.  A complete binary tree is a binary tree where the tree is filled out  from top to bottom, left to right.  To be a heap, the complete binary tree must satisfy the following partial ordering criteria.

The priority of the parent must always be equal to or higher than the priority of its children.  

This is often referred to as the heap property.  The heap property is an invariant which must be true before and after pushing or popping an item.

There is a natural embedding of a complete binary tree into an array using the functions


import itertools from heapq import heapify, heappop, heappush class PriorityQueue(): ''' The arguments passed to a PriorityQueue must consist of a priority and an object. It must be possible to compare priorities using the < operator. ''' def __init__(self): self._pq = [] self._counter = itertools.count(1) def push(self, pri, obj): count = next(self._counter) heappush(self._pq,(pri,count,obj)) def pop(self): if 0<len(self._pq): pri,count,obj = heappop(self._pq) else: pri = -1 obj = None return pri,obj def length(self): return len(self._pq)