Introduction to Scientific Python

MSRI: Introduction to the Mathematics of Seismic Imaging

Russell J. Hewett

Department of Mathematics, Massachusetts Institute of Technology

(a quick note on Python 2.7 vs Python 3.x)

Python + Scientific Computing = Numerical Python

Python Interpreters

Python Syntax

Comments

In [1]:
# Comments denoted with hash symbol

# Multi-line comments require
# repeated use of hash symbol

Primitive Data Types: Basics

In [2]:
# Integers
my_int = 7

# floating point: careful with division
my_float = 6.0

# Boolean
my_bool = True # or False
In [3]:
# conventionally used for single words and characters
short_string = 'single-quote'

# conventionally used for longer strings
long_string = "double quotes are often used for longer strings" 

really_long_string = """or triple quotes for long multiline strings,
such as documentation strings or other situations
where you want a string to run on and on for more
than one line"""

raw_string = r"ignores escaped characters like \n"

unicode_string = u"for non-ascii and foreign langauge characters"

Primitive Data Types: Tuples

In [4]:
# Tuples:
#    IMMUTALBE collection of arbitrary items
#    creatied with `comma`
t = (1, 3.0, 'str')

t
Out[4]:
(1, 3.0, 'str')
In [5]:
# Indexed with brackets
t[0]
Out[5]:
1
In [6]:
# Elements can't be changed
t[0] = 4
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-4c4be606a663> in <module>()
      1 # Elements can't be changed
----> 2 t[0] = 4

TypeError: 'tuple' object does not support item assignment

Primitive Data Types: Lists

In [7]:
# Lists:
#    MUTABLE collection of arbitrary items
#    created with brackets []
#    indexed with brackets
mylist = [1, 3.0, 'str']

mylist
Out[7]:
[1, 3.0, 'str']
In [8]:
# Or created with list()
mylist2 = list()
mylist2.append(1)
mylist2.append(3.0)
mylist2.append('str')

mylist2
Out[8]:
[1, 3.0, 'str']
In [9]:
mylist[0] = 7
mylist
Out[9]:
[7, 3.0, 'str']

Primitive Data Types: Dictionaries

In [10]:
# Dictionaries:
#    MUTABLE collection of arbitrary key-item pairs
#    created with braces {}
#    indexed with brackets
mydict = {'key1' : 'item 1', 2 : 'b', False : 'false string'}

mydict
Out[10]:
{False: 'false string', 2: 'b', 'key1': 'item 1'}
In [11]:
# Indexed by key with brackets
mydict['key1'], mydict[False]
Out[11]:
('item 1', 'false string')
In [12]:
# Also created with dict()
mydict2 = dict(a=1, b=2)

# MUTABLE: Items added on the fly
mydict['key4'] = mydict2

mydict
Out[12]:
{False: 'false string', 2: 'b', 'key1': 'item 1', 'key4': {'a': 1, 'b': 2}}

Output to the Screen

In [13]:
# From python or ipython
# type variable name and press return
mylist
Out[13]:
[7, 3.0, 'str']
In [14]:
# Inside scripts
# use the `print` command
print 'Print strings and variables with the `print` "function"\n'

print my_int
print long_string
print mylist
print my_int, my_float, my_bool
Print strings and variables with the `print` "function"

7
double quotes are often used for longer strings
[7, 3.0, 'str']
7 6.0 True

Loading Other Python 'modules': The import statement

In [15]:
# basic import
import numpy
print numpy.sin

# renamed import
import numpy as np
print np.sin

# partial import
from numpy import sin
print sin
<ufunc 'sin'>
<ufunc 'sin'>
<ufunc 'sin'>

Code Blocks and Indentation

Python uses white space to delineate code blocks

!!! Convention is to use four spaces for indented blocks !!!

Flow Control: If-Then-Else

In []:
# The colon signifies the start of a new block
if condition:
    print "Doing first thing"
elif other_condition:
    print "Doing the second thing"
else:
    print "Falling back to the last thing"
In []:
# If statements can be nested
if condition1:
    if condition2:
        print "Nested first thing"
    else:
        print "Nested fall back"

Flow Control: If-Then-Else

In [16]:
# Conditionals
print True and False
print True or False
print True == False
print True != False
print True is False
print True is not False
print 'a' in ['a', 'b', 'c']
False
True
False
True
False
True
True

Flow Control: Looping

In []:
# Python loop syntax is
for item in collection:
    print "Doing task for ", item

Flow Control: Looping

In [17]:
# iterating over numbers
for i in xrange(len(mylist)):
    print mylist[i]
    
print '\n'  
# easier!
for item in mylist:
    print item
    
print '\n'  
# even dictionaries
for key in mydict:
    print key, '-->', mydict[key]
7
3.0
str


7
3.0
str


False --> false string
key1 --> item 1
2 --> b
key4 --> {'a': 1, 'b': 2}

Functions

In [18]:
# Created using the `def` keyword
# More than one function per file! (unlike MATLAB)
def my_function(arg1, arg2, kwarg1=0.0, kwarg2=None):
    
    if kwarg2 is None:
        print arg1 + arg2 * kwarg1
    else:
        print kwarg2
In [19]:
my_function(1, 2, kwarg1=4)
9
In [20]:
my_function(1,2,kwarg1=4, kwarg2="no math!")
no math!

(Some) Advanced Python Syntax

In [21]:
# List comprehensions for fast list creation
slow_list = list()
for i in xrange(5):
    slow_list.append(i**2)
    
fast_list = [i**2 for i in xrange(5)]
In [22]:
# Collecting elements in different lists: `zip`
numbers = [1,2,3]
letters = ['a', 'b', 'c']
pairs = zip(letters, numbers)

print pairs
[('a', 1), ('b', 2), ('c', 3)]
In [23]:
# Unzipping: use `explode` operator *
print zip(*pairs)
[('a', 'b', 'c'), (1, 2, 3)]

Python Standard Library

In [24]:
import os # OS system calls
import os.path # file/directory functions

import sys # other system routines

import time # date-time classes and timing functions

import math # basic math library
# Contains log, exp, sin, cos, pi, etc

IPython Magics

In []:
# my_script.py
def squares(n):
    m = list()
    for i in xrange(n):
        m.append(i**2)        
    return m

if __name__ == '__main__':
    print "Running squares"
    squares_list = squares(7)
    print max(squares_list)
In [25]:
# run %magic
%run my_script.py

squares_list
Running squares
36
Out[25]:
[0, 1, 4, 9, 16, 25, 36]

IPython Magics

In []:
# paste magic
%paste
In [26]:
# timeit magic
%timeit squares(50)
100000 loops, best of 3: 7.17 Ás per loop

NumPy Syntax: Arrays, Matrices, and BLAS

Array Creation (Not Matrices!)

In [27]:
np.zeros((3,2))
Out[27]:
array([[ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.]])
In [28]:
np.ones((2,4))
Out[28]:
array([[ 1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.]])
In [29]:
np.array(squares_list)
Out[29]:
array([ 0,  1,  4,  9, 16, 25, 36])

Array Creation (Not Matrices!)

In [30]:
np.arange(4,16,2)
Out[30]:
array([ 4,  6,  8, 10, 12, 14])
In [31]:
np.linspace(0,1,10)
Out[31]:
array([ 0.        ,  0.11111111,  0.22222222,  0.33333333,  0.44444444,
        0.55555556,  0.66666667,  0.77777778,  0.88888889,  1.        ])
In [32]:
np.linspace(0,1,10,endpoint=False)
Out[32]:
array([ 0. ,  0.1,  0.2,  0.3,  0.4,  0.5,  0.6,  0.7,  0.8,  0.9])

Array Properties and Manipulation

In [33]:
A = np.arange(9)
print A
print A.shape
print A.ndim
[0 1 2 3 4 5 6 7 8]
(9L,)
1
In [34]:
B = A.reshape(3,3)
print B
print B.shape
print B.ndim
[[0 1 2]
 [3 4 5]
 [6 7 8]]
(3L, 3L)
2

Arrays and Views

In [35]:
print B.flags
print id(A), id(B.base)
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
111267008 111267008
In [36]:
# Python is zero-indexed!
B[1,1] = 100
print B
[[  0   1   2]
 [  3 100   5]
 [  6   7   8]]
In [37]:
print A
[  0   1   2   3 100   5   6   7   8]

Array Slicing

In [38]:
C = np.linspace(0,1,10,False)
print C
[ 0.   0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9]
In [39]:
# Slices start:stop:step
C[3:6]
Out[39]:
array([ 0.3,  0.4,  0.5])
In [40]:
# Slices start:stop:step
C[0:10:2]
Out[40]:
array([ 0. ,  0.2,  0.4,  0.6,  0.8])
In [41]:
# Slices start:stop:step
B[0:2, 1:3]
Out[41]:
array([[  1,   2],
       [100,   5]])

Array Arithmetic

In [42]:
# Scalar arithmetic
C[3:6] + 1 # also -, *, /
Out[42]:
array([ 1.3,  1.4,  1.5])
In [43]:
# Element-wise arithmetic
C[3:6] + np.ones(3) # also -, *, /
Out[43]:
array([ 1.3,  1.4,  1.5])

Array Arithmetic

In [44]:
# * is element-wise product
D = np.random.rand(10)
C*D
Out[44]:
array([ 0.        ,  0.08613175,  0.03537001,  0.05448855,  0.39024725,
        0.39223328,  0.05944392,  0.50940686,  0.05480244,  0.56293597])
In [45]:
# Scalar product/ matvec for ARRAYS
np.dot(C,D)
Out[45]:
2.1450600211528341

Array Arithmetic

In [46]:
# Array-array products with * are broadcast operations
B*C[3:6]
Out[46]:
array([[  0. ,   0.4,   1. ],
       [  0.9,  40. ,   2.5],
       [  1.8,   2.8,   4. ]])
In [47]:
# Matvec with arrays
np.dot(B,C[3:6])
Out[47]:
array([  1.4,  43.4,   8.6])
In [48]:
# Matvec with np.matrix object
B_mat = np.matrix(B)
C_vec = C[3:6].reshape(3,1)
B_mat*C_vec
Out[48]:
matrix([[  1.4],
        [ 43.4],
        [  8.6]])

Array Transpose and Complex Arrays

In [49]:
print B, '\n\n', B.T
[[  0   1   2]
 [  3 100   5]
 [  6   7   8]] 

[[  0   3   6]
 [  1 100   7]
 [  2   5   8]]
In [50]:
E = ones((3,1)) + 1j*np.random.rand(3,1)
print E
print E.real
print E.imag
[[ 1.+0.25383453j]
 [ 1.+0.6129005j ]
 [ 1.+0.25406639j]]
[[ 1.]
 [ 1.]
 [ 1.]]
[[ 0.25383453]
 [ 0.6129005 ]
 [ 0.25406639]]

NumPy Misc.

In [51]:
np.pi
Out[51]:
3.141592653589793
In [52]:
np.sin(C)
Out[52]:
array([ 0.        ,  0.09983342,  0.19866933,  0.29552021,  0.38941834,
        0.47942554,  0.56464247,  0.64421769,  0.71735609,  0.78332691])
In [53]:
np.exp(C)
Out[53]:
array([ 1.        ,  1.10517092,  1.22140276,  1.34985881,  1.4918247 ,
        1.64872127,  1.8221188 ,  2.01375271,  2.22554093,  2.45960311])

SciPy: Scientific Python

Linear Algebra with SciPy

In [54]:
import scipy as sp
import scipy.linalg #avoid numpy.linalg!!!
import scipy.linalg as spla 
In [55]:
# Matrix and vector norms
print E
scipy.linalg.norm(E)
[[ 1.+0.25383453j]
 [ 1.+0.6129005j ]
 [ 1.+0.25406639j]]
Out[55]:
1.8720653639442719

Solving Linear Systems

In [56]:
# Invert a matrix
scipy.linalg.inv(B) # pinv
Out[56]:
array([[-0.6640625 , -0.00520833,  0.16927083],
       [-0.00520833,  0.01041667, -0.00520833],
       [ 0.50260417, -0.00520833,  0.00260417]])
In [57]:
# Solve a linear system
scipy.linalg.solve(B, C[3:6]) #lstsq
Out[57]:
array([-0.11666667,  0.        ,  0.15      ])

Solving Linear Systems: LU Factorizations

In [58]:
# Compute A = PLU
scipy.linalg.lu(B)
Out[58]:
(array([[ 0.,  0.,  1.],
       [ 0.,  1.,  0.],
       [ 1.,  0.,  0.]]),
 array([[ 1.        ,  0.        ,  0.        ],
       [ 0.5       ,  1.        ,  0.        ],
       [ 0.        ,  0.01036269,  1.        ]]),
 array([[  6.        ,   7.        ,   8.        ],
       [  0.        ,  96.5       ,   1.        ],
       [  0.        ,   0.        ,   1.98963731]]))
In [59]:
# To use, use
factors = scipy.linalg.lu_factor(B)
scipy.linalg.lu_solve(factors, C[3:6])
Out[59]:
array([-0.11666667,  0.        ,  0.15      ])

Eigenvalues and SVD

In [60]:
# Eigenvalues
scipy.linalg.eig(B)
Out[60]:
(array([  -1.29254422+0.j,    8.87565261+0.j,  100.41689161+0.j]),
 array([[-0.8394214 ,  0.21292691, -0.01144772],
       [-0.00196583, -0.06051885, -0.99702214],
       [ 0.54347755,  0.97519208, -0.0762614 ]]))
In [61]:
# Singluar Value Decomposition
scipy.linalg.svd(B)
Out[61]:
(array([[-0.01104514,  0.1601406 , -0.98703242],
       [-0.99704458, -0.07681403, -0.00130548],
       [-0.076027  ,  0.98410091,  0.16051574]]),
 array([ 100.46401272,    9.67352178,    1.18537932]),
 array([[-0.03431374, -0.99784679, -0.05589593],
       [ 0.58656645, -0.06539047,  0.80725701],
       [ 0.80917388,  0.00508667, -0.58754724]]))

SciPy: Beyond Linear Algebra

matplotlib: 1D, 2D, and (some) 3D Plotting

Basic 1D Plotting

In [62]:
import matplotlib.pyplot as plt # Basic plotting tools

ts = linspace(0,2*pi,1000)

plt.figure()
plt.plot(ts,sin(ts))
Out[62]:
[<matplotlib.lines.Line2D at 0x6c5d978>]
In [63]:
ts = linspace(0,2*pi,1000)

plt.figure()
plt.plot(ts,sin(ts))
plt.plot(ts,sin(4*ts))
Out[63]:
[<matplotlib.lines.Line2D at 0x6c42390>]
In [64]:
ts = linspace(0,2*pi,1000)

plt.figure()
plt.plot(ts,sin(ts), label='lambda=2pi')
plt.plot(ts,sin(4*ts), label='lambda=0.5pi')
plt.legend()
Out[64]:
<matplotlib.legend.Legend at 0x700b128>
In [65]:
ts = linspace(0,2*pi,1000)

plt.figure()
plt.plot(ts,sin(ts), label=r'$\lambda=2\pi$', linestyle='-.')
plt.plot(ts,sin(4*ts), label=r'$\lambda=\frac{\pi}{2}$')
plt.legend()
Out[65]:
<matplotlib.legend.Legend at 0x72dd748>
In [66]:
ts = linspace(0,2*pi,1000)

plt.figure()
plt.plot(ts,sin(ts), label=r'$\lambda=2\pi$', linestyle='-.')
plt.plot(ts,sin(4*ts), label=r'$\lambda=\frac{\pi}{2}$')
plt.legend()

plt.title('Some Sine Functions')
plt.xlabel(r'$t$')
plt.ylabel('Amplitude')
Out[66]:
<matplotlib.text.Text at 0x77ebd68>
In [71]:
ts = linspace(0,2*pi,1000)

plt.figure()
plt.plot(ts,sin(ts), label=r'$\lambda=2\pi$', linestyle='-.', linewidth=2)
plt.plot(ts,sin(4*ts), label=r'$\lambda=\frac{\pi}{2}$', linewidth=2)
plt.legend(fontsize=18)

plt.title('Some Sine Functions', fontsize=22)
plt.xlabel(r'$t$', fontsize=18)
plt.ylabel('Amplitude', fontsize=18)
Out[71]:
<matplotlib.text.Text at 0x8fd7eb8>

Plotting 2D Data

In [68]:
data = np.random.rand(10,10)

plt.figure()
plt.imshow(data)
Out[68]:
<matplotlib.image.AxesImage at 0x7608eb8>
In [69]:
plt.figure()
plt.imshow(data, interpolation='nearest')
Out[69]:
<matplotlib.image.AxesImage at 0x8d3ec50>
In [70]:
plt.figure()
plt.imshow(data, interpolation='nearest')
plt.colorbar()
Out[70]:
<matplotlib.colorbar.Colorbar instance at 0x0000000008FBD748>

Check them out: matplotlib Gallery

Integral Demo

Arrow Demo

Patch Collection

Polar Bar

3D Contours

Resources