These exercises are best solved using a single python script file, which I will call cp3_sol.py (but you can name it whatever you wish).

Will will need access to the functions you wrote for the previous computer problems (cp1_sol.py, cp2_sol.py). You can either copy and paste them into your new file or you can use a python import statement to get access to them in your new file.

I suggest you develop your solutions inside this file, and then run the script in an IPython console. This is accomplished by starting an IPython console, from an command prompt, with the command ipython --pylab.

Then, from the IPython console, you can run your script with the command

%run cp3_sol

As before,yYou will need a wave speed model for these problems. These are provided in msri_tools.py and can be accessed by downloading the file to the directory you will work in. From an ipython --pylab console you can access the model function by executing the command from msri_tools import model. You can also put the line

from msri_tools import model

at the top of your python scripts.

Part 3: The Adjoint Condition, Linear Modeling, and Optimization¶

In this exercise you will solve the linear forward modeling equations and verify that the adjoint conditions hold for your code. The definition of the adjoint of a linear operator states that

\[ \left< F\delta m, d \right> _{\mathcal{D}} = \left< \delta m, F^{*}d \right> _{\mathcal{M}}, \]

where \(\left< \cdot, \cdot \right> _{\mathcal{D}}\) and \(\left< \cdot, \cdot \right> _{\mathcal{M}}\) are inner products in the data and model spaces, respectively. One test to see if your migration operator \(F^*\) is working correctly is to test if this relationship holds to high precision for any pair of \(d\) and \(\delta m\).

Finally, you will implement a simple gradient descent algorithm for minimizing the objective function

\[ J(m) = \frac{1}{2}\int_0^T||d_{x_r}(t) - \mathbf{S}\mathcal{F}[m](t)||_2^2 \textrm{d}t, \]

to 'solve' the seismic imaging problem.

Linear Foward Modeling¶

To implement the adjoint test, you will need to solve the linear modeling equations, which are derived in the notes,

\[ \begin{align*} (\frac{1}{c_0(x)^2}\partial_{tt}-\partial_{xx})u_1(x,t) & = -\delta m(x) \partial_{tt}u_0(x,t), \\\\ (\frac{1}{c_0(x)}\partial_t-\partial_x)u_1(0,t) & = 0, \\\\ (\frac{1}{c_0(x)}\partial_t+\partial_x)u_1(1,t) & = 0, \\\\ u_1(x,t) & = 0 \quad\text{for}\quad t \le 0, \end{align*} \]

where \(u_1\) is the Born scattered field and equivalently \(F\delta m = u_1\).

Problem 1: Write a function linear_sources(dm, u0s, config) that takes an arbitrary model perturbation dm and a time sequence of incident wavefields u0s and generates the linear source wavefields (the right-hand-sides of the linear modeling equations). Functions you have previously written should be useful.

Problem 2: Use your leap_frog function to solve for the linear forward wavefield \(u_1\) due to a perturbation \(\delta m\). Use this code to write a function linear_forward_operator(C0, dm, config) which returns a tuple containing the wavefields and the sampled data.

When sampling is accounted for, the adjoint condition requires that,

\[ \left< \mathbf{S}F\delta m, d \right> _{\mathcal{D}} = \left< \delta m, F^{*}\mathbf{S}^*d \right> _{\mathcal{M}}, \]

hold to high precision.

Problem 3: Verify that the adjoint condition is satisfied by your implementation of the linear forward modeling and migration operators. Be careful to take into account the differences in the model and data inner product. Write a function adjoint_condition(C0, config) that implements this test. How accurate is the relationship? It should be accurate to machine precision for random values of the data \(d\) and the model perturbation \(\delta m\). Be sure that you define \(\delta m(0) = \delta m(1) = 0\), as it is nonphysical to have sources on an absorbing boundary.

Consider modifying your construct_matrices function to accept a config key 'bc', which allows you to toggle between absorbing and homogeneous Dirichlet boundaries. This result should still hold.

In []:

# Setup the problem from before
config = dict()

# Configure the wavelet
config['nu0'] = 10 #Hz

# Physical domain parameters
config['x_limits'] = [0.0, 1.0]
config['nx'] = 201
config['dx'] = (config['x_limits'][1] - config['x_limits'][0]) / (config['nx']-1)

# Source position
config['x_s'] = 0.1

# Receiver position
config['x_r'] = 0.15

# Load the model
from msri_tools import model
C, C0 = model(config)

# Set CFL safety constant
config['alpha'] = 1.0/6.0

# Define time step parameters
config['T'] = 3 #seconds
config['dt'] = config['alpha'] * config['dx'] / C.max()
config['nt'] = int(config['T']/config['dt'])

from wave_imaging import adjoint_condition

print "Absorbing BC"
adjoint_condition(C0, config)

print "Dirichlet BC"
config['bc'] = 'dirichlet'
adjoint_condition(C0, config)

Gradient Based Optimization¶

Full waveform inversion is the geophysics terminology for minimizing the objective

\[ J(m) = \frac{1}{2}\int_0^T||d_{x_r}(t) - \mathbf{S}\mathcal{F}[m]||_2^2 \textrm{d}t. \]

The term arises because we will be using the wave equation simulation to match the seismic data to 'invert' or solve the seismic inverse problem.

From the notes, we have that the gradient of the objective at a point \(m_0\) is

\[ \nabla J[m_0] = -F^*(d_{x_r} - \mathbf{S}\mathcal{F}[m_0] ) = -F^*r. \]

The method of gradient (or steepest) descent attempts to solve the optimization problem using the iteration

\[ m_{k+1} = m_k -\beta_k \nabla J[m_k], \]

where \(\beta_k\) is a step parameter, best found via a 'line search,' that is necessary because the gradient provides no notion of scale. (Note: most texts use \(\alpha\) rather than \(\beta\), but I want to avoid confusion with the \(\alpha\) in the CFL condition.) With a sufficiently sized \(\beta_k\), this method can recover the true modem \(m\) approximately.

Problem 4: Write a function gradient_descent(C0, d, k, config) which takes a starting model C0, 'measured' data d, and a number of iterations k as arguments and returns the sequence of k estimates of the true model, as well as the values of the objective function \(J\) at each of those points. How might you select \(\beta_k\)? How does \(m_k\) compare to \(m\)? Why? How many iterations do you need before you start to recover your model? The model recovery is not perfect. Explain the errors.

Warning!! Be very careful to update the model correctly, as \(\delta m \ne c - c_0 = \delta c\)! How does \(\delta m\) relate to \(\delta c\)?

Plot the convergence of your gradient descent method using a log-scaled plot.

A 1D Seismic Imaging Example¶

Resources and Code Setup¶

Part 3: The Adjoint Condition, Linear Modeling, and Optimization¶

Linear Foward Modeling¶

The Adjoint Test¶

Gradient Based Optimization¶