String alignment
instructor: Ross A. Lippert
http://www-math.mit.edu/~lippert/18.417/
For class:
- modification to the schedule
- we have a grader/TA
|   | g | a | c | t | t | a | a | g | a | c | t | t | t | a | t |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| a |   | * |   |   |   | * | * |   | * |   |   |   |   | * |   |
| c |   |   | * |   |   |   |   |   |   | * |   |   |   |   |   |
| c |   |   | * |   |   |   |   |   |   | * |   |   |   |   |   |
| a |   | * |   |   |   | * | * |   | * |   |   |   |   | * |   |
| a |   | * |   |   |   | * | * |   | * |   |   |   |   | * |   |
| g | * |   |   |   |   |   |   | * |   |   |   |   |   |   |   |
| g | * |   |   |   |   |   |   | * |   |   |   |   |   |   |   |
| g | * |   |   |   |   |   |   | * |   |   |   |   |   |   |   |
| t |   |   |   | * | * |   |   |   |   |   | * | * | * |   | * |
| t |   |   |   | * | * |   |   |   |   |   | * | * | * |   | * |
Making change

Noticing a recurrence
A solution to the change problem can be expressed in terms of simpler solutions.
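One way to write such a recurrence, as a sketch only: here MinCoins(M) is an assumed name for the fewest coins needed to make amount M, and c_1, ..., c_d are assumed names for the d denominations.

```latex
\[
\mathrm{MinCoins}(M) =
\begin{cases}
0 & \text{if } M = 0,\\[4pt]
\min\limits_{k \,:\, c_k \le M} \bigl\{ \mathrm{MinCoins}(M - c_k) + 1 \bigr\} & \text{if } M > 0.
\end{cases}
\]
```

The minimum over an empty set is taken to be infinity, which marks amounts that cannot be made at all.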

Making Change Recursively
In pseudocode
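The recursion can be sketched in Python as follows. This is an illustrative version, not necessarily the pseudocode from the slide; the function name `recursive_change` and the use of `math.inf` for "impossible" are assumptions.

```python
import math


def recursive_change(amount, coins):
    """Fewest coins from `coins` that sum to `amount` (math.inf if impossible)."""
    if amount == 0:
        return 0  # nothing left to pay
    best = math.inf
    for c in coins:
        if c <= amount:
            # Spend one coin of value c, then solve the smaller problem.
            best = min(best, recursive_change(amount - c, coins) + 1)
    return best


print(recursive_change(10, (1, 3, 7)))  # 7 + 3, so 2 coins
```

Keep the amount small when trying this: the same sub-amounts are recomputed many times, so the running time grows very quickly.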

Just one problem: still O(M^d) !!!
A look at the Change search tree:
For M = 77 with denominations (1, 3, 7)

Solution: store intermediate results
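A minimal sketch of storing intermediate results, filling a table from amount 0 upward so that each sub-amount is solved once. The name `dp_change` and the table layout are assumptions made for illustration.

```python
import math


def dp_change(amount, coins):
    """Fewest coins from `coins` that sum to `amount` (math.inf if impossible)."""
    # best[m] holds the answer for amount m; best[0] = 0 is the base case.
    best = [0] + [math.inf] * amount
    for m in range(1, amount + 1):
        for c in coins:
            if c <= m and best[m - c] + 1 < best[m]:
                best[m] = best[m - c] + 1
    return best[amount]


print(dp_change(77, (1, 3, 7)))  # eleven 7s, so 11 coins
```

The two nested loops do about M * d steps for amount M and d denominations, which is why the large example is now instant.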


This is dynamic programming
Files of strings (FASTA format)
FASTA (pronounced "fast-A") format is ubiquitous
>sequence_id comments
atgc...
atgc...
...
>sequence_id comments
atgc...
atgc...
...
...
The lines beginning with > are called deflines (definition lines) and are sometimes used to store all sorts of extra information.
Some programs impose or expect a maximum line length for the sequence lines.
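A small reader for this layout might look like the sketch below. The name `parse_fasta` and the toy records are made up for illustration; the sketch assumes each record is a line starting with > followed by one or more sequence lines, and it simply joins the sequence lines together.

```python
def parse_fasta(lines):
    """Yield (defline, sequence) pairs from an iterable of text lines."""
    defline, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if defline is not None:
                yield defline, "".join(chunks)  # finish the previous record
            defline, chunks = line[1:], []
        elif defline is not None:
            chunks.append(line)  # sequence may span several lines
    if defline is not None:
        yield defline, "".join(chunks)


example = """>seq1 first toy record
atgc
atgc
>seq2 second toy record
ggta
""".splitlines()

records = list(parse_fasta(example))
for name, seq in records:
    print(name, len(seq))
```

Because the sequence lines are concatenated, the reader does not care what line length the writing program used.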
Alignments
An example alignment of ATCTGATG and TGCATAC


Global Alignment by Recurrence
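A sketch of a global alignment recurrence with traceback, in the style usually attributed to Needleman and Wunsch. The scoring values (match +1, mismatch -1, gap -1) and the name `global_align` are assumptions chosen for illustration, not values taken from the slides.

```python
def global_align(s, t, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_s, aligned_t) for a best global alignment."""
    n, m = len(s), len(t)
    # score[i][j] = best score for aligning s[:i] with t[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if s[i - 1] == t[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align s[i-1] with t[j-1]
                              score[i - 1][j] + gap,      # s[i-1] against a gap
                              score[i][j - 1] + gap)      # t[j-1] against a gap
    # Trace back from the corner to recover one optimal alignment.
    a, b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if s[i - 1] == t[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                a.append(s[i - 1]); b.append(t[j - 1])
                i -= 1; j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            a.append(s[i - 1]); b.append("-")
            i -= 1
        else:
            a.append("-"); b.append(t[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(a)), "".join(reversed(b))


total, top, bottom = global_align("ATCTGATG", "TGCATAC")
print(total)
print(top)
print(bottom)
```

Ties between the three moves are broken in a fixed order here, so other equally good alignments may exist.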

I found a cute demo at this site
Variation 1: Longest common subsequence
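The longest common subsequence can be computed with the same kind of table, counting matches only. The sketch below assumes the name `lcs` and returns one longest common subsequence (there may be several of the same length).

```python
def lcs(s, t):
    """Return one longest common subsequence of s and t."""
    n, m = len(s), len(t)
    # length[i][j] = LCS length of s[:i] and t[:j]
    length = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if s[i - 1] == t[j - 1]:
                length[i][j] = length[i - 1][j - 1] + 1
            else:
                length[i][j] = max(length[i - 1][j], length[i][j - 1])
    # Trace back to collect the matched characters.
    out = []
    i, j = n, m
    while i > 0 and j > 0:
        if s[i - 1] == t[j - 1]:
            out.append(s[i - 1])
            i -= 1; j -= 1
        elif length[i - 1][j] >= length[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


common = lcs("ATCTGATG", "TGCATAC")
print(common, len(common))  # the length is 4
```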


Variation 2: Sparse alignments

The problem:
- Input: (x_1, y_1, Score_1), (x_2, y_2, Score_2), ...: regions of high similarity.
- Output: a subset, ordered so that x_i < x_{i+1} and y_i < y_{i+1}, with maximum score
Score = [sum of selected Scores] - [sum of Gaps between consecutive selected regions]
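A quadratic-time sketch of this chaining problem follows. The slides do not define the gap penalty, so the function `diagonal_gap` below (the difference in diagonal offset between two regions) is purely an assumption, as are the names `chain` and the toy input.

```python
def diagonal_gap(p, q):
    """Assumed gap penalty: how far q lies off the diagonal through p."""
    return abs((q[0] - p[0]) - (q[1] - p[1]))


def chain(regions, gap=diagonal_gap):
    """Return (best_score, chosen_regions) for (x, y, score) triples."""
    regs = sorted(regions)  # by x, so every valid predecessor comes earlier
    n = len(regs)
    if n == 0:
        return 0, []
    best = [r[2] for r in regs]  # best[i] = best chain score ending at region i
    prev = [None] * n
    for i in range(n):
        xi, yi, si = regs[i]
        for j in range(i):
            xj, yj, _ = regs[j]
            if xj < xi and yj < yi:  # j can precede i in a chain
                cand = best[j] - gap(regs[j], regs[i]) + si
                if cand > best[i]:
                    best[i], prev[i] = cand, j
    end = max(range(n), key=best.__getitem__)
    path, k = [], end
    while k is not None:
        path.append(regs[k])
        k = prev[k]
    return best[end], path[::-1]


toy = [(1, 1, 5), (3, 2, 4), (2, 5, 6), (6, 6, 3)]
print(chain(toy))  # (10, [(1, 1, 5), (3, 2, 4), (6, 6, 3)])
```

In the toy input the chosen regions score 5 + 4 + 3 = 12 and the two assumed gap penalties are 1 each, giving 10. Passing a different `gap` function changes which chain wins.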