Syllabus

Computational Molecular Biology, also known as Bioinformatics, applies computational methods to molecular biology. Computation has become essential for biological and bio-medical research to deal with the ever-growing amount of biological data and complexity of biological systems. The class focuses on structural bioinformatics, which refers to the analysis and prediction of the three-dimensional structure of biological macromolecules such as DNA, RNA, and proteins.

Concerning RNA structure, we will discuss fundamental and advanced techniques for its prediction and analysis. As an introduction to advanced comparative approaches, the class will cover pairwise and multiple sequence alignment. Concerning protein structure, we will study de-novo prediction, homology-based modeling, and ab-initio prediction in lattice models. Going beyond traditional energy minimization, we will look at techniques to predict folding pathways and the kinetics of the folding process.

The class is aimed at graduate students and senior and junior undergraduates interested in applying mathematical and computational methods to structural biology, as well as at biologists and biochemists interested in the algorithmic foundations of the approaches used in this field. The only prerequisite is a basic understanding of algorithms; the necessary biological background will be provided. The evaluation will be based on a final project.


Image produced with RNAmovies [J. Waldispuehl]


Topics

  • Pairwise and Multiple Alignment
  • RNA Secondary Structure Prediction
  • Comparative RNA Structure Prediction
  • RNA Equilibrium Ensembles
  • Shape Abstraction of RNA Structure
  • RNA Pseudoknot Prediction
  • RNA-RNA Interaction
  • RNA 3D Structure Modeling
  • Stochastic Context-Free Grammars
  • De-novo Prediction of Structural RNA
  • De-novo Protein Structure Prediction
  • Protein Threading
  • Protein-Protein Interaction
  • 3D Lattice Protein Models
  • Predicting Protein Folding-Pathways
  • Modeling of Folding as Markov Process
  • Energy Landscapes
  • Simulation and Exact Folding Kinetics

Course information

Instructor: Sebastian Will

Lectures: TR (each Tuesday and Thursday), 9:30am - 11:00am in 8-205.

Office hours: By appointment.


Final Project

The final project includes a report and a talk at the end of the term. It can consist of studying a paper or topic in depth beyond what is covered in class, implementing or extending an algorithm, or proving theoretical results. In general, students can freely choose a topic.

The length of the report is expected to range between 2 and 4 pages. The document should contain the following sections:

  • Title and abstract (max. 10 lines).
  • Introduction, including a brief statement of the motivation and a description of the structure of the document.
  • Methods (describe the algorithms).
  • Results (present the results of experiments).
  • Discussion (your conclusions and opinion).

Talks will be 20 minutes long, followed by 10 minutes of open discussion.


Schedule and Material

R  Sep-08-2011  Introduction, Molecular Biology Primer [Slides]
T  Sep-13-2011  Sequence alignment [Slides]
R  Sep-15-2011  Multiple sequence alignment (see slides above)
T  Sep-20-2011  Multiple sequence alignment (see slides above) and Base pair maximization [Slides]
R  Sep-22-2011  RNA loop-based energy and free energy minimization [Slides]
T  Sep-27-2011  Efficient Energy Minimization / Zuker Algorithm (for slides see above)
R  Sep-29-2011  Boltzmann Distribution, Structure Ensembles, and Partition Functions [Slides]
T  Oct-04-2011  Efficient Partition Function / McCaskill Algorithm [Slides]
R  Oct-06-2011  Efficient Base Pair Probabilities [Slides] and Comparative Analysis of RNA
R  Oct-13-2011  Comparative Analysis of RNA I: RNAalifold and Tree Alignment [Slides]
T  Oct-18-2011  Comparative Analysis of RNA II: General Edit Distance Algorithm [Slides]
R  Oct-20-2011  Comparative Analysis of RNA III: MAX-SNP-hardness of GED, 'Plan B': Sankoff Algorithm [Slides]
T  Oct-25-2011  Comparative Analysis of RNA IV: Simultaneous Alignment and Folding, continued [Slides]
R  Oct-27-2011  Guest Lecture (Stefan Washietl): De-novo Prediction of Non-coding RNA [Slides]
T  Nov-01-2011  RNA Comparison and Special Topics [Slides]
R  Nov-03-2011  RNA Pseudoknots [Slides]
T  Nov-08-2011  RNA-RNA Interaction [Slides]
R  Nov-10-2011  RNA 3D Structure Prediction [Slides]
T  Nov-15-2011  Protein Structure Prediction
R  Nov-17-2011  Protein Structure Prediction II [Slides]
T  Nov-22-2011  HP Protein Structure Prediction [Slides]
T  Nov-29-2011  Protein Folding Pathways by Probabilistic Roadmapping [Slides]
R  Dec-01-2011  Kinetics: Energy Landscapes
T  Dec-06-2011  Kinetics: Modeling and Solving the Folding Process [Slides]
R  Dec-08-2011  Final Project Talks
T  Dec-13-2011  Final Project Talks

Text Books

Peter Clote and Rolf Backofen. Computational Molecular Biology: An Introduction.
John Wiley & Sons Inc. ISBN: 9780471872511.

Richard Durbin, Sean R. Eddy, Anders Krogh, and Graeme Mitchison. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids.
Cambridge University Press. ISBN: 9780521629713.


Credits

The web-page template and parts of the primer are based on material by Jerome Waldispühl, Dominic Rose, Mathias Möhl, and Rolf Backofen. The RNA slides are partially based on course material by Rolf Backofen. The protein structure prediction slides are partially based on slides by Jerome Waldispühl and Jinbo Xu.