Special Topics in Data Science

Algorithms for Computational Biology

Spring 2021

Instructor: Dan DeBlasio
Email: dfdeblasio _at_ utep.edu
Time: TR 15:00-16:30
Location: Online (Zoom, contact instructor for password)
Office: CCSB 3.1008 (online until further notice: teamsChat.deblasiolab.org)
Office Hours: M 2-3pm, R 1-2pm, or by appointment (calendly.deblasiolab.org)
Syllabus updated: January 18 2021
Source code for the syllabus and all homework assignments is available on GitHub and is licensed under Creative Commons (CC-BY-SA-4.0).

This course will cover the algorithms that make modern computational biology and bioinformatics possible. The plan is to cover both foundational algorithms, such as sequence alignment, and their modern applications in solving problems such as genome assembly. The focus of this course is on how computer scientists apply their knowledge to frame a computational problem inspired by a specific real-world problem, and then to solve that computational problem. In addition to standard algorithm development, the course will cover the influence of convex optimization (mainly integer linear programming) and machine learning on computational biology. The course assumes no previous knowledge of biology or genetics. The course will build on and enhance students’ basic understanding of the principles of algorithm design and analysis by applying those principles in the context of bioinformatics.

The topics discussed are likely to include:

CS 2302 is a prerequisite; please contact the instructor with any questions.

We will use "Algorithms in Bioinformatics" by Wing-Kin Sung[1] as our primary text, supplemented with other literature sources that will be provided.

[1] CRC Press, ISBN 9781420070330 (Hardcover) / 9780367659318 (Paperback)

Date Slides Homework Other
19 January 2021 (W1T) Introduction Slides
Algorithm Refresher
21 January 2021 (W1R) Linear Programming Welcome Survey
26 January 2021 (W2T) Biology Primer
28 January 2021 (W2R) Sequence Similarity
2 February 2021 (W3T) Sequence Similarity Continued Homework 1 Example Homework Solution
4 February 2021 (W3R) Sequence Similarity Continued
9 February 2021 (W4T) Parametric Alignment Alignment exercise (solution)
11 February 2021 (W4R) Parametric Alignment
16 February 2021 (W5T) Suffix Trees Homework 2
18 February 2021 (W5R) Suffix Arrays
23 February 2021 (W6T) BWT & FM-Index
25 February 2021 (W6R) Longest Common Subsequence Homework 2 Deadline Extended 2 March Group Activities
2 March 2021 (W7T) Longest Common Subsequence Homework 3
4 March 2021 (W7R) Multiple Sequence Alignment Extra office hours on 5 March 1:30-2:30pm
No office hours 8 March 2021
Alignment Review
9 March 2021 (W8T) Multiple Sequence Alignment Extra office hours on 10 March 11am-12pm
11 March 2021 (W8R) Midterm Exam Practice Midterm (Solution)
23 March 2021 (W9T) Genome Alignment
25 March 2021 (W9R) Genome Alignment Final Project Info
30 March 2021 (W10T) Database Search
1 April 2021 (W10R) Database Search (Continued) Homework 4 (Due 12 April 2021)
6 April 2021 (W11T) Hashing and Sketching
8 April 2021 (W11R) Hashing and Sketching (continued)
12 April 2021 (W12T) Phylogenetics Homework 2 Solution Posted
20 April 2021 (W13T) Read Alignment (Reference-based Alignment) Homework 5 Thursday Office Hours: 12:30p-1:30p
22 April 2021 (W13R) RNA-Seq
27 April 2021 (W14T) Guest "Lecture" on de novo assembly
Feel free to watch the videos in a group on Zoom
29 April 2021 (W14R) de novo Assembly Final Project Due on May 11! (Hard deadline)
4 May 2021 (W15T) Alignment Free Genomics
  • Outstanding HW due ASAP
  • Please fill out the course evals (particularly written feedback)
6 May 2021 (W15R) Review
