ELE 4120 Bioinformatics Tutorial 1

ELE 4120 Bioinformatics Tutorial 1

-

English
21 Pages
Read
Download
Downloading requires you to have access to the YouScribe library
Learn all about the services we offer

Description

ELE 4120 BioinformaticsTutorial 4 Content• Dynamic Programming– Global Alignment (Needleman and WunschAlgorithm)– Semiglobal Alignment– Local Alignment (Smith-Waterman Algorithm)Sequence Alignment• When the scoring method is defined, the best alignment can be found1. Exhaustive search – calculate the score of all possible alignments. This method is not feasible for modest length sequence, because there are huge number of possibilities.2. Dynamic Programming - an optimization method that can be applied to sequence alignment problem Dynamic Programming (1)• For example, we want to align the following 2 sequences: A C T CA T C• There are 3 possibilities for the first position in the alignment– 1. Add gap in the 1st sequence– 2. Add gap in the 2nd sequence– 3. not place gap in both sequences1st position Score Remaining Sequence---1 ACTCATCA-1 CTC--ATCA1 CTCATCDynamic Programming (2)• If we know the best 1st Score Remaining Sequencealignment of the remaining positionsequences, we can know ---1 ACTCthe best alignmentATC• Therefore, the problem A-1 CTCbecomes:--ATCA1 CTCAWhat is the best TCalignment of these sequences?Dynamic programming compute the optimal alignment by filling in a table of partial sequence alignment score in order to avoid recomputing. Global Alignment (1)• Example: Compute the best alignment for the following two sequences: (match score = 1, mismatch score = 0, gap penalty = -1)A C A C TA ...

Subjects

Informations

Published by
Reads 16
Language English
Report a problem
ELE 4120 Bioinformatics
 
Tutorial 4
Content
Dynamic Programming
 Global Alignment (Needleman and Wunsch Algorithm)
 Semiglobal Alignment
 Local Alignment (Smith-Waterman Algorithm)
Sequence Alignment
When the scoring method is defined, the best alignment can be found 1. Exhaustive search  calculate the score of all possible alignments. This method is not feasible for modest length sequence, because there are huge number of possibilities. 2. Dynamic Programming - an optimization method that can be applied to sequence alignment problem
Dynamic Programming (1)
For example, we want to align the following 2 sequences:
A C T C A T C There are 3 possibilities for the first position in the alignment  1. Add gap in the 1st sequence  2. Add gap in the 2nd sequence  3. not place gap in both sequences 1st position Score ---1 A A -1 --A 1 A
Remaining Sequence ACTC TC CTC ATC CTC TC
Dynamic Programming (2)
1st position --A A --A A
Score Remaining Sequence
-1 -1 1
ACTC TC CTC ATC CTC TC
 If we know the best alignment of the remaining sequences, we can know the best alignment  Therefore, the problem becomes:
What is the best alignment of these sequences?
Dynamic programming compute the optimal alignment by filling in a table of partial sequence alignment score in order to avoid recomputing.
Global Alignment (1)
Example: Compute the best alignment for the following two sequences: (match score = 1, mismatch score = 0, gap penalty = -1)
A C A C T A C T
Step 1: Create a table similar to the following A C A C T
Sequence 2
A
C
T
0 -1 -2 -3 -4 -5 -1
-1
-2
-3
Sequence 1
Gap penalty
Global Alignment (2)
Step 2:
Start from this position
C
T
A C
0 -1 -2 -
-2
-3
 There are 3 kinds of movement:
A
-3
1. Vertical move: add gap in top sequence
C
-4
2. Horizontal move: add gap in left sequence
T
-5
3. Diagonal move: align nucleotides from each sequence
Global Alignment (3)  Vertical move: A C A C T Upper value + gap penalty 0 - -2 -3 -4 -5  i.e. (-1) -1 = -2 A -1  Horizontal move: Left value + gap penalty C -2  i.e. (-1) -1 = -2 T -3  Diagonal move: Top left value + match / mismatch score for nucleotides at the axes  i.e. the first nucleotides of both A C A C T sequences are identical, so match score =1 0 -1 -2 -3 -4 -5 Î 0 + 1 = 1 A -1 1  Fill in the maximum value of the three C -2 in the cell T -3  i.e. Max = 1 (diagonal move)
Global Alignment (4)
Step 3: Repeat step 2 along row 2
A
C
T
0
-1
A
-1
1
C
-2
0
A
-3
-1
C
-4
-2
T
-5
-3
Global Alignment (5)
Step 4: Fill in the table row by row
A
C
T
0
-1
-2
-3
A
-1
1
0
-1
C
-2
0
2
1
A
-3
-1
1
2
C
-4
-2
0
1
T
-5
-3
-1
1
-2
-1
0
-3
Global Alignment (6) Step 5: Reconstruct the optimal alignment from the lower rightmost entry. Create the path by moving to position that could legally produce the score Value of the lower rightmost entry is the score of the optimal alignment In this example, there are two possible paths. One of them is reconstructed as following A
-5
-4
-3
T
C
A
C
-1
-3
T
C
-2
0
2
1
0
-1
A
-1
1
0
-1
-2
 llagimnnet
1
1
1
2
thf  orematiope ocS