Chapter 6: Q28E (page 198)

Local sequence alignment. Often two DNA sequences are significantly different, but contain regions that are very similar and are highly conserved. Design an algorithm that takes as input two strings $x[1 \ldots n]$ and $y[1 \ldots m]$ and a scoring matrix $δ$ (as defined in Exercise 6.26), and outputs substrings $x'$ and $y'$ of $x$ and $y$, respectively, that have the highest-scoring alignment over all pairs of such substrings. Your algorithm should take time $O(mn)$.

Short Answer

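The highest-scoring pair of substrings can be found by dynamic programming over an $(n + 1) \times (m + 1)$ scoring matrix, and the complexity of the program is $O(mn)$.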

Step by step solution

Local Sequence Alignment

The total score of an alignment is the sum of the similarity scores of its aligned character pairs, minus the penalties for its gaps (insertions and deletions).
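As a small illustration of how such a score is added up, the sketch below scores one fixed alignment. It rests on assumptions that are not in the original solution: gaps are written as `-`, every gap position costs the same amount `gap` (a linear penalty), and `simple_delta` is a hypothetical scoring function with +3 for a match and -3 for a mismatch.

```python
def alignment_score(row_x, row_y, delta, gap):
    """Score two equal-length aligned rows; '-' marks a gap position."""
    if len(row_x) != len(row_y):
        raise ValueError("aligned rows must have equal length")
    total = 0
    for a, b in zip(row_x, row_y):
        if a == "-" and b == "-":
            raise ValueError("a column cannot hold two gaps")
        if a == "-" or b == "-":
            total -= gap          # assumed linear penalty: each gap position costs `gap`
        else:
            total += delta(a, b)  # similarity score of the aligned pair
    return total


def simple_delta(a, b):
    # Hypothetical scoring function: +3 for a match, -3 for a mismatch.
    return 3 if a == b else -3


print(alignment_score("GTT-AC", "GTTGAC", simple_delta, gap=2))  # prints 13
```

Here five matching columns contribute 3 each and the single gap costs 2, giving 13.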

Suppose the given strings are $x = x_{1} x_{2} \ldots x_{n}$ and $y = y_{1} y_{2} \ldots y_{m}$.

The first step is to fix the similarity score $δ(a, b)$ for every pair of characters $a$ and $b$, and the penalty $G_{k}$ for a gap of length $k$.

In the next step, the first row and the first column of the scoring matrix $M$, of size $(n + 1) \times (m + 1)$, are set to zero. The remaining entries of the scoring matrix are then filled in.

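Finally, tracing back from the highest score in the matrix until an entry with score zero is reached gives the best local alignment. The characters of $x$ and $y$ covered by the traceback are the substrings $x'$ and $y'$.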

Step 2: Give Algorithm

Algorithm:

The algorithm can be written as given below:

$δ (a, b)$ - score

$G_{k}$ - gap penalty of length k

$M [n + 1] [m + 1]$ - scoring matrix

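for o = 0 to n

$M[o][0] = 0$

for p = 0 to m

$M[0][p] = 0$

for o = 1 to n

for p = 1 to m

$M[o][p] = \max \{ 0,\; M[o - 1][p - 1] + δ(x_{o}, y_{p}),\; \max_{1 \le k \le o} \{ M[o - k][p] - G_{k} \},\; \max_{1 \le l \le p} \{ M[o][p - l] - G_{l} \} \}$

With a linear gap penalty, $G_{k} = k \cdot G_{1}$, the two inner maxima are attained at $k = 1$ and $l = 1$, so each entry only needs $M[o - 1][p] - G_{1}$ and $M[o][p - 1] - G_{1}$.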

Traceback from highest alignment score to 0.
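The fill and traceback steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the only way to implement the solution: the gap penalty is taken to be linear (`gap` per gap position, so $G_{k} = k \cdot$ `gap`), `delta` is passed in as a function, and the names `local_align` and `simple_delta` are hypothetical.

```python
def local_align(x, y, delta, gap):
    """Return (best score, x', y') for the best local alignment of x and y."""
    n, m = len(x), len(y)
    # M[o][p]: best score of a local alignment ending at x[o-1] and y[p-1];
    # row 0 and column 0 stay at zero.
    M = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_o, best_p = 0, 0, 0
    for o in range(1, n + 1):
        for p in range(1, m + 1):
            M[o][p] = max(
                0,                                            # start a new alignment
                M[o - 1][p - 1] + delta(x[o - 1], y[p - 1]),  # x_o aligned with y_p
                M[o - 1][p] - gap,                            # x_o aligned to a gap
                M[o][p - 1] - gap,                            # y_p aligned to a gap
            )
            if M[o][p] > best:
                best, best_o, best_p = M[o][p], o, p
    # Trace back from the highest score until an entry with score 0 is reached.
    o, p = best_o, best_p
    while M[o][p] > 0:
        if M[o][p] == M[o - 1][p - 1] + delta(x[o - 1], y[p - 1]):
            o, p = o - 1, p - 1
        elif M[o][p] == M[o - 1][p] - gap:
            o -= 1
        else:
            p -= 1
    return best, x[o:best_o], y[p:best_p]


def simple_delta(a, b):
    # Hypothetical scoring function: +3 for a match, -3 for a mismatch.
    return 3 if a == b else -3


print(local_align("TGTTACGG", "GGTTGACTA", simple_delta, gap=2))
# prints (13, 'GTTAC', 'GTTGAC')
```

The traceback only records where the alignment starts, since the exercise asks for the substrings $x'$ and $y'$ rather than the gapped alignment itself.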

Step 3: Explain Algorithm

Explanation:

$M[o][p]$ is the optimal score of a local alignment that ends at $x_{o}$ and $y_{p}$.

There are only $(n + 1)(m + 1)$ subproblems, one for each entry of $M$, which is polynomial in the input size.

Every subproblem can be solved easily from the solutions of smaller subproblems.

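In the recurrence for $M[o][p]$ there are three cases. In the first case, $x_{o}$ is aligned with $y_{p}$; in the second case, $x_{o}$ is aligned to a gap; in the third case, $y_{p}$ is aligned to a gap. Because a local alignment may start at any position, a score is never allowed to fall below 0.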

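The calculated scoring matrix is of size $(n + 1) \times (m + 1)$, and with a linear gap penalty each entry is computed in constant time from three neighbouring entries.

So, the complexity of the program is $O(nm)$.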
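The count can be written out term by term. This is a sketch that assumes a linear gap penalty, so that each entry costs constant time, and uses the fact that every traceback step decreases $o$ or $p$ by at least one:

$$T(n, m) = \underbrace{(n + 1) + (m + 1)}_{\text{initialization}} + \underbrace{n m \cdot O(1)}_{\text{filling } M} + \underbrace{O(n + m)}_{\text{traceback}} = O(nm)$$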
