A subsequence is palindromic if it is the same whether read left to right or right to left. For instance, the sequence

A,C,G,T,G,T,C,A,A,A,A,T,C,G

has many palindromic subsequences, including A,C,G,C,Aand A,A,A,A(on the other hand, the subsequence A,C,Tis not palindromic). Devise an algorithm that takes a sequence X[1...n]and returns the (length of the) longest palindromic subsequence. Its running time should be0(n2).

Short Answer

Expert verified

We say a stringxi....jis a palindrome ifxi=xjorxi+1=xj-1

Since we need to find the Longest Palindromic Subsequence, for this we will be using dynamic programming approach.

Step by step solution

01

Approach

Here we will first define our sub-problem which would serve as recursive relation, and then we would be calling the sub-problem recursively.

In this question, following two possibilities are:

  • If the first and last character of a string are same, include the first and last character in the palindrome and then in same corresponding order, recursively check for other remaining character.

For instance, if a string isxi....j, then checkxi=xj,xi+1=xj-1and so on.

  • If the first character of the string not matches with the last character, then go back to the values we got from by:
  • Either removing first character from xi....j.
  • Or removing the last character from xi....j.
02

Recursive Relation

S(i,j)=1;ifi=j2+S(i+1,j-2);ifi<jandx[i]=x[j]max{Si,j-1,Si+1,j};otherwise

Where Si,jdefine the length of the palindrome.

03

Implementation of Algorithm

fori=2ton+1Si,i-1=0fori=1ton-1Si,i=1form=1ton-1fori=nton-mj=i+mifxi=xjSi,j=2+Si+1,j-1elseSi,j=maxSi,j-1,Si+1,jreturnS1,n

This solution will return us the longest palindrome in the string xi....j.

Unlock Step-by-Step Solutions & Ace Your Exams!

  • Full Textbook Solutions

    Get detailed explanations and key concepts

  • Unlimited Al creation

    Al flashcards, explanations, exams and more...

  • Ads-free access

    To over 500 millions flashcards

  • Money-back guarantee

    We refund you if you fail your exam.

Over 30 million students worldwide already upgrade their learning with Vaia!

One App. One Place for Learning.

All the tools & learning materials you need for study success - in one app.

Get started for free

Most popular questions from this chapter

Exon chaining.Each gene corresponds to a subregion of the overall genome (the DNA sequence); however, part of this region might be “junk DNA.” Frequently, a gene consists of several pieces called exons, which are separated by junk fragments called introns. This complicates the process of identifying genes in a newly sequenced genome.

Suppose we have a new DNA sequence and we want to check whether a certain gene (a string) is present in it. Because we cannot hope that the gene will be a contiguous subsequence, we look for partial matches—fragments of the DNA that are also present in the gene (actually, even these partial matches will be approximate, not perfect). We then attempt to assemble these fragments.

Let x 1Kndenote the DNA sequence. Each partial match can be represented by a triple li,ri,wi, where xliKriis the fragment and is a weight

representing the strength of the match (it might be a local alignment score or some other statistical quantity). Many of these potential matches could be false, so the goal is to find a subset of the triples that are consistent (nonoverlapping) and have a maximum total weight.

Show how to do this efficiently.

You are given a string of n characters s[1...n], which you believe to be a corrupted text document in which all punctuation has vanished (so that it looks something like “itwasthebestoftimes...”). You wish to reconstruct the document using a dictionary, which is available in the form of a Boolean function dict(.): for any string w,

dict(w)={trueifwisavalidwordfalseotherwise

Give a dynamic programming algorithm that determines whether the string s[.]can be reconstituted as a sequence of valid words. The running time should be at mostO(n2) , assuming calls to dict take unit time.

In the event that the string is valid, make your algorithm output the corresponding sequence of words.

Alignment with gap penalties. The alignment algorithm of Exercise 6.26 helps to identify DNA sequences that are close to one another. The discrepancies between these closely matched sequences are often caused by errors in DNA replication. However, a closer look at the biological replication process reveals that the scoring function we considered earlier has a qualitative problem: nature often inserts or removes entire substrings of nucleotides (creating long gaps), rather than editing just one position at a time. Therefore, the penalty for a gap of length 10 should not be 10 times the penalty for a gap of length 1, but something significantly smaller.

Repeat Exercise 6.26, but this time use a modified scoring function in which the penalty for a gap of length k is c0 + c1k, where c0 and c1 are given constants (and c0 is larger than c1).

Given two strings x=x1x2···xnand y=y1y2···ym, we wish to find the length of their longest common substring, that is, the largest k for which there are indices i and j with xixi+1···xi+k-1=yjyj+1···yj+k-1. Show how to do this in time0(mn)

Sequence alignment. When a new gene is discovered, a standard approach to understanding its function is to look through a database of known genes and find close matches. The closeness of two genes is measured by the extent to which they are aligned. To formalize this, think of a gene as being a long string over an alphabet ={A,C,G,T}. Consider two genes (strings) x=ATGCCand y=TACGCA. An alignment of x and y is a way of matching up these two strings by writing them in columns, for instance:

A-T-GCCTA-CGC

Here the “_” indicates a “gap.” The characters of each string must appear in order, and each column must contain a character from at least one of the strings. The score of an alignment is specified by a scoring matrixδof size (+1)×(+1), where the extra row and column are to accommodate gaps. For instance the preceding alignment has the following score:

δ(-T)+δ(A,A)+δ(T,-)+δ(G,G)+δ(C,C)+δ(C,A)

Give a dynamic programming algorithm that takes as input two strings X[1K n] and Y {1K m} and a scoring matrix δand returns the highest-scoring alignment. The running time should be O(mn) .

See all solutions

Recommended explanations on Computer Science Textbooks

View all explanations

What do you think about this solution?

We value your feedback to improve our textbook solutions.

Study anywhere. Anytime. Across all devices.

Sign-up for free