Time and space complexity of dynamic programming. Our dynamic programming algorithm for computing the edit distance between strings of length m and n creates a table of size n×mand therefore needs O (mn) time and space. In practice, it will run out of space long before it runs out of time. How can this space requirement be reduced?

  1. Show that if we just want to compute the value of the edit distance (rather than the optimal sequence of edits), then only O(n) space is needed, because only a small portion of the table needs to be maintained at any given time.
  2. Now suppose that we also want the optimal sequence of edits. As we saw earlier, this problem can be recast in terms of a corresponding grid-shaped dag, in which the goal is to find the optimal path from node (0,0) to node (n,m). It will be convenient to work with this formulation, and while we’re talking about convenience, we might as well also assume that is a power of 2.
    Let’s start with a small addition to the edit distance algorithm that will turn out to be very useful. The optimal path in the dag must pass through an intermediate node (k,m2) for some k; show how the algorithm can be modified to also return this value k.
  3. Now consider a recursive scheme:
    Procedure find-path((0,0)(n,m))
    Compute the value kabove
    find-path ((0,0)k,m2)
    find-path k,m2n,m
    concatenate these two paths, with kin the middle.
    Show that this scheme can be made to run inO (mn) time and O(n) space.

Short Answer

Expert verified

This space can be reduced by calculating the values of the adjacent cells in the table and the space complexity will be reduced to O(n).

  1. By considering the value of the adjacent square, the edit distance value can be computed in the given space complexity.
  2. Yes, the algorithm can be modified as follows,
    K(i,0)=iifj>n2ifE(i,j)=E(i-1,j)+1ki,j-n2=Ki-1,j-n2elseifE(i,j)=E(i,j-1)+1Ki,j-n2=Ki,j-n2-1ElseifE(i,j)=E(i-1,j-1)+diff(i,j)Ki,j-n2=Ki-1,j-n2-1
    this scheme can be made to run in O(mn) time and O(n) space.
  3. The given scheme can be made to run in O(mn) time and O(n) space because of the size and the difficulty of the algorithm.

Step by step solution

01

Explain space complexity and how the space requirement can be reduced.

Space Complexity for any algorithm defines the total space that is required or used by that algorithm, for different sizes of the input.

To create any variable there required some space for the algorithm to run. All the space that is required by the algorithm is collectively known as the Space Complexity of the algorithm.

In the dynamic programming algorithm for computing the edit distance between strings of length m and n creates a table of size and therefore needs O(mn) time and space. In practice, it will run out of space long before it runs out of time. This space can be reduced by calculating the values of the adjacent cells in the table and the space complexity will be reduced to O(n).

02

Show that the value of the edit distance can be computed.

(a)

Consider the Figure 6.4 in the text book, the calculation of a certain cell only needs the values of the upper left, upper and left cells of the cell. Considering the values of the adjacent cells in three directions, the current row and the previous row will be saved.

The state values are enough and they can be stored in a rolling array and the space complexity is O(n).

Therefore, By considering the value of the adjacent square, the edit distance value can be computed in the given space complexity.

03

Show how the algorithm can be modified to also return this value k.

(b)

Consider the edit-distance algorithm to calculate the path from the node (0,0) to node (n,m). The optimal path in the dag must pass through an intermediate node k,m2 for some k. In the algorithm swap n and m, which means represents rows and n represents listed rows. The rolling array K with n2+1 columns to store the values of k.

The modified edit-distance algorithm is as follows.

K(i,0)=iifj>n2ifE(i,j)=Ei-1,j+1Ki,j-n2=Ki-1,j-n2elseifE(i,j)=E(i,j-1)+1Ki,j-n2=Ki-1,j-n2-1ElseifE(i,j)=E(i-1,j-1)+diff(i,j)Ki,j-n2=Ki-1,j-n2-1

04

Show that the given scheme can be made to run in O(mn) time and O(n) space.

(c)

Consider that the given scheme is the recurrence relation, the time complexity is as follows,

T(m,n)=O(mn)+12O(mn)+14O(mn)+18O(mn)...=O(mn)

The time complexity is still O(mn), the constant is due to that the size of the algorithm is doubles and the difficulty of the algorithm implementation has also been greatly improved.

Therefore, it has been shown that the given scheme can be made to run in O(mn) time and O(n) space.

Unlock Step-by-Step Solutions & Ace Your Exams!

  • Full Textbook Solutions

    Get detailed explanations and key concepts

  • Unlimited Al creation

    Al flashcards, explanations, exams and more...

  • Ads-free access

    To over 500 millions flashcards

  • Money-back guarantee

    We refund you if you fail your exam.

Over 30 million students worldwide already upgrade their learning with Vaia!

One App. One Place for Learning.

All the tools & learning materials you need for study success - in one app.

Get started for free

Most popular questions from this chapter

You are given a convex polygon P on n vertices in the plane (specified by their x and y coordinates). A triangulation of P is a collection of n-3diagonals of such that no two diagonals intersect (except possibly at their endpoints). Notice that a triangulation splits the polygon’s interior into n-2 disjoint triangles. The cost of a triangulation is the sum of the lengths of the diagonals in it. Give an efficient algorithm for finding a triangulation of minimum cost. (Hint: Label the vertices of P by 1,....,n, starting from an arbitrary vertex and walking clockwise. For 1i<jn, let the subproblem A(i,j)denote the minimum cost triangulation of the polygon spanned by vertices i,i+1,...,j.).

The garage sale problem (courtesy of Professor Lofti Zadeh). On a given Sunday morning, there are n garage sales going on, g1,g2,g3............gn. For each garage sale gj, you have an estimate of its value to you, vj. For any two garage sales you have an estimate of the transportation cost dijof getting from gito gj. You are also given the costs d0jand dj0of going between your home and each garage sale. You want to find a tour of a subset of the given garage sales, starting and ending at home, that maximizes your total benefit minus your total transportation costs. Give an algorithm that solves this problem in time O(n22n).

(Hint: This is closely related to the traveling salesman problem.)

Alignment with gap penalties. The alignment algorithm of Exercise 6.26 helps to identify DNA sequences that are close to one another. The discrepancies between these closely matched sequences are often caused by errors in DNA replication. However, a closer look at the biological replication process reveals that the scoring function we considered earlier has a qualitative problem: nature often inserts or removes entire substrings of nucleotides (creating long gaps), rather than editing just one position at a time. Therefore, the penalty for a gap of length 10 should not be 10 times the penalty for a gap of length 1, but something significantly smaller.

Repeat Exercise 6.26, but this time use a modified scoring function in which the penalty for a gap of length k is c0 + c1k, where c0 and c1 are given constants (and c0 is larger than c1).

You are given a string of n characters s[1...n], which you believe to be a corrupted text document in which all punctuation has vanished (so that it looks something like “itwasthebestoftimes...”). You wish to reconstruct the document using a dictionary, which is available in the form of a Boolean function dict(.): for any string w,

dict(w)={trueifwisavalidwordfalseotherwise

Give a dynamic programming algorithm that determines whether the string s[.]can be reconstituted as a sequence of valid words. The running time should be at mostO(n2) , assuming calls to dict take unit time.

In the event that the string is valid, make your algorithm output the corresponding sequence of words.

Exon chaining.Each gene corresponds to a subregion of the overall genome (the DNA sequence); however, part of this region might be “junk DNA.” Frequently, a gene consists of several pieces called exons, which are separated by junk fragments called introns. This complicates the process of identifying genes in a newly sequenced genome.

Suppose we have a new DNA sequence and we want to check whether a certain gene (a string) is present in it. Because we cannot hope that the gene will be a contiguous subsequence, we look for partial matches—fragments of the DNA that are also present in the gene (actually, even these partial matches will be approximate, not perfect). We then attempt to assemble these fragments.

Let x 1Kndenote the DNA sequence. Each partial match can be represented by a triple li,ri,wi, where xliKriis the fragment and is a weight

representing the strength of the match (it might be a local alignment score or some other statistical quantity). Many of these potential matches could be false, so the goal is to find a subset of the triples that are consistent (nonoverlapping) and have a maximum total weight.

Show how to do this efficiently.

See all solutions

Recommended explanations on Computer Science Textbooks

View all explanations

What do you think about this solution?

We value your feedback to improve our textbook solutions.

Study anywhere. Anytime. Across all devices.

Sign-up for free