Sequencing by hybridization. One experimental procedure for identifying a new DNA sequence repeatedly probes it to determine which k-mers (substrings of length k) it contains. Based on these, the full sequence must then be reconstructed. Let's now formulate this as a combinatorial problem. For any string x (the DNA sequence), let Γ(x) denote the multiset of all of its k-mers. In particular, Γ(x) contains exactly |x| - k + 1 elements. The reconstruction problem is now easy to state: given a multiset of strings, find a string x such that Γ(x) is exactly this multiset.
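To make the definition of Γ(x) concrete, here is a minimal Python sketch that computes the k-mer multiset of a string. The function name `kmer_multiset` and the sample string are illustrative choices, not part of the exercise.

```python
from collections import Counter

def kmer_multiset(x: str, k: int) -> Counter:
    """Return Gamma(x): the multiset of all length-k substrings of x."""
    # There are |x| - k + 1 starting positions, one k-mer per position.
    return Counter(x[i:i + k] for i in range(len(x) - k + 1))

# The 3-mers of "ATGATG" are ATG, TGA, GAT, ATG (ATG occurs twice).
gamma = kmer_multiset("ATGATG", 3)
```

The multiset for "ATGATG" has 6 - 3 + 1 = 4 elements, matching the count stated above.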

(a) Show that the reconstruction problem reduces to RUDRATA PATH. (Hint: Construct a directed graph with one node for each k-mer, and with an edge from a to b if the last k-1 characters of a match the first k-1 characters of b.)

(b) But in fact, there is much better news. Show that the same problem also reduces to EULER PATH. (Hint: This time, use one directed edge for each k-mer.)

Short Answer

(a) The Sequencing by Hybridization problem can be reduced to Hamiltonian Path (RUDRATA PATH).

(b) The Sequencing by Hybridization problem can be reduced to Euler Path.

Step by step solution

01

Explain the given information

Consider the information:

The objective is to prove that the reconstruction problem can be reduced to Hamiltonian Path.

A Hamiltonian path (called a Rudrata path in the problem statement) is a path in a graph that visits every vertex exactly once.

02

Reducing the reconstruction problem to RUDRATA PATH

(a)

Proof:

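The reconstruction problem is reduced to Hamiltonian Path by constructing a directed graph. Each k-mer in the multiset becomes a vertex of the graph. If a k-mer appears more than once in the multiset, it is represented by one vertex per occurrence.

The graph has a directed edge (a, b) whenever the last k-1 characters of a are the same as the first k-1 characters of b.

A Hamiltonian path v1, v2, ..., vm in this graph orders the k-mers so that each consecutive pair overlaps in k-1 characters. Writing out v1 and then appending the last character of each later vertex gives a string x whose k-mers, read from left to right, are exactly v1, ..., vm, so Γ(x) is the given multiset. Conversely, if such a string x exists, listing its k-mers in order of position gives a Hamiltonian path. The graph has one vertex per k-mer and can be built in polynomial time.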

Therefore, the Sequencing by Hybridization problem can be reduced to Hamiltonian Path.
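The construction in part (a) can be sketched in Python as follows. This is an illustration under stated assumptions, not an efficient method: the function name `reconstruct_via_rudrata_path` is hypothetical, all input strings are assumed to have the same length k, and the path search is plain backtracking, which takes exponential time in the worst case.

```python
def reconstruct_via_rudrata_path(kmers):
    """Brute-force search for a Rudrata (Hamiltonian) path in the overlap graph.

    Assumes every string in kmers has the same length k. For illustration only.
    """
    kmers = list(kmers)  # one vertex per occurrence, so duplicates are kept
    n = len(kmers)
    if n == 0:
        return None
    # Edge i -> j when the last k-1 characters of i equal the first k-1 of j.
    succ = [[j for j in range(n) if j != i and kmers[i][1:] == kmers[j][:-1]]
            for i in range(n)]

    def extend(path, used):
        if len(path) == n:  # every vertex visited exactly once
            return path
        for j in succ[path[-1]]:
            if j not in used:
                used.add(j)
                found = extend(path + [j], used)
                if found:
                    return found
                used.remove(j)
        return None

    for start in range(n):
        path = extend([start], {start})
        if path:
            # Spell x: the first k-mer, then the last character of each later one.
            return kmers[path[0]] + "".join(kmers[i][-1] for i in path[1:])
    return None

x = reconstruct_via_rudrata_path(["ATG", "TGA", "GAT", "ATG"])
```

Keeping duplicate k-mers as separate list entries mirrors the rule that a repeated k-mer gets one vertex per occurrence.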

03

Reducing the reconstruction problem to EULER PATH

(b)

Consider the information:

The objective is to prove that the reconstruction problem can be reduced to Euler Path.

An Euler path of a graph uses every edge of the graph exactly once; vertices may be visited multiple times.

Proof:

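The idea of the reduction is to make each k-mer an edge of the graph, so that a path using every edge exactly once corresponds to a reconstruction. Because an Euler path can be found in time linear in the number of edges, this reduction yields an efficient algorithm for the reconstruction problem.

The vertices of this graph are the (k-1)-mers that occur as the prefix or suffix of some k-mer in the multiset.

A vertex a is connected to a vertex b by a directed edge if the multiset contains a k-mer whose first k-1 characters are a and whose last k-1 characters are b. Parallel edges are used to represent k-mers that appear more than once.

An Euler path in this graph lists the k-mers in an order where the (k-1)-suffix of each one is the (k-1)-prefix of the next. Writing out the first vertex and then appending the last character of each subsequent vertex on the path gives a string x with Γ(x) equal to the given multiset. Conversely, the k-mers of any valid x, taken in order of position, trace out an Euler path.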

Therefore, the Sequencing by Hybridization problem can be reduced to Euler Path.
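The part (b) construction can be sketched in Python using Hierholzer's algorithm, a standard linear-time method for finding Euler paths; the solution above does not name a specific algorithm, so that choice is an assumption here. The function name `reconstruct_via_euler_path` is hypothetical, and the sketch assumes all k-mers have the same length k with k at least 2.

```python
from collections import defaultdict

def reconstruct_via_euler_path(kmers):
    """Find an Euler path in the graph whose edges are the k-mers.

    Assumes every k-mer has the same length k >= 2. Uses Hierholzer's algorithm.
    """
    kmers = list(kmers)
    if not kmers:
        return None
    out_edges = defaultdict(list)  # (k-1)-mer -> list of successor (k-1)-mers
    balance = defaultdict(int)     # out-degree minus in-degree
    for s in kmers:                # one edge per occurrence: repeats are parallel edges
        a, b = s[:-1], s[1:]
        out_edges[a].append(b)
        balance[a] += 1
        balance[b] -= 1

    # An Euler path starts at the vertex with one extra outgoing edge, if any.
    starts = [v for v, d in balance.items() if d == 1]
    ends = [v for v, d in balance.items() if d == -1]
    if any(abs(d) > 1 for d in balance.values()) or len(starts) > 1 or len(ends) > 1:
        return None
    start = starts[0] if starts else kmers[0][:-1]

    stack, path = [start], []
    while stack:
        v = stack[-1]
        if out_edges[v]:
            stack.append(out_edges[v].pop())  # follow an unused edge
        else:
            path.append(stack.pop())          # dead end: vertex is finished
    path.reverse()
    if len(path) != len(kmers) + 1:  # some edge was unreachable: no Euler path
        return None
    # Spell x: the first (k-1)-mer, then the last character of each later vertex.
    return path[0] + "".join(v[-1] for v in path[1:])

y = reconstruct_via_euler_path(["ATG", "TGA", "GAT", "ATG"])
```

Each k-mer is touched a constant number of times, in contrast with the exhaustive search that a Hamiltonian path formulation generally needs.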

Most popular questions from this chapter

Show that if P=NP then the RSA cryptosystem (Section 1.4.2) can be broken in polynomial time.

We are feeling experimental and want to create a new dish. There are various ingredients we can choose from and we'd like to use as many of them as possible, but some ingredients don't go well with others. If there are n possible ingredients (numbered 1 to n), we write down an n×n matrix giving the discord between any pair of ingredients. This discord is a real number between 0.0 and 1.0, where 0.0 means "they go together perfectly" and 1.0 means "they really don't go together." Here's an example matrix when there are five possible ingredients.

In this case, ingredients 2 and 3 go together pretty well whereas 1 and 5 clash badly. Notice that this matrix is necessarily symmetric, and that the diagonal entries are always 0.0. Any set of ingredients incurs a penalty which is the sum of all discord values between pairs of ingredients. For instance, the set of ingredients {1,3,5} incurs a penalty of 0.2 + 1.0 + 0.5 = 1.7. We want this penalty to be small.

EXPERIMENTAL CUISINE

Input: n, the number of ingredients to choose from; D, the n×n "discord" matrix; some number p ≥ 0

Output: The maximum number of ingredients we can choose with penalty at most p.

Show that if EXPERIMENTAL CUISINE is solvable in polynomial time, then so is 3SAT.

STINGY SAT is the following problem: given a set of clauses (each a disjunction of literals) and an integer K, find a satisfying assignment in which at most K variables are true, if such an assignment exists. Prove that STINGY SAT is NP-complete.

Search versus decision. Suppose you have a procedure which runs in polynomial time and tells you whether or not a graph has a Rudrata path. Show that you can use it to develop a polynomial-time algorithm for RUDRATA PATH (which returns the actual path, if it exists).

Consider the CLIQUE problem restricted to graphs in which every vertex has degree at most 3. Call this problem CLIQUE-3.

(a) Prove that CLIQUE-3 is in NP .

(b) What is wrong with the following proof of NP-completeness for CLIQUE-3 ? We know that the CLIQUE problem in general graphs is NP-complete, so it is enough to present a reduction from CLIQUE-3 to CLIQUE . Given a graph G with vertices of degree 3, and a parameter g, the reduction leaves the graph and the parameter unchanged: clearly the output of the reduction is a possible input for the CLIQUE problem. Furthermore, the answer to both problems is identical. This proves the correctness of the reduction and, therefore, the NP-completeness of CLIQUE-3 .

(c) It is true that the VERTEX COVER problem remains NP-complete even when restricted to graphs in which every vertex has degree at most 3. Call this problem VC-3. What is wrong with the following proof of NP-completeness for CLIQUE-3? We present a reduction from VC-3 to CLIQUE-3. Given a graph G=(V,E) with node degrees bounded by 3, and a parameter b, we create an instance of CLIQUE-3 by leaving the graph unchanged and switching the parameter to |V|-b. Now, a subset C ⊆ V is a vertex cover in G if and only if the complementary set V-C is a clique in G. Therefore G has a vertex cover of size b if and only if it has a clique of size |V|-b. This proves the correctness of the reduction and, consequently, the NP-completeness of CLIQUE-3.

(d) Describe an O(V) algorithm for CLIQUE-3.
