The following table gives the frequencies of the letters of the English language (including the blank for separating words) in a particular corpus.

Letter  Frequency    Letter  Frequency    Letter  Frequency
blank   18.3%        r       4.8%         y       1.6%
e       10.2%        d       3.5%         p       1.6%
t       7.7%         l       3.4%         b       1.3%
a       6.8%         c       2.6%         v       0.9%
o       5.9%         u       2.4%         k       0.6%
i       5.8%         m       2.1%         j       0.2%
n       5.5%         w       1.9%         x       0.2%
s       5.1%         f       1.8%         q       0.1%
h       4.9%         g       1.7%         z       0.1%

a. What is the optimum Huffman encoding of this alphabet?

b. What is the expected number of bits per letter?

c. Suppose now that we calculate the entropy of these frequencies

$$H = \sum_{i=0}^{26} p_i \log \frac{1}{p_i}$$

(see the box on page 143). Would you expect it to be larger or smaller than your answer above? Explain.

d. Do you think that this is the limit of how much English text can be compressed? What features of the English language, besides letters and their frequencies, should a better compression scheme take into account?

Short Answer

Build a Huffman tree from the letter frequencies to obtain a binary codeword for each letter, then compute the expected number of bits per letter as the frequency-weighted average of the codeword lengths.

Step by step solution

01

Compression Technique

a)

Huffman encoding is a lossless data compression technique that gives shorter codewords to more frequent symbols. Take the letter frequencies from the table in the problem statement and determine the optimal Huffman encoding for the letters.

Follow the steps below to build the Huffman encoding:

• Arrange the letters in ascending order of frequency.

• Choose the two entries with the lowest frequency.

• Combine them under a new parent node whose frequency is their sum, and insert the result into the frequency list at its sorted position.

• Repeat the previous two steps until only a single node, the root of the tree, remains in the list.

Figure 1 depicts this procedure.

Figure 1
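The merging procedure described above can be sketched in Python. This is a minimal illustration rather than the solution's own method: the names `FREQ_TENTHS` and `huffman_code_lengths` are hypothetical, the frequencies are stored as integer tenths of a percent so that sums stay exact, and a priority queue stands in for the sorted list. Ties between equal frequencies may be broken differently than in the figures, so individual codeword lengths can differ from the hand-built tree while the total cost stays the same.

```python
import heapq
from itertools import count

# Frequencies from the problem's table, in tenths of a percent (18.3% -> 183).
FREQ_TENTHS = {
    "blank": 183, "e": 102, "t": 77, "a": 68, "o": 59, "i": 58, "n": 55,
    "s": 51, "h": 49, "r": 48, "d": 35, "l": 34, "c": 26, "u": 24, "m": 21,
    "w": 19, "f": 18, "g": 17, "y": 16, "p": 16, "b": 13, "v": 9, "k": 6,
    "j": 2, "x": 2, "q": 1, "z": 1,
}


def huffman_code_lengths(freq):
    """Return {symbol: codeword length} for one optimal Huffman tree."""
    tie = count()  # unique tie-breaker so the heap never compares the lists
    heap = [(f, next(tie), [sym]) for sym, f in freq.items()]
    heapq.heapify(heap)
    depth = dict.fromkeys(freq, 0)
    while len(heap) > 1:
        # Take the two least frequent entries and merge them under a parent.
        f1, _, syms1 = heapq.heappop(heap)
        f2, _, syms2 = heapq.heappop(heap)
        for sym in syms1 + syms2:
            depth[sym] += 1  # every leaf below the new parent moves one level down
        heapq.heappush(heap, (f1 + f2, next(tie), syms1 + syms2))
    return depth


lengths = huffman_code_lengths(FREQ_TENTHS)
for sym in sorted(lengths, key=lengths.get):
    print(sym, lengths[sym])
```

Tracking only the depth of each leaf keeps the sketch short; the actual codewords follow by labeling left branches 0 and right branches 1 in the tree.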

02

Explanation: merging the least frequent letters under a parent node

• In STEP 1, the letters z and q are chosen since they are the least frequent (0.1% each). They are combined, and the result, with frequency 0.2%, becomes their parent node. Because no remaining frequency is smaller than 0.2%, the result is placed at the front of the list, before x and j.

The new list is: [z,q], x, j, k, v, ... and so on.

• In STEP 2, the least frequent entries are [z,q] and x. Combine them under a new parent node. The result [[z,q],x] has frequency 0.4%, which is larger than j's 0.2%, so it is placed after j in the frequency list. The new list is:

j, [[z,q],x], k, v, ... and so on.

• In STEP 3, j and [[z,q],x] are combined. j becomes the left child and [[z,q],x] the right child, since j has the lower frequency.

Continue this procedure until a single node, the root, remains.

• Label each left branch 0 and each right branch 1. Figure 2 depicts the final tree.

Figure 2:

Start at the root and walk down to each letter's leaf, recording the 0s and 1s of the branches traversed. That bit string is the letter's codeword.

The resulting codewords for all letters are:

  • blank: 101 (3 bits)
  • e: 010 (3 bits)
  • t: 1000 (4 bits)
  • a: 1110 (4 bits)
  • o: 1100 (4 bits)
  • i: 0111 (4 bits)
  • n: 0110 (4 bits)
  • s: 0011 (4 bits)
  • h: 0001 (4 bits)
  • r: 0000 (4 bits)
  • d: 11111 (5 bits)
  • l: 11110 (5 bits)
  • c: 00101 (5 bits)
  • u: 00100 (5 bits)
  • m: 100111 (6 bits)
  • w: 100101 (6 bits)
  • f: 100100 (6 bits)
  • g: 110111 (6 bits)
  • y: 110110 (6 bits)
  • p: 110101 (6 bits)
  • b: 110100 (6 bits)
  • v: 1001100 (7 bits)
  • k: 10011011 (8 bits)
  • j: 100110100 (9 bits)
  • x: 1001101011 (10 bits)
  • q: 10011010101 (11 bits)
  • z: 10011010100 (11 bits)
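As a sanity check on the list above, the following Python sketch tests that no codeword is a prefix of another and that the codeword lengths satisfy the Kraft equality expected of a full binary tree. The helper names `is_prefix_free`, `kraft_sum` and `encode` are made up for this illustration, and the entry written as "0" in some renderings of the list is assumed to be the letter o.

```python
from fractions import Fraction

# Codewords copied from the list above ("o" assumed for the entry shown as "0").
CODE = {
    "blank": "101", "e": "010", "t": "1000", "a": "1110", "o": "1100",
    "i": "0111", "n": "0110", "s": "0011", "h": "0001", "r": "0000",
    "d": "11111", "l": "11110", "c": "00101", "u": "00100",
    "m": "100111", "w": "100101", "f": "100100", "g": "110111",
    "y": "110110", "p": "110101", "b": "110100", "v": "1001100",
    "k": "10011011", "j": "100110100", "x": "1001101011",
    "q": "10011010101", "z": "10011010100",
}


def is_prefix_free(code):
    """True if no codeword is a prefix of another codeword."""
    words = sorted(code.values())
    # After sorting, a prefix is always immediately followed by a word it prefixes.
    return all(not nxt.startswith(cur) for cur, nxt in zip(words, words[1:]))


def kraft_sum(code):
    """Sum of 2^(-length) over all codewords; equals 1 for a full binary tree."""
    return sum(Fraction(1, 2 ** len(word)) for word in code.values())


def encode(text, code):
    """Encode lowercase text, mapping the space character to the 'blank' symbol."""
    return "".join(code["blank" if ch == " " else ch] for ch in text)


print(is_prefix_free(CODE), kraft_sum(CODE))
print(encode("the tea", CODE))
```

Using `Fraction` keeps the Kraft sum exact, so comparing it with 1 involves no rounding.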

b)

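Let $l_a$ be the length in bits of the Huffman codeword for letter $a$, let $f_a$ be the frequency of that letter in percent, and let $A$ be the 27-symbol alphabet.

The listed frequencies sum to 101 rather than 100, so the weighted sum is divided by 101. Expected number of bits per letter:

$$\text{Expected number of bits per letter} = \frac{\sum_{a \in A} f_a l_a}{\sum_{a \in A} f_a}$$

$$= \frac{1}{101}\bigl[18.3 \times 3 + 10.2 \times 3 + 7.7 \times 4 + 6.8 \times 4 + 5.9 \times 4 + 5.8 \times 4 + 5.5 \times 4 + 5.1 \times 4 + 4.9 \times 4 + 4.8 \times 4 + 3.5 \times 5 + 3.4 \times 5 + 2.6 \times 5 + 2.4 \times 5 + 2.1 \times 6 + 1.9 \times 6 + 1.8 \times 6 + 1.7 \times 6 + 1.6 \times 6 + 1.6 \times 6 + 1.3 \times 6 + 0.9 \times 7 + 0.6 \times 8 + 0.2 \times 9 + 0.2 \times 10 + 0.1 \times 11 + 0.1 \times 11\bigr]$$

$$= \frac{1}{101}\bigl[54.9 + 30.6 + 30.8 + 27.2 + 23.6 + 23.2 + 22 + 20.4 + 19.6 + 19.2 + 17.5 + 17 + 13 + 12 + 12.6 + 11.4 + 10.8 + 10.2 + 9.6 + 9.6 + 7.8 + 6.3 + 4.8 + 1.8 + 2.0 + 1.1 + 1.1\bigr] = \frac{420.1}{101}$$

So, with this encoding, a letter takes about 4.16 bits on average.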
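The weighted average can be recomputed mechanically from the table and the codeword lengths, as in the Python sketch below. The names `TABLE`, `expected_bits` and `entropy` are illustrative only. The entropy line uses the formula from part (c) of the question with the frequencies normalized to sum to 1 and logarithms taken base 2 (an assumption, chosen so that the result is in bits). By Shannon's source coding bound, no prefix-free code can average fewer bits than the entropy, so the comparison hints at the direction of the answer to part (c).

```python
from math import log2

# (letter, frequency in percent, Huffman codeword length in bits)
TABLE = [
    ("blank", 18.3, 3), ("e", 10.2, 3), ("t", 7.7, 4), ("a", 6.8, 4),
    ("o", 5.9, 4), ("i", 5.8, 4), ("n", 5.5, 4), ("s", 5.1, 4),
    ("h", 4.9, 4), ("r", 4.8, 4), ("d", 3.5, 5), ("l", 3.4, 5),
    ("c", 2.6, 5), ("u", 2.4, 5), ("m", 2.1, 6), ("w", 1.9, 6),
    ("f", 1.8, 6), ("g", 1.7, 6), ("y", 1.6, 6), ("p", 1.6, 6),
    ("b", 1.3, 6), ("v", 0.9, 7), ("k", 0.6, 8), ("j", 0.2, 9),
    ("x", 0.2, 10), ("q", 0.1, 11), ("z", 0.1, 11),
]

# The percentages do not add up to exactly 100, so normalize by their sum.
total = sum(freq for _, freq, _ in TABLE)

# Frequency-weighted average codeword length.
expected_bits = sum(freq * length for _, freq, length in TABLE) / total

# Entropy of the normalized frequencies p = freq / total, in bits.
entropy = sum((freq / total) * log2(total / freq) for _, freq, _ in TABLE)

print(f"sum of frequencies: {total:.1f}")
print(f"expected bits per letter: {expected_bits:.3f}")
print(f"entropy: {entropy:.3f}")
```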

03

Conclusion 

The Huffman tree gives every letter a binary codeword, with shorter codewords for more frequent letters. The expected number of bits per letter then follows from the frequency-weighted average of the codeword lengths, divided by the total frequency.
