Benford’s Law. According to Benford’s law, a variety of different data sets include numbers with leading (first) digits that follow the distribution shown in the table below. In Exercises 21–24, test for goodness-of-fit with the distribution described by Benford’s law.

| Leading digit | Benford's law: distribution of leading digits |
| --- | --- |
| 1 | 30.10% |
| 2 | 17.60% |
| 3 | 12.50% |
| 4 | 9.70% |
| 5 | 7.90% |
| 6 | 6.70% |
| 7 | 5.80% |
| 8 | 5.10% |
| 9 | 4.60% |
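The percentages in the table agree with the usual closed form of Benford's law, \(P(d) = \log_{10}\left(1 + \frac{1}{d}\right)\). The exercise itself only supplies the table, so the formula is brought in here as background. The following is a minimal sketch that compares the two; the names `benford` and `table_percent` are illustrative only.

```python
import math

# Closed-form Benford probabilities for leading digits 1..9
# (assumption: the standard formula P(d) = log10(1 + 1/d)).
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# Percentages exactly as listed in the exercise's table.
table_percent = [30.1, 17.6, 12.5, 9.7, 7.9, 6.7, 5.8, 5.1, 4.6]

# Print the formula value next to the tabulated value for each digit.
for d, pct in zip(range(1, 10), table_percent):
    print(d, f"{100 * benford[d]:.1f}%", f"(table: {pct}%)")
```

Each formula value rounds to the tabulated percentage at one decimal place, which is why the table's entries sum to 100%.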

Author’s Computer Files. The author recorded the leading digits of the sizes of the electronic document files for the current edition of this book. The leading digits have frequencies of 55, 25, 17, 24, 18, 12, 12, 3, and 4 (corresponding to the leading digits of 1, 2, 3, 4, 5, 6, 7, 8, and 9, respectively). Using a 0.05 significance level, test for goodness-of-fit with Benford’s law.

Short Answer


There is not enough evidence to conclude that the observed frequencies of the leading digits of the sizes of the electronic document files are not the same as the frequencies expected from Benford’s law.

Step by step solution

01

Given information

The frequencies of the leading digits of the sizes of the author's electronic document files are recorded. The significance level is 0.05.

02

Check the requirements

Assume that random sampling is conducted.

Let O denote the observed frequencies of the leading digits.

The observed frequencies are noted below:

\(\begin{aligned}{O_1} &= 55, & {O_2} &= 25, & {O_3} &= 17,\\{O_4} &= 24, & {O_5} &= 18, & {O_6} &= 12,\\{O_7} &= 12, & {O_8} &= 3, & {O_9} &= 4\end{aligned}\)

The sum of all observed frequencies is computed below:

\(\begin{aligned}n &= 55 + 25 + \cdots + 4\\ &= 170\end{aligned}\)

Let E denote the expected frequencies.

Let \({p_i}\) and \({E_i} = n{p_i}\) denote the expected proportion and the expected frequency of the i-th leading digit, as given by Benford’s law.

| Leading digit | Benford's law: distribution of leading digits | Proportion | Expected frequency |
| --- | --- | --- | --- |
| 1 | 30.10% | \({p_1} = 0.301\) | \({E_1} = 170\left( {0.301} \right) = 51.17\) |
| 2 | 17.60% | \({p_2} = 0.176\) | \({E_2} = 170\left( {0.176} \right) = 29.92\) |
| 3 | 12.50% | \({p_3} = 0.125\) | \({E_3} = 170\left( {0.125} \right) = 21.25\) |
| 4 | 9.70% | \({p_4} = 0.097\) | \({E_4} = 170\left( {0.097} \right) = 16.49\) |
| 5 | 7.90% | \({p_5} = 0.079\) | \({E_5} = 170\left( {0.079} \right) = 13.43\) |
| 6 | 6.70% | \({p_6} = 0.067\) | \({E_6} = 170\left( {0.067} \right) = 11.39\) |
| 7 | 5.80% | \({p_7} = 0.058\) | \({E_7} = 170\left( {0.058} \right) = 9.86\) |
| 8 | 5.10% | \({p_8} = 0.051\) | \({E_8} = 170\left( {0.051} \right) = 8.67\) |
| 9 | 4.60% | \({p_9} = 0.046\) | \({E_9} = 170\left( {0.046} \right) = 7.82\) |

Because every expected frequency is at least 5 (the smallest is 7.82), the requirements of the test are satisfied.
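The expected frequencies and the requirement check can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in this step; the variable names are illustrative and not part of the exercise.

```python
# Observed frequencies for leading digits 1..9, from the exercise.
observed = [55, 25, 17, 24, 18, 12, 12, 3, 4]

# Benford proportions, taken from the table (percentages divided by 100).
proportions = [0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046]

# Sample size n is the sum of the observed frequencies.
n = sum(observed)

# Expected frequency for each digit: E_i = n * p_i.
expected = [n * p for p in proportions]

# Goodness-of-fit requirement: every expected frequency is at least 5.
requirement_met = all(e >= 5 for e in expected)

for digit, e in enumerate(expected, start=1):
    print(digit, f"{e:.2f}")
print("n =", n, "| requirement met:", requirement_met)
```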

03

State the hypotheses

The null hypothesis for conducting the given test is as follows:

The observed frequencies of leading digits are the same as the frequencies expected from Benford’s law.

The alternative hypothesis is as follows:

The observed frequencies of leading digits are not the same as the frequencies expected from Benford’s law.

The test is right-tailed, because only large values of the test statistic indicate a poor fit between the observed and expected frequencies.

If the test statistic is greater than the critical value, the null hypothesis is rejected.
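In symbols, the two hypotheses can be sketched as follows. Here \({p_i}\) is reused, as an assumption made for this illustration, to denote the proportion of all such files whose size has leading digit i.

\(\begin{aligned}{H_0}&:\;{p_1} = 0.301,\;{p_2} = 0.176,\; \ldots ,\;{p_9} = 0.046\\{H_1}&:\;\text{at least one } {p_i} \text{ differs from the value stated in } {H_0}\end{aligned}\)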

04

Conduct the hypothesis test

The table below shows the necessary calculations:

| Leading digit | O | E | \(\left( {O - E} \right)\) | \(\frac{{{{\left( {O - E} \right)}^2}}}{E}\) |
| --- | --- | --- | --- | --- |
| 1 | 55 | 51.17 | 3.83 | 0.286670 |
| 2 | 25 | 29.92 | -4.92 | 0.809037 |
| 3 | 17 | 21.25 | -4.25 | 0.850000 |
| 4 | 24 | 16.49 | 7.51 | 3.420261 |
| 5 | 18 | 13.43 | 4.57 | 1.555093 |
| 6 | 12 | 11.39 | 0.61 | 0.032669 |
| 7 | 12 | 9.86 | 2.14 | 0.464462 |
| 8 | 3 | 8.67 | -5.67 | 3.708062 |
| 9 | 4 | 7.82 | -3.82 | 1.866036 |

The value of the test statistic is equal to:

\(\begin{aligned}{\chi ^2} &= \sum {\frac{{{{\left( {O - E} \right)}^2}}}{E}} \\ &= 0.286670 + 0.809037 + \cdots + 1.866036\\ &= 12.992\end{aligned}\)

Thus, \({\chi ^2} = 12.992\).

Let k be the number of digits, equal to 9.

The degrees of freedom for \({\chi ^2}\) are computed below:

\(\begin{aligned}df &= k - 1\\ &= 9 - 1\\ &= 8\end{aligned}\)
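The test statistic and degrees of freedom above can be checked with a short Python sketch. It takes the observed and expected frequencies from the calculation table; the variable names are illustrative.

```python
# Observed and expected frequencies for leading digits 1..9,
# copied from the calculation table in this step.
observed = [55, 25, 17, 24, 18, 12, 12, 3, 4]
expected = [51.17, 29.92, 21.25, 16.49, 13.43, 11.39, 9.86, 8.67, 7.82]

# Contribution of each digit: (O - E)^2 / E.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]

# Chi-square test statistic is the sum of the contributions.
chi_square = sum(contributions)

# Degrees of freedom: number of categories minus one.
df = len(observed) - 1

print(f"chi-square = {chi_square:.3f}, df = {df}")
```

Digits 4 and 8 contribute the most to the statistic (about 3.42 and 3.71), so they account for most of the discrepancy from Benford's law.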

05

State the conclusion

The critical value of \({\chi ^2}\) at \(\alpha = 0.05\) with 8 degrees of freedom is 15.507, taken from the chi-square table.

The p-value is:

\(\begin{aligned}p\text{-value} &= P\left( {{\chi ^2} > 12.992} \right)\\ &= 0.112\end{aligned}\)

Since the test statistic (12.992) is less than the critical value (15.507) and the p-value (0.112) is greater than 0.05, we fail to reject the null hypothesis.

There is not enough evidence to conclude that the observed frequencies of the leading digits of the sizes of the electronic document files are not the same as the frequencies expected from Benford’s law.
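The p-value and the decision can be checked without a chi-square table. The sketch below does not use the table lookup from the solution; instead it relies on the closed form of the right-tail probability that holds when the degrees of freedom are even, \(P\left( {{\chi ^2} > x} \right) = {e^{ - x/2}}\sum\nolimits_{j = 0}^{df/2 - 1} {\frac{{{{\left( {x/2} \right)}^j}}}{{j!}}} \). The helper name `chi2_sf_even_df` is hypothetical.

```python
import math

def chi2_sf_even_df(x, df):
    """Right-tail probability P(X > x) for a chi-square variable.

    Valid only for a positive, even number of degrees of freedom,
    where the tail has a finite closed form.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires positive even df")
    half = x / 2
    term = 1.0   # current term (x/2)^j / j!, starting at j = 0
    total = 1.0  # running sum of the terms
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total

# Values taken from the solution above.
chi_square = 12.992
df = 8
alpha = 0.05
critical_value = 15.507

p_value = chi2_sf_even_df(chi_square, df)

# Right-tailed test: reject only if the statistic exceeds the critical value.
reject_null = chi_square > critical_value

print(f"p-value = {p_value:.3f}, reject H0: {reject_null}")
```

Evaluating the same helper at the critical value 15.507 gives a tail probability of about 0.05, which is consistent with the tabulated critical value.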

