Benford’s Law. According to Benford’s law, a variety of different data sets include numbers with leading (first) digits that follow the distribution shown in the table below. In Exercises 21–24, test for goodness-of-fit with the distribution described by Benford’s law.

| Leading Digit | Benford's Law: Distribution of Leading Digits |
|---|---|
| 1 | 30.10% |
| 2 | 17.60% |
| 3 | 12.50% |
| 4 | 9.70% |
| 5 | 7.90% |
| 6 | 6.70% |
| 7 | 5.80% |
| 8 | 5.10% |
| 9 | 4.60% |
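As background, the percentages in the table agree with the usual logarithmic statement of Benford's law, \(P(d) = \log_{10}(1 + 1/d)\), rounded to three decimal places. The exercise gives only the table, so the formula and the name `benford` below are brought in here purely as an illustration:

```python
import math

# Benford's law in its logarithmic form: P(d) = log10(1 + 1/d), d = 1..9.
# The exercise itself supplies only the rounded table; this is a cross-check.
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

for d, p in benford.items():
    print(f"{d}: {p:.3f}")
```

Rounded to three decimals, these values match the table's proportions, and the nine probabilities sum to 1.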

Detecting Fraud When working for the Brooklyn district attorney, investigator Robert Burton analyzed the leading digits of the amounts from 784 checks issued by seven suspect companies. The frequencies were found to be 0, 15, 0, 76, 479, 183, 8, 23, and 0, and those digits correspond to the leading digits of 1, 2, 3, 4, 5, 6, 7, 8, and 9, respectively. If the observed frequencies are substantially different from the frequencies expected with Benford’s law, the check amounts appear to result from fraud. Use a 0.01 significance level to test for goodness-of-fit with Benford’s law. Does it appear that the checks are the result of fraud?
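The frequencies in the exercise come from tallying the leading digit of each check amount. A minimal sketch of that tallying step follows; the function name `leading_digit` and the sample amounts are hypothetical and are not the investigator's data:

```python
from collections import Counter


def leading_digit(amount: float) -> int:
    """Return the first nonzero digit of an amount given to the cent."""
    for ch in f"{abs(amount):.2f}":
        if ch in "123456789":
            return int(ch)
    raise ValueError("amount has no nonzero digit")


# Hypothetical check amounts, invented for illustration only.
amounts = [4821.50, 512.00, 69.95, 5300.00, 0.75]

# Tally how often each leading digit occurs.
counts = Counter(leading_digit(a) for a in amounts)
print(sorted(counts.items()))
```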

Short Answer

There is enough evidence to conclude that the observed frequencies are not the same as the frequencies expected from Benford’s law.

Since the observed frequencies differ substantially from the expected frequencies, the check amounts appear to be the result of fraud.

Step by step solution

01

Given information

The frequencies of the different leading digits of the amounts of 784 checks are recorded.

02

Check the requirements

Assume that random sampling is conducted.

Let O denote the observed frequencies of the leading digits.

The observed frequencies are noted below:

\[\begin{aligned}O_1 &= 0, & O_2 &= 15, & O_3 &= 0,\\O_4 &= 76, & O_5 &= 479, & O_6 &= 183,\\O_7 &= 8, & O_8 &= 23, & O_9 &= 0\end{aligned}\]

The sum of all observed frequencies is computed below:

\[\begin{aligned}n &= 0 + 15 + 0 + 76 + 479 + 183 + 8 + 23 + 0\\ &= 784\end{aligned}\]

Let E denote the expected frequencies.

Let \(p_i\) and \(E_i = np_i\) denote the expected proportion and the expected frequency of the ith leading digit, as given by Benford’s law.

| Leading Digit | Benford's Law: Distribution of Leading Digits | Proportion \(p_i\) | Expected Frequency \(E_i = np_i\) |
|---|---|---|---|
| 1 | 30.10% | 0.301 | \(784(0.301) = 235.984\) |
| 2 | 17.60% | 0.176 | \(784(0.176) = 137.984\) |
| 3 | 12.50% | 0.125 | \(784(0.125) = 98\) |
| 4 | 9.70% | 0.097 | \(784(0.097) = 76.048\) |
| 5 | 7.90% | 0.079 | \(784(0.079) = 61.936\) |
| 6 | 6.70% | 0.067 | \(784(0.067) = 52.528\) |
| 7 | 5.80% | 0.058 | \(784(0.058) = 45.472\) |
| 8 | 5.10% | 0.051 | \(784(0.051) = 39.984\) |
| 9 | 4.60% | 0.046 | \(784(0.046) = 36.064\) |

Since every expected frequency is at least 5 (the smallest is 36.064), the requirements of the test are met.
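The expected frequencies and the requirement check can be reproduced with a short sketch. The variable names are illustrative; the observed counts and proportions are the ones given in the exercise:

```python
# Observed counts for leading digits 1..9, as given in the exercise.
observed = [0, 15, 0, 76, 479, 183, 8, 23, 0]

# Benford proportions from the table, as decimals.
proportions = [0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046]

n = sum(observed)                         # total number of checks
expected = [n * p for p in proportions]   # E_i = n * p_i

# Goodness-of-fit requirement: every expected frequency is at least 5.
requirement_met = all(e >= 5 for e in expected)

for digit, e in enumerate(expected, start=1):
    print(f"E_{digit} = {e:.3f}")
print("requirement met:", requirement_met)
```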

03

State the hypotheses

The null hypothesis for conducting the given test is as follows:

The observed frequencies are the same as the frequencies expected from Benford’s law.

The alternative hypothesis is as follows:

The observed frequencies are not the same as the frequencies expected from Benford’s law.

04

Conduct the hypothesis test

The table below shows the necessary calculations:

| Leading Digit | O | E | \(O - E\) | \(\frac{(O - E)^2}{E}\) |
|---|---|---|---|---|
| 1 | 0 | 235.984 | -235.984 | 235.984 |
| 2 | 15 | 137.984 | -122.984 | 109.6146 |
| 3 | 0 | 98 | -98 | 98 |
| 4 | 76 | 76.048 | -0.048 | 0.00003 |
| 5 | 479 | 61.936 | 417.064 | 2808.421 |
| 6 | 183 | 52.528 | 130.472 | 324.0737 |
| 7 | 8 | 45.472 | -37.472 | 30.87946 |
| 8 | 23 | 39.984 | -16.984 | 7.214292 |
| 9 | 0 | 36.064 | -36.064 | 36.064 |

The value of the test statistic is equal to:

\[\begin{aligned}\chi^2 &= \sum \frac{(O - E)^2}{E}\\ &= 235.984 + 109.6146 + \cdots + 36.064\\ &= 3650.251\end{aligned}\]

Thus, \(\chi^2 = 3650.251\).

Let k be the number of digits, which is 9.

The degrees of freedom for \(\chi^2\) are computed below:

\[\begin{aligned}df &= k - 1\\ &= 9 - 1\\ &= 8\end{aligned}\]
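As a check on the arithmetic, the test statistic and degrees of freedom can be recomputed directly from the observed counts and Benford proportions. This is a minimal sketch with illustrative variable names:

```python
# Observed counts and Benford proportions for leading digits 1..9.
observed = [0, 15, 0, 76, 479, 183, 8, 23, 0]
proportions = [0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046]

n = sum(observed)
expected = [n * p for p in proportions]

# Per-digit contributions (O - E)^2 / E, then their sum.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)

# Degrees of freedom: number of categories minus one.
df = len(observed) - 1

print(f"chi-square = {chi_square:.3f}, df = {df}")
```

The contribution from digit 5 is by far the largest term, which reflects the 479 observed checks against an expected count of about 62.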

05

State the decision

From the chi-square table, the critical value of \(\chi^2\) at \(\alpha = 0.01\) with 8 degrees of freedom is 20.090.

The p-value is:

\[\begin{aligned}p\text{-value} &= P\left( \chi^2 > 3650.251 \right)\\ &\approx 0.000\end{aligned}\]

Since the test statistic value is greater than the critical value and the p-value is less than 0.01, the null hypothesis is rejected.
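The table lookup can be cross-checked without a statistics library. For an even number of degrees of freedom, the upper-tail probability has the closed form \(P(\chi^2 > x) = e^{-x/2}\sum_{j=0}^{df/2-1}\frac{(x/2)^j}{j!}\). The sketch below relies on that identity; the function name `chi2_sf_even_df` is an illustrative choice, and the approach applies only because df = 8 is even:

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability P(chi-square > x), valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j      # builds (x/2)^j / j! incrementally
        total += term
    return math.exp(-half) * total


alpha = 0.01
tail_at_critical = chi2_sf_even_df(20.090, 8)   # should be close to alpha
p_value = chi2_sf_even_df(3650.251, 8)          # far below alpha

print(f"tail area beyond 20.090: {tail_at_critical:.4f}")
print("reject H0:", p_value < alpha)
```

The tail area beyond 20.090 comes out very close to 0.01, consistent with the tabled critical value, and the p-value for the observed statistic is too small to distinguish from zero in floating point.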

06

State the conclusion

There is enough evidence to conclude that the observed frequencies are not the same as the frequencies expected from Benford’s law.

Since the observed frequencies differ substantially from the expected frequencies, the check amounts appear to be the result of fraud.


