Benford's law describes the probability distribution of leading digits in many naturally occurring data sets. It is a counterintuitive law. Some authors have likened it to Newton's law of gravitation, arguing that it is better understood as an empirical observation about reality than as a mathematical result that can be proved. Contrary to common sense, the law states that lower first significant digits, often called leading digits, occur more often than higher ones in naturally occurring data. Under very general conditions, numbers are expected to follow this digit pattern. A deviation from the Benford distribution may therefore indicate that the expected pattern has been altered externally, for example through data manipulation or fraud. Benford's law can be used as a forensic accounting and auditing tool for financial data, as popularised among scholars and accounting professionals by Nigrini's work, and it has since been used as an advanced statistical tool for detecting fraud.

In 1881, the astronomer and mathematician Simon Newcomb published in the American Journal of Mathematics the first known article describing what is now known as Benford's law. He noticed that the opening pages of library copies of logarithm tables, which covered numbers with low leading digits, were significantly more worn than the pages for higher digits. This pattern led him to conclude that fellow scientists looked up numbers beginning with the digit one more often than those beginning with two, three, and so on up to nine.

The obvious conclusion was that more numbers begin with the digit one than with larger digits. Newcomb calculated that the probability that a number has a particular non-zero first digit is:

$$P(d) = \log_{10}\left(1 + \frac{1}{d}\right) \qquad (1)$$

where d is one of the digits 1, 2, 3, 4, 5, 6, 7, 8, or 9, and P(d) is the probability that d is the leading digit.

Using his formula, the probability that the first digit of a number is one is about 30 percent, while the probability that the first digit is nine is only 4.6 percent. Table 1 below shows the expected frequencies for the leading digits 1 through 9.

Table 1

Expected frequency of digits

Digit | Expected frequency as leading digit |
---|---|
1 | 0.30103 |
2 | 0.17609 |
3 | 0.12494 |
4 | 0.09691 |
5 | 0.07918 |
6 | 0.06695 |
7 | 0.05799 |
8 | 0.05115 |
9 | 0.04576 |
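
The expected frequencies in Table 1 can be reproduced directly from the logarithmic formula. The following is a minimal sketch in Python; the function name `benford_first_digit_probability` is a hypothetical name chosen for illustration and does not come from any library.

```python
import math


def benford_first_digit_probability(d: int) -> float:
    """Expected share of numbers whose leading digit is d (1-9) under Benford's law."""
    if not 1 <= d <= 9:
        raise ValueError("d must be an integer from 1 to 9")
    return math.log10(1 + 1 / d)


# Build the full table of expected frequencies for the digits 1 through 9.
expected = {d: benford_first_digit_probability(d) for d in range(1, 10)}

for digit, probability in expected.items():
    print(f"{digit} | {probability:.5f}")
```

Because the nine terms telescope to log10(10), the probabilities sum to one, which is a quick sanity check on any implementation.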

Despite its long history, the mathematical and statistical issues posed by Benford's law have only recently been acknowledged. From a mathematical standpoint, appropriate variants of the law arise in integer sequences such as the Fibonacci and factorial sequences. The law applies to floating-point arithmetic as well. Benford's law can now be used as a forensic accounting and auditing tool for financial data, largely thanks to the work of Nigrini. The law is a helpful starting point for forensic accountants and applies in various auditing scenarios, including external, internal, and government auditing. It has also been used successfully to identify malfeasance in other areas, including electoral data, campaign funding, and economic data abnormalities.

Benford hypothesised that he looked up the logarithms of numbers with low leading digits more often because more numbers in the world begin with low digits. Unlike Newcomb, Benford investigated whether his hypothesis held. He analysed data from numerous geographical, scientific, and demographic sources. His data comprised 20 different lists with 20,229 observations in total. Benford's discovery was intriguing. He found that the leading-digit frequency distributions of various data sets, such as city populations, death rates, and river drainage areas, were logarithmic rather than arithmetic. The leading digit is a number's first digit. For example, New Zealand's population is 4,746,880, so the leading digit of New Zealand's population is 4. Because leading digits are logarithmically distributed, low digits such as 1 and 2 occur more frequently than high digits such as 8 and 9. This result contradicts the intuitive notion that each digit should lead in around 11% of cases (one out of nine possible digits).
Consequently, larger digits such as 9 are less likely to lead than smaller digits and occur less often than the 11% that intuition suggests.

Given the high cost associated with fraud, accounting professionals can benefit from a reliable and efficient method for applying Benford's law to flag questionable accounts and transactions. This work therefore aims to give a structured methodology for applying Benford's law using Python, which accounting professionals may use to quickly flag accounts and transactions requiring additional inquiry.

Nigrini demonstrated in 1994 that Benford's law could be used to detect deception or fraud. His research is based on the observation that individuals generate fabricated figures under the aforementioned psychological conditions. He is also presumed to be the first researcher to fully implement and test Benford's law on financial statements to detect potential fraud. Hill noted in 1996 that the accuracy of Benford's law in detecting accounting errors is questionable because it produces numerous false positives. In other words, misleading results could trigger additional, costly investigations. Nigrini defends the approach by arguing that digital analysis based on Benford's law provides a solid foundation for distinguishing data that show a high degree of manipulation from data with a low likelihood of manipulation, which is crucial for subsequent analyses.

Benford's law has gained prominence in auditing and forensic accounting over time. It was not acknowledged as a forensic accounting tool for detecting suspected fraud until 1990. Today, Benford's law is one of the most widely used digital analysis procedures and provides a distinctive method for data analysis. It enables forensic accountants to uncover fraud, manipulation, and other problems in accounting data.
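
A first-digit test of this kind can be sketched in a few lines of Python. The example below is an illustration only, not the methodology of Nigrini or of this work: it uses a chi-square goodness-of-fit statistic as the conformity measure, which is one common choice among several, and the names `leading_digit` and `first_digit_chi_square` are hypothetical. The threshold 15.507 is the 5% critical value of the chi-square distribution with 8 degrees of freedom.

```python
import math
from collections import Counter

# Expected leading-digit shares under Benford's law.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
CRITICAL_5PCT_DF8 = 15.507  # chi-square critical value, 8 degrees of freedom


def leading_digit(value) -> int:
    """Return the first non-zero digit of a non-zero number."""
    digits = str(abs(value)).lstrip("0.")  # drop sign, leading zeros and point
    if not digits:
        raise ValueError("zero has no leading digit")
    return int(digits[0])


def first_digit_chi_square(values):
    """Chi-square statistic of observed leading digits against Benford's law."""
    digits = [leading_digit(v) for v in values if v != 0]
    n = len(digits)
    counts = Counter(digits)
    return sum(
        (counts.get(d, 0) - n * p) ** 2 / (n * p) for d, p in BENFORD.items()
    )


def fibonacci(count):
    """First `count` Fibonacci numbers, a sequence known to follow the law."""
    a, b = 1, 1
    numbers = []
    for _ in range(count):
        numbers.append(a)
        a, b = b, a + b
    return numbers


chi2_fibonacci = first_digit_chi_square(fibonacci(1000))
chi2_uniform = first_digit_chi_square(range(100, 1000))  # every digit equally often

for name, chi2 in [("Fibonacci", chi2_fibonacci), ("100-999", chi2_uniform)]:
    verdict = "flag for review" if chi2 > CRITICAL_5PCT_DF8 else "consistent"
    print(f"{name}: chi2 = {chi2:.2f} -> {verdict}")
```

A statistic above the threshold only marks a data set for closer inspection; as noted above, the test can produce false positives, so a flag is not evidence of fraud in itself.
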
Testing whether digits comply with Benford's law shifts the focus of detection from individual transactions to the entire data set of each trader. To apply Eq. 1 as a test of the digit frequencies of a data set, four criteria should be met. First, every entry in the data set must measure a comparable phenomenon. In other words, the data cannot mix entries from two distinct phenomena, such as population census records and dental measurements. Second, there should be no predefined minimum or maximum values within the data set; the records for the phenomenon must be complete, with no contrived beginning or end values. Third, the data set should not contain assigned numbers such as telephone numbers. Fourth, the data set should have more entries with small values than with large ones. In real applications, many data sets cannot meet all four criteria. Benford's law does not apply to data that do not arise organically, including house numbers, lottery numbers, telephone numbers, dates, and weights, as shown in the research Benford compiled in his 1938 paper.

Although a mathematical proof is not needed for this explanation, the law is intuitively straightforward to grasp. Consider a company's market value. If the value is $1,000,000, it must double, that is, grow by 100 per cent, before the first digit becomes a "2". For the first digit then to become a "3", the value need only grow by a further 50 per cent, and to reach a "4" it need only grow by about 33 per cent. In many distributions of financial data, which measure the size of everything from a purchase order to stock market returns, the digit one is therefore much further from two, in relative terms, than the digit eight is from nine.
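
The arithmetic behind this intuition can be checked with a short Python sketch. It assumes, purely for illustration, a value that grows at a constant percentage rate; the function names are hypothetical. Under that assumption the share of time a value spends with leading digit d is log10((d + 1) / d), which is exactly the Benford probability.

```python
import math


def growth_to_next_digit(d: int) -> float:
    """Percentage growth that takes a value from d * 10^k to (d + 1) * 10^k."""
    return 100 / d


def share_of_time_leading(d: int) -> float:
    """Share of each tenfold increase spent with leading digit d,
    assuming a constant percentage growth rate (an assumption of this sketch)."""
    return math.log10((d + 1) / d)


for d in range(1, 10):
    print(
        f"digit {d}: needs {growth_to_next_digit(d):5.1f}% growth to move on, "
        f"leads {share_of_time_leading(d):.1%} of the time"
    )
```

The larger the growth a digit requires before it changes, the longer a steadily growing value keeps that digit, which is why 1 leads so much more often than 9.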

Consequently, the empirical evidence indicates that, for these distributions, small first significant digits are considerably more likely than large ones. For more than ninety years, mathematicians and statisticians have offered numerous explanations for this phenomenon. Raimi's 1976 article surveys a variety of less rigorous theories, ranging from Goudsmit and Furry's notion that the phenomenon is the product of "the way we write numbers" to Furlan's theory that Benford's law reflects a profound "harmonic" reality of nature. It was not until 1995 that the mathematician Hill produced a proof of Benford's law and illustrated its application to stock market data, census information, and accounting data. He emphasised that, like the normal distribution, Benford's distribution is an empirically observable phenomenon. Hill's proof relies on the fact that the numbers in sets that follow the Benford distribution come from second-generation distributions, that is, composites of other distributions. If distributions are chosen at random, and samples are drawn at random from each distribution, then the significant-digit frequencies of the combined sample converge to Benford's distribution, even if the individual distributions do not strictly follow the law. The secret lies in combining numbers from various sources.
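
The effect of combining sources can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reproduction of Hill's proof: each draw first picks a distribution at random (uniform or exponential, with a scale chosen log-uniformly between 1 and 1,000,000, all of which are choices made here for illustration) and then samples one value from it. The function names and the seed are hypothetical.

```python
import math
import random
from collections import Counter

BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(value: float) -> int:
    """First non-zero digit of a positive number, read from its text form."""
    return int(str(value).lstrip("0.")[0])


def digit_shares(values):
    """Observed share of each leading digit 1-9 in a list of positive numbers."""
    counts = Counter(leading_digit(v) for v in values)
    return {d: counts.get(d, 0) / len(values) for d in range(1, 10)}


def draw_from_random_distribution(rng: random.Random) -> float:
    """Pick a distribution at random, then draw one positive value from it."""
    scale = 10 ** rng.uniform(0, 6)  # assumed scale range: 1 to 1,000,000
    family = rng.choice(["uniform", "exponential"])
    value = 0.0
    while value == 0.0:  # a draw of exactly zero has no leading digit
        if family == "uniform":
            value = rng.uniform(0, scale)
        else:
            value = rng.expovariate(1 / scale)
    return value


rng = random.Random(42)  # fixed seed so the sketch is repeatable
n = 50_000

# A single uniform distribution on its own: every leading digit is about equally likely.
single = [x for x in (rng.uniform(0, 1000) for _ in range(n)) if x > 0]
single_shares = digit_shares(single)

# Samples pooled from many randomly chosen distributions.
pooled_shares = digit_shares([draw_from_random_distribution(rng) for _ in range(n)])
max_gap = max(abs(pooled_shares[d] - BENFORD[d]) for d in BENFORD)

for d in range(1, 10):
    print(f"{d}: single {single_shares[d]:.3f}  pooled {pooled_shares[d]:.3f}  "
          f"Benford {BENFORD[d]:.3f}")
```

In this setup no single uniform or exponential distribution follows the law by itself, yet the pooled sample tracks the Benford frequencies closely. The log-uniform choice of scale over whole powers of ten is what makes the pooled data scale-neutral here; other ways of choosing the distributions may converge more slowly or not at all.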