Dow Jones and Mathematica
A recent post by economist-blogger Brad DeLong, which was also picked up by Matthew Yglesias, mused upon the way the Dow Jones Industrial Average seems to cluster near values starting with 1. He showed a chart with the years 1971–1984 and 1996–2008 circled, when the Dow appeared to fluctuate near 1000 and 10000, respectively. Many commenters quickly jumped in to point out that this was an example of Benford’s Law, which says, essentially, that if you’re throwing darts at a logarithmically shaped dartboard, you’re going to hit “1” more often than any other digit. If you pick random values of some phenomenon that is logarithmically distributed, you should get values beginning with “1” about 30% of the time, which makes sense if you’ve ever looked at log-scale graph paper.
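For reference, the usual statement of Benford’s Law gives the probability of a leading digit d as the width of the interval from d to d + 1 on a log scale:

```latex
P(d) = \log_{10}(d + 1) - \log_{10}(d) = \log_{10}\!\left(1 + \frac{1}{d}\right), \qquad d = 1, \dots, 9
```

For d = 1 this is log10(2), or about 0.301, which is where the “about 30%” figure comes from.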
It occurred to me that this is an easy thing to investigate with Mathematica, much like my earlier post on the Bailout. Mathematica 6 includes access to a huge library of curated data, including historical values of the Dow Jones Industrial Average and other indices (and individual stocks, and so forth). The function here is FinancialData, which Wolfram cautions is experimental: I believe they get the data from the same source as, say, Yahoo! Finance, and just do the conversions to make it automatically importable into Mathematica. That is, it is no more reliable than other web-based archives. The computations are absurdly easy, taking only a few lines of Mathematica code.
The graph I (eventually) produced shows the relative frequencies of first digits that are calculated by Benford’s Law, together with the relative frequencies of the leading digits from the Dow Jones Industrial Average, the S&P 500, the NASDAQ Composite index, the DAX 30, and the Nikkei 225:
It appears that, for both the Dow and the S&P 500, there is a surplus of 1s as leading digits, and also of 8s and 9s. None of the indices particularly follows Benford’s Law, although I’ll leave it to the more statistically inclined to figure out the significance of the variations.
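The Benford reference values used in that comparison can be computed in one line. This is a minimal sketch, assuming the standard formula log10(1 + 1/d) for leading digit d; the name benfordPercent is my own and not part of the original notebook.

```mathematica
(* Expected leading-digit percentages under Benford's Law, for digits 1 through 9. *)
(* Log is listable, so it is applied to each element of 1 + 1/Range[9].          *)
benfordPercent = 100 N[Log[10, 1 + 1/Range[9]]]
```

The first entry comes out at about 30.1 and the last at about 4.58, so any index that followed the law closely would show percentages falling steadily from digit 1 to digit 9.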
To fetch the historical closing prices, simply tell FinancialData what date range you wish to look at; in this case, it was All:
dowJonesData = FinancialData["^DJI",All];
Use the semicolon to prevent a 20,000-element list from being displayed. (Actually, Mathematica 6 is smart enough to know that one usually doesn’t want to look at lists that long, and would show a shortened version.) FinancialData uses the standard ticker symbols for the indices: ^GSPC for the S&P 500, ^IXIC for the NASDAQ Composite, ^GDAXI for the DAX 30, and ^N225 for the Nikkei 225. The function returns a list of pairs, each consisting of a date and a closing value, such as:
In[]:= dowJonesData[[1]]
Out[]= {{1928, 10, 1}, 240.01}
I wrote a small function that captures the first digit:
FirstDigit[x_] := RealDigits[x][[1, 1]]
RealDigits returns two things: a list of the digits, and the number of digits to the left of the decimal point. The [[1,1]] element specifier therefore picks out the first element of the first of these, i.e. the first digit from the list of digits.
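A few quick checks make the behavior concrete. This is only a sketch: the definition is repeated so the snippet stands alone, and the inputs other than 240.01 are made-up test values rather than index data.

```mathematica
FirstDigit[x_] := RealDigits[x][[1, 1]]

RealDigits[240.01][[2]]   (* 3: three digits to the left of the decimal point *)
FirstDigit[240.01]        (* 2 *)
FirstDigit[10000.]        (* 1 *)
FirstDigit[0.05]          (* 5: leading zeros are not part of the digit list *)
```

The last case does not arise with index values, but it shows that the function returns the first significant digit and not merely the first character of the number as printed.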
dowJonesDigits = FirstDigit /@ Transpose[dowJonesData][[2]]
Since dowJonesData is a list of date/value pairs, Transpose transforms it into lists of dates and values, and the second of those is what we need to work with. The shorthand for Map is /@, which applies the function on the left, FirstDigit, to each element of the list on the right. To tally the number of each leading digit, I use BinCounts:
dowJonesDigits=Rest[BinCounts[dowJonesDigits]]
dowJonesDigitFractions=N[dowJonesDigits/Length[dowJonesData]]
Since there are no zeroes as leading digits, but BinCounts starts its tally with zero, I use Rest to remove the first element of the returned list. The N is necessary to avoid getting a bunch of (exact) fractions when computing the percentages.
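The whole pipeline can be tried on a handful of values before running it on 20,000. The sketch below uses made-up date/value pairs in the same shape as dowJonesData (they are not real quotes) and passes BinCounts an explicit bin specification, which is a variation on what I did above. I am assuming that, with automatic bins, the returned list could come out shorter than nine entries when the largest digits never occur; explicit bins sidestep that and leave no zero bin to drop.

```mathematica
(* Made-up date/value pairs in the shape shown for dowJonesData; not real quotes *)
sampleData = {{{2000, 1, 3}, 240.01}, {{2000, 1, 4}, 1051.7},
   {{2000, 1, 5}, 987.3}, {{2000, 1, 6}, 10012.5}, {{2000, 1, 7}, 8120.}};

FirstDigit[x_] := RealDigits[x][[1, 1]]

(* [[All, 2]] extracts the values; it is equivalent to Transpose[sampleData][[2]] *)
sampleDigits = FirstDigit /@ sampleData[[All, 2]]    (* {2, 1, 9, 1, 8} *)

(* Explicit bins [1,2), [2,3), ..., [9,10) always give exactly nine counts *)
sampleCounts = BinCounts[sampleDigits, {1, 10, 1}]   (* {2, 1, 0, 0, 0, 0, 0, 1, 1} *)

(* N turns the exact rationals into machine numbers *)
sampleFractions = N[sampleCounts/Length[sampleDigits]]
```

With explicit bins, the counts for every index line up digit by digit, even for an index such as the DAX 30 that never closed at a value beginning with 9.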
I won’t go into detail about the production of the graph (or the table): this is the type of thing that often trips me up in Mathematica. Although it is true that Mathematica has nearly unlimited flexibility to make graphs, finding the right method to do something that isn’t one of the default options often leads to a frustrating cycle of tweaking, looking at the examples, and experimentation. In this case, getting the legend to come out right was a sticking point.
The same is true of the following table, which I present for anyone wishing to think about the data in more detail. Figuring out how to show the right number of decimal places in each entry took much longer than it should have.
In any case, here are the individual percentages presented in the graph above, along with the total number of data points for each sample set, and the first day for which FinancialData data is available.
Index | Benford | Dow Jones | S&P 500 | NASDAQ | DAX 30 | Nikkei 225 |
Start | | Oct 1, 1928 | 01/03/50 | 02/05/71 | 11/26/90 | 01/04/84 |
# | | 20124 | 14817 | 9536 | 4539 | 6123 |
1 | 30.1 | 35.42 | 36.3 | 36.71 | 16.59 | 65.31 |
2 | 17.61 | 12.65 | 11.24 | 23.7 | 19.78 | 22.85 |
3 | 12.49 | 6.39 | 6.93 | 10.58 | 11.99 | 6.11 |
4 | 9.69 | 4.78 | 11.0 | 8.15 | 16.13 | 0 |
5 | 7.92 | 3.72 | 5.8 | 3.24 | 15.71 | 0 |
6 | 6.69 | 5.5 | 4.75 | 3.16 | 11.35 | 0 |
7 | 5.8 | 5.99 | 4.22 | 6.66 | 7.91 | 0.42 |
8 | 5.12 | 13.78 | 7.96 | 3.18 | 0.55 | 2.61 |
9 | 4.58 | 11.78 | 11.8 | 4.62 | 0 | 2.69 |
Update, 20 Nov 2008: fixed Nikkei column in table.