Would you take this bet? A friend is playing a dice game at a casino: if the roll is even he loses $10, otherwise he wins $10. In a day he gets through 100 rolls. Every time he loses he says “My luck will turn, my odds of winning the next roll are better than 50%”. You recognize this as a classic case of the “Gambler’s Fallacy”: each roll of the die is independent, so previous outcomes do not affect the next one, and you explain this to your friend. He does not believe you, but he is scientifically minded and proposes an honest bet to settle the matter. Over the day you count how many times he is right (a win follows a loss) and how many times he is wrong (another loss follows a loss). At the end of the day, if he was right more often than wrong you give him $100, if he was wrong more often he gives you $100, and if it is a draw no money changes hands. Would you accept his bet, reject it, or not mind either way?
The dice rolls are independent, and as a matter of practice we know that casinos don’t lose money. So intuition suggests we should accept his bet and make some easy money. To answer the question, let us consider the following two bets:
1. We roll a fair die 100 times. Whenever we see an “even” number, we take note of the next roll: if it is “odd” I get a coin from you, if it is “even” I give you a coin. Should you take the bet, reject it, or not mind either way?
2. We play as in (1), but instead of exchanging a coin each time we keep score for the duration of the game: I get a point each time an “even” is followed by an “odd”, and you get a point each time an “even” is followed by an “even”. Whoever has more points at the end wins the bet. Should you take the bet, reject it, or not mind either way?
The first bet corresponds to our friend’s bet with the casino, and the second corresponds to his bet with us. To see what we should do, let’s consider a shorter game that lasts just 3 rolls, list every possible sequence of rolls, and count the outcomes in a table.
The first column lists every possible sequence of rolls in the game (E for even, O for odd), with the rolls shown in order from left to right. All sequences are equally likely because the die is fair. The second column counts the number of times an E is followed by an E, which is e.g. 2 times for the first sequence, EEE. The third column counts the number of times an E is followed by an O. The final column shows which of the two previous columns is larger, and from it we can finally see whether we should take bet (2) above and whether our gambling friend is right or wrong about the “Gambler’s Fallacy” in the original bet. To refresh: he is betting on his “luck turning”, i.e. on E followed by O, since he loses to the casino when the roll is E. Reading the final column of the table, we see that among the games that do not end in a draw our gambling friend wins 3 out of 5 times (and you 2 out of 5), thus on average making $20 per decided game from you… wait, what?!
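The table described above is easy to rebuild by brute force. The following is a minimal Python sketch, not the original author's code; the function names `count_pairs` and `build_table` are made up for this illustration. It lists every sequence of E/O rolls, counts how often E is followed by E and by O, and records which count is larger.

```python
from itertools import product


def count_pairs(seq):
    """Return (ee, eo): how often an E is followed by E, and by O, in seq."""
    pairs = list(zip(seq, seq[1:]))
    ee = sum(1 for a, b in pairs if a == "E" and b == "E")
    eo = sum(1 for a, b in pairs if a == "E" and b == "O")
    return ee, eo


def build_table(n_rolls=3):
    """One row per possible sequence: (sequence, ee, eo, larger column)."""
    rows = []
    for rolls in product("EO", repeat=n_rolls):
        seq = "".join(rolls)
        ee, eo = count_pairs(seq)
        if ee > eo:
            larger = "EE"
        elif eo > ee:
            larger = "EO"
        else:
            larger = "draw"
        rows.append((seq, ee, eo, larger))
    return rows


if __name__ == "__main__":
    print("seq  E->E  E->O  larger")
    for seq, ee, eo, larger in build_table():
        print(f"{seq}  {ee:>4}  {eo:>4}  {larger}")
```

For three rolls this gives 8 rows: E followed by E is the larger count in 2 of them, E followed by O in 3, and the remaining 3 are draws (two of those, OOE and OOO, simply have no E before the last roll).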
Is something wrong with the “Gambler’s Fallacy”? If there is, he could make money from the casino. Instead of betting on every roll of the die, what if he waits until he sees an E and then bets that the next roll will be O: will he make money? The answer can also be read from the table. The 2nd column gives the number of bets he loses to the casino in each sequence, and the 3rd column the number of bets he wins from it. Summed over all eight sequences, both columns total 4, and since each sequence is equally likely, on average no money changes hands. Hmm, reassuring, but odd.
Hence the Gambler’s Paradox: how can our friend make no money from the casino, yet make money from our “is the Gambler’s Fallacy real” bet? The key is that while the two situations look similar, they are (clearly, by now) not equivalent. In (2) we are considering the “probability of E given prior E within a given sequence”: the comparison is made inside each sequence first, so every sequence counts equally no matter how many E's it contains. In (1) we are considering the “probability of E given prior E across all sequences”: every E-followed-by-something pair counts equally, so sequences with more E's carry more weight. One might intuitively guess that the two are equal (or that they are on average), but it turns out they are not. Intuition sometimes trips us up.
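To make the difference between the two calculations concrete, here is a small Python sketch (an illustration under the same 3-roll setup, with the hypothetical function name `compare_averages`). It computes the share of E-after-E in two ways: pooled over all sequences, and as an average of per-sequence shares, skipping sequences that have no E before the last roll because they provide no data.

```python
from fractions import Fraction
from itertools import product


def compare_averages(n_rolls=3):
    """Return (pooled, averaged) shares of 'E followed by E'.

    pooled:   all E-followed-by-something pairs from all sequences together.
    averaged: the share is computed inside each sequence, then averaged.
    """
    pooled_ee = 0
    pooled_total = 0
    per_sequence = []
    for rolls in product("EO", repeat=n_rolls):
        # Outcomes of every roll that comes directly after an E.
        after_e = [b for a, b in zip(rolls, rolls[1:]) if a == "E"]
        if not after_e:
            continue  # no E before the last roll: nothing to measure
        ee = after_e.count("E")
        pooled_ee += ee
        pooled_total += len(after_e)
        per_sequence.append(Fraction(ee, len(after_e)))
    pooled = Fraction(pooled_ee, pooled_total)
    averaged = sum(per_sequence) / len(per_sequence)
    return pooled, averaged


if __name__ == "__main__":
    pooled, averaged = compare_averages(3)
    print("pooled over all sequences:", pooled)
    print("average of per-sequence shares:", averaged)
```

For 3 rolls the pooled share is exactly 1/2, which is what the casino sees, while the average of the per-sequence shares is 5/12, which is what a sequence-by-sequence analysis sees.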
The issue can crop up as an error in data aggregation. For example, suppose we are given hourly stock prices for each day and want to predict stock movements for trading, so we calculate “will the stock price go up in the next hour given that it went up in the prior hour” for each day and then average over days. This does not look like an obviously problematic procedure, but in reality we have made a slip-up. Such errors happen in practice [1]. In behavioral research, researchers investigated whether there is such a thing as a “hot hand”: if a basketball player scores, do the odds of scoring the next shot increase? To find out, they placed a set of players at spots from which they scored 1/2 of the time, and each player took 100 shots. Then, for each player, the analysis checked whether the odds of scoring went up following a “streak”, and if they were still 1/2 the conclusion was that they did not. But as we just saw, the expected share of hits following a streak, measured within a single player's sequence, is actually below 1/2 even when every shot is an independent coin flip. Whoops.
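The same effect can be checked by simulation. The sketch below is only an illustration under stated assumptions, not the procedure from [1]: every shot is an independent 50/50 coin flip, a “streak” is taken to mean 3 hits in a row, and the function names, the number of players, and the seed are arbitrary choices made for this example.

```python
import random


def hits_after_streak(shots, k):
    """Return (hits, opportunities) for shots taken right after k hits in a row."""
    hits = 0
    opportunities = 0
    for i in range(k, len(shots)):
        if all(shots[i - k:i]):  # the previous k shots were all hits
            opportunities += 1
            hits += shots[i]     # True counts as 1, False as 0
    return hits, opportunities


def simulate(n_players=5000, n_shots=100, k=3, seed=0):
    """Return (average of per-player shares, pooled share) of hits after a streak."""
    rng = random.Random(seed)
    per_player = []
    pooled_hits = 0
    pooled_opportunities = 0
    for _ in range(n_players):
        # Assumption: no hot hand at all, each shot is a fair coin flip.
        shots = [rng.random() < 0.5 for _ in range(n_shots)]
        hits, opportunities = hits_after_streak(shots, k)
        if opportunities == 0:
            continue  # this player never had a streak, so there is no data
        per_player.append(hits / opportunities)
        pooled_hits += hits
        pooled_opportunities += opportunities
    averaged = sum(per_player) / len(per_player)
    pooled = pooled_hits / pooled_opportunities
    return averaged, pooled


if __name__ == "__main__":
    averaged, pooled = simulate()
    print(f"average of per-player shares: {averaged:.3f}")
    print(f"pooled share over all shots:  {pooled:.3f}")
```

With these settings the pooled share should land close to 1/2, while the per-player average should come out visibly below it, even though no player in the simulation has a hot (or cold) hand.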
Probability is fun, but it is like the ocean: one needs to be careful out there.
[1] J. Miller and A. Sanjurjo, 2018