Thursday, June 26, 2008


I’m in a rather pensive and introspective mood these days. Such days are usually my most creative ones too. I remember doing some artwork when my mood was at its worst; perhaps I’ll share it with my shareholders here one day.

I’ve been thinking a lot about probability and chance since I’m in the midst of reading Nassim Nicholas Taleb’s The Black Swan. Let me ask these 2 questions:

Question A: 90% of residents living in Tanah Merah are rich. I live in Tanah Merah, so what’s the probability of me being rich?

Question B: 90% of residents living in Tanah Merah are rich. I’m going to live in Tanah Merah, so what’s the probability of me being rich?

(Interesting note: I’ve had to answer this kind of question more often than I’d like, so I thought it’d be quite interesting to think harder about it while I was traveling on the bus today)

In question A, the probability of me being rich is 90%. Since I live in Tanah Merah and I’m considered a resident there, I’m part of the sample from which the 90% figure was calculated. Across 100 alternate, parallel realities, I’d be rich in (on average) 90 of them.

In question B, the probability becomes unknown. If one gives it only a fleeting thought, it’s quite tempting to think that the probability of me being rich is 90% too, based on ‘statistical data’. But did you notice that there is not enough information to determine the probability of me being rich? The 90% describes people who already live there; someone who is about to move in was never part of that sample. If I’m already living in Tanah Merah, my probability is 90%. Yet if I’m going to live in Tanah Merah, the probability cannot be determined.

Thus, the statistic computed from past data will change when a newcomer enters the database. Yet the past probability cannot be extended to the newcomer.
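To make the point concrete, here is a small sketch in Python. Everything in it is made up for illustration: the function name, the 1,000 current residents, the 100 newcomers, and especially the guesses for how likely a newcomer is to be rich, which is exactly the number Question B doesn’t give us.

```python
def rich_share_after_arrivals(residents, rich_share, arrivals, arrival_rich_share):
    """Return the share of rich residents once the newcomers are counted."""
    rich_before = residents * rich_share
    rich_arrivals = arrivals * arrival_rich_share
    return (rich_before + rich_arrivals) / (residents + arrivals)


# Hypothetical numbers: 1,000 current residents, 90% of them rich.
# The newcomers' own chance of being rich is the unknown in Question B,
# so we try a few guesses instead of assuming 90%.
for guess in (0.1, 0.5, 0.9):
    share = rich_share_after_arrivals(1000, 0.9, 100, guess)
    print(f"newcomers {guess:.0%} rich -> neighbourhood {share:.1%} rich")
```

The neighbourhood figure stays at 90% only if the newcomers happen to be 90% rich themselves. With any other guess the old statistic shifts, and in no case does the old 90% tell us which guess is the right one.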

Do we commit the same logical error when we chase after historical results? Do we place too much faith in extrapolating past data to predict the future?

I can think of more such (hypothetical) questions:

1. A fund manager has a 90% chance of making good returns, based on past data. If I was already invested when the 90% figure was calculated, I have a 90% chance of good returns (I mean that in 9 out of 10 alternate, parallel realities, I had good returns). But if I’m only thinking of investing with this fund manager, do I still have a 90% chance of good returns?

2. 9 out of 10 adults develop cancer in their lifetime, i.e. a 90% chance of getting cancer in one’s lifetime, based on historical data. Do I also have a 90% chance of getting cancer?

3. From past statistics, 90% of people who cross a certain road get into accidents. I’m going to cross that road now; do I have a 90% chance of getting into an accident?

4. From my records, 90% of my students get an A for mathematics after I tutor them. You’re going to be my student; will you also have a 90% chance of getting an A for mathematics?

5. From past data, 90% of traders fail to make money. I’m going to be a trader; does that mean I have a 90% chance of failing to make money?

Don’t get me wrong, I’m not saying historical data are unimportant or that one shouldn’t look at them when trying to predict the future. What I’m saying is that one shouldn’t treat past data as sacred. The future remains unpredictable with or without a good track record, hence it’s better to be on the safe side when making predictions. Always prepare yourself for the low-probability outlier events (a.k.a. black swans).