💡 TIL

📈🧐 Bayesian Thinking

February 10, 2022 · 10 min read
Image of a chart
Ups and downs

Here's another late-night YouTube video spiral inspired post. I will also have a counter for the number of run-on sentences I tend to have on these posts as a constant reminder of how erratically sprawling my thought process is.

Recently I've been reading about critical thinking and the 'scientific' approach to rationality. I've come to realize a lot of the information out there can be boiled down to a few very core concepts. If it's the critical thinking parts, you usually get the whole Greco-Roman biases and other "debate-bro" gotchas like Strawmanning, Slippery slope etc. And if it's rationality, then it usually boils down to probabilities like Ockham's Razor.

Ockham's Razor mini-rant coming soon 😤

Another very interesting concept in this realm is Bayes' Theorem. It is a mathematical formula used to determine conditional probability. Aaaand it serves as a good baseline for both critical thinking and scientific rationality.

What's conditional probab- snore 💤

It's the chance of something occurring given that something else has already happened. In other terms, what's the 'likelihood' of something being true given what you've already seen.
Bayes Graph pic

Courtesy of the super informative 3B1B video on Bayesian Terms

The formula is hella mathematical and really serves no use to like 90% of people just by itself. But like most things in math, it's not the order and the arrangement of symbols that's important; it's what the symbols represent. Here's the formula and here's the most popular accompanying visual.

Bayes Formula pic
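
For reference, this is the standard textbook statement of the theorem. I'm writing H for the hypothesis and E for the evidence, which is just a common letter choice and not something the rest of this post depends on:

```latex
P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)}
```

Read it as: how much you believe H after seeing E equals how well H predicts E, times how much you believed H beforehand, scaled by how common E is overall.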

This is pretty confusing so I'll start with an example. I took inspiration from the awesome Julia Galef on this one

Let's say you meet a person on a college campus and have a conversation with them. And in this conversation, they come across shy and withdrawn. Now let's say someone asks you "Do you think this person is a Math PhD student or a Business student?".

What would your answer be?

Based on the observation of shy behavior and based on general stereotypes, most people would answer Math PhD. And that's probably a good bet right?
I mean how many Business students do you see acting shy and withdrawn?
You're more likely to see a Doctoral student (an informal pejorative "nerd") come across shy, introverted, avoiding eye-contact and such.

But a crucial piece of evidence that's missing here is -

How many students are Math PhDs versus Business students in the first place?

Maybe you've suddenly had the realization of "Ah dang, a Math PhD requires you to be smart but it's much easier to get into a Business School", and you are now correcting your initial guess to business student.

Be wary.
You are falling into a Bayesian trap. While it is important to consider the number of Math PhD students versus Business students, it is also important to know how many students in each group you would expect to come across as shy and withdrawn.

Let's dump some numbers into this thought experiment

Let's say for every 1 Math PhD student there are 10 Business students. That's a fair ratio right?

You might even put the number of Biz Students higher but for simplicity's sake let's keep it 1:10.

So let's take a sample set or a group of 220 students. Out of them 20 students are Math PhDs and the other 200 are business students.

We can call this bit of information our prior, a.k.a. the knowledge we should have had beforehand, before jumping to a guess and making assumptions.

Remember at this point, it is 10 times more likely for the person to be a business student than a Math PhD. Pretty low chances for the Math side right.

Now let's bring in our own judgement based on personal experiences.

Let's say that of the Math PhDs you've met, 75% came across as shy and reserved. And let's say for the business students this percentage is 15%.

If 75% of the 20 Math PhD students come across shy, then we can say that's 15 of them.
Similarly, if 15% of the 200 Business students come across shy, that's 30 of them.

Now with this piece of information can you intuitively guess what's the likelihood of the person being a Math PhD student?

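It's as simple as taking the ratio of shy Math PhDs to shy business students: 15:30 or 15/30, which ends up being 1/2 or 1:2.

In odds terms, that's the prior multiplied by the ratio of the two shy rates: 1:10 × 75:15 = 75:150 = 1:2

Exactly twice as likely for a shy student to be a business student as a Math PhD. Put as a probability, the chance of a Math PhD is 15 out of the 45 shy students, or about 33%.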
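
If you'd rather check the arithmetic than take my word for it, here's a minimal Python sketch of the same counting. The variable names are made up for illustration, and the 75% and 15% are the gut-feel guesses from above, not measured data.

```python
# Head counts from the example: 20 Math PhDs and 200 Business students.
math_total, biz_total = 20, 200

# Assumed share of each group that comes across as shy (personal guesses).
shy_rate_math, shy_rate_biz = 0.75, 0.15

shy_math = math_total * shy_rate_math  # shy Math PhDs
shy_biz = biz_total * shy_rate_biz     # shy Business students

# Ratio of shy Math PhDs to shy Business students (the 1:2 odds).
odds_math_to_biz = shy_math / shy_biz

# Probability that a shy person is a Math PhD:
# shy Math PhDs divided by ALL shy students, not just the shy Biz ones.
p_math_given_shy = shy_math / (shy_math + shy_biz)

print(f"shy Math PhDs: {shy_math:.0f}, shy Biz students: {shy_biz:.0f}")
print(f"odds Math:Biz = {odds_math_to_biz:.2f}")
print(f"P(Math PhD | shy) = {p_math_given_shy:.3f}")
```

Note the difference between the last two numbers: the odds compare Math against Biz, while the probability compares Math against everyone who is shy.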

Okay..... what's the big deal? The extra knowledge of my own judgement didn't really change the likelihood of him being a business student

It might not have swung it entirely in the other direction. But it moved it just a bit further than what it was "prior" to the new piece of information.

A 1:10 --> 1:2 jump is basically a 5x increase.
But still, it's twice as likely for him to be a business student as a Math PhD
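
That 5x isn't a coincidence. In the odds form of Bayes' rule, the odds after the evidence are the odds before it multiplied by the ratio of the two shy rates. That ratio is usually called the likelihood ratio or Bayes factor (a name I'm borrowing here; the sources below may phrase it differently):

```latex
\underbrace{\frac{1}{10}}_{\text{prior odds}}
\times
\underbrace{\frac{0.75}{0.15}}_{\text{likelihood ratio}\,=\,5}
=
\underbrace{\frac{1}{2}}_{\text{posterior odds}}
```

So the evidence is worth a factor of 5 no matter what prior you walked in with. What changes is where that factor of 5 leaves you.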

And it is super important to realize that.

  1. You started out with an assumption that the person is more likely to be a Math PhD. Probably overwhelmingly so.
  2. Then I presented you with a piece of information that made you re-think your beliefs and you said "1:10? Those are big numbers. The person is more likely a Biz Student based on those numbers alone"
  3. BUT THEN, a new piece of evidence came in and nudged your belief just that much, to "Hmmm? The percentage of students who are shy in the Math and Biz programs has it in favor of Math students... so maybe given the fact that this person is shy....he's more likely to be a Math student than I thought a moment ago, but a business student is still the better bet"

Voila

This is the science behind modifying our thoughts and judgement based on new pieces of evidence that surface. Initially we could say it was an overwhelming chance of the student being a Math PhD student (you could say you were very sure so maybe like 90% sure).

After the new evidence of how many students there are AND how they're distributed across both programs, that flipped to the person being 10 times more likely a business student.

But then, after the new evidence of how likely each kind of student is to be shy, the gap narrowed to only twice as likely a business student.

Did you just try to pass off how humans think biologically as some sort of mindblowing math?

Yes but it is so important to observe how we jumped from 10x to 2x. It came to light that the likelihood of being shy was higher for Math PhDs than for Biz students (which was our initial intuition right?). But that only comes when you think about the evidence based on the prior. It did not come from some random new study that someone comes up and tells you about. It came from natural intuition.

And that means, we could formulate how we generally think and prove how our minds can swing based on new information.

The important word here is CAN

Now conversely, think about this.

You go into the same conversation about the person being a Math PhD vs Business Student.

BUT

Someone tells you they've conducted a study and
every single Math PhD student they've come across is shy and withdrawn.

Let's visualize this with numbers again. The total number of students remains the same.
Your likelihood for a Math PhD student being shy is now at 100%. And for business students it is still the same at 15%.

The initial numbers now reflect

  • 100% of 20 Math PhD students is 20
  • 15% of 200 Biz students is 30

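The ratio is now 20/30 = 2:3
The jump is now from 1:2 previously to 2:3 because of someone's researched study.
As a Math-to-Biz ratio that's 50% going to 66.67%. As the actual chance that the shy person is a Math PhD, it's 15/45 ≈ 33.3% going to 20/50 = 40%, an increase of about 6.7 percentage points.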
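
One thing worth spelling out: 20/30 is the ratio of shy Math PhDs to shy Biz students, while the probability that a shy person is a Math PhD divides by all shy students. Plugging this scenario into the standard formula (the labels Math, Biz and shy are just my shorthand for the groups in this example):

```latex
P(\text{Math} \mid \text{shy})
= \frac{P(\text{shy} \mid \text{Math})\, P(\text{Math})}
       {P(\text{shy} \mid \text{Math})\, P(\text{Math}) + P(\text{shy} \mid \text{Biz})\, P(\text{Biz})}
= \frac{1.00 \cdot \frac{20}{220}}
       {1.00 \cdot \frac{20}{220} + 0.15 \cdot \frac{200}{220}}
= \frac{20}{50} = 0.40
```

Swap the 1.00 back to 0.75 and the same expression gives 15/45, or about 0.33.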

Okay? And?

But let's say we tweak the other side and your prior information was entirely different in the first place. And let's say the evidence remains the same as in the original question: 75% of Math PhD students are shy and 15% of Business students are shy.

You study at Harvard so you are WAY more likely to find business students there than Math PhDs. You disagree with the initial ratio of 10 business students for every 1 Math PhD student. It's now 100 business students for every 1 Math PhD student. How does this affect our original ratios?
The total number of students we'll take will be 20 Math PhD students + 2000 Business students.

Well based on that alone, our initial ratio is 1:100, or 100 times more likely for a person to be a business student than a Math PhD student.

And if we now consider the evidence,

  • 75% of 20 Math PhDs is still 15
  • 15% of 2000 Biz students is 300

The ratio is now 15/300 = 1:20
The jump is now from 100 times more likely to 20 times more likely.

And now let's say the same researcher from before comes up to you and tells you the same data, that every Math PhD they've met is shy. We move the shy Math PhDs from 75% to 100%, and the 15% for shy Business students remains.

  • 100% of 20 Math PhDs is 20
  • 15% of 2000 Biz students is 300

The ratio is now 20:300 --> 1:15, or 15 times more likely
The jump is from 20 to 15, which might seem big enough, but as a Math-to-Biz ratio it's 5% --> 6.67%. As the actual chance that the shy person is a Math PhD, it's 15/315 ≈ 4.8% --> 20/320 = 6.25%.

That's an increase of just 1.5 percentage points.

Completely. Insignificant.

Even though we are given new evidence telling us every single Math PhD student they've seen has been shy

The likelihood is still only about 6%.
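
To see all four scenarios side by side, here's a rough Python sketch. The function name `p_math_given_shy` and the scenario labels are my own for illustration, and the shy rates are still the made-up guesses from this post, not real data.

```python
def p_math_given_shy(n_math, n_biz, shy_rate_math, shy_rate_biz):
    """Chance that a shy student is a Math PhD, from head counts and shy rates."""
    shy_math = n_math * shy_rate_math
    shy_biz = n_biz * shy_rate_biz
    return shy_math / (shy_math + shy_biz)


# (label, Math PhD count, Business student count, shy rate among Math PhDs)
scenarios = [
    ("1:10 prior, 75% shy", 20, 200, 0.75),
    ("1:10 prior, 100% shy", 20, 200, 1.00),
    ("1:100 prior, 75% shy", 20, 2000, 0.75),
    ("1:100 prior, 100% shy", 20, 2000, 1.00),
]

results = {}
for label, n_math, n_biz, shy_rate_math in scenarios:
    # The Business shy rate stays at the assumed 15% in every scenario.
    results[label] = p_math_given_shy(n_math, n_biz, shy_rate_math, 0.15)
    print(f"{label:>22}: P(Math PhD | shy) = {results[label]:.2%}")
```

The two 1:10 rows work out to 15/45 and 20/50, while the two 1:100 rows work out to 15/315 and 20/320. Same evidence, same jump from 75% to 100%, but the lopsided prior leaves far less room for it to matter.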

The converse works as well. Say you study at Caltech or MIT, where there are WAY more Math PhDs to come across than Biz students, and someone presents a study with data that Biz students are extremely shy. The jump is going to be very insignificant.

Conclusion 😵‍💫

This is ultimately how and why Bayes Theorem is so important in our real world and is applicable to everyone.

If your priors are extremely polarized, a.k.a. you have very strong initial beliefs, your personal opinions barely move. Not even extremely compelling study data that suggests otherwise shifts them by much.

  • Imagine getting a diagnosis from a Dr. House whose initial diagnosis is always Lupus. Now imagine his core belief is so strong that he never listens to the rest of his team of doctors, who are chiming in with opinions of what the diagnosis could be when the treatment fails to work.

  • Another example is a scientist performing research who holds a hypothesis to be true even when all the evidence points to the opposite. They eventually start to consciously or subconsciously alter data to prove their hypothesis correct. And science overall suffers.

IRL this can be thought of in terms of politics and thought conditioning. Think of people on the far ends of the political spectrum whose ideologies are so extreme that their beliefs remain unchanged even when studies suggesting otherwise are presented to them.

But, if you are thinking rationally and understand not every decision is black and white, your priors are going to be meaningful and reflect the reality we all live in.
And this means meaningful evidence can swing opinions in other directions to make a significant difference in how you approach problems.

A thought process that is very binary is very hard to sway in terms of opinions held.

Further reading

  1. On the Importance of Bayesian Thinking in Everyday Life - Towards Data Science
  2. A Visual Guide to Bayesian Thinking - Julia Galef
  3. Bayes Theorem, the geometry of changing beliefs - 3Blue1Brown