Hello readers! In this article I will explain the p-value (probability value) concept with a basic example. ✊
Before I understood the p-value, I was always afraid of it. It belongs to statistics, and since I had not studied stats from the beginning, I assumed I would not get it. But I am very interested in machine learning, and there are so many places where I need to know what a p-value is. It plays an important role in machine learning. If I don’t understand it, how am I going to train and test my models? 😟
Let’s see an example in layman’s terms (as that is easier for beginners), then we will come to some fancy terminology.
Suppose I toss a coin once and get heads. I toss it again and get heads again. Wowww... My coin is a special coin; it showed me heads two times in a row. Do you think it’s a special coin?
No! But why? Do you have a strong reason to back up that “No”?
We use the p-value to support our answer or hypothesis in terms of stats. We will see the calculations for this same example at the end.
I say:- My coin is a special coin.
Stats says:- Your coin is no more than a normal coin. 😏
In terms of stats, “My coin is no more than a normal coin” is the null hypothesis. In my words, the null hypothesis is the opposite of what you are trying to prove.
A null hypothesis is a hypothesis that assumes there is no statistically significant relationship between the two variables in the hypothesis. It is the assumption that the researcher is trying to disprove. For example, there is no statistically significant relationship between the type of water fed to the plants and the growth of the plants. A researcher is challenged by the null hypothesis and usually wants to reject it, to show that there is a statistically significant relationship between the two variables. 🤯 (Learn more)
The p-value is used to decide whether to reject the null hypothesis. If we reject “My coin is no more than a normal coin” (the null hypothesis), we are saying that our coin is a special coin. If we fail to reject it, we have no good evidence that the coin is special, so we keep treating it as a normal coin.
So, how do we reject it? 🤔
We pick a threshold for the p-value (also called the significance level). Only if the p-value we calculate for our question is less than the threshold do we reject the null hypothesis; otherwise we do not.
Generally, we take 0.05 as the threshold, but we can take another value depending on the importance of the problem. 0.05 means 5 in 100: if the null hypothesis is true, we would wrongly reject it in about 5 out of 100 experiments. Refer to this for more (click here).
If the p-value is:
- less than 0.05: reject the null hypothesis
- 0.05 or more: don’t reject the null hypothesis
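Here is a small Python sketch of that rule. The names `ALPHA` and `decide` are my own choices for illustration, not something from a library, and 0.05 is simply the threshold used in this article.

```python
# Threshold (significance level) used in this article; an assumption you can change.
ALPHA = 0.05


def decide(p_value, alpha=ALPHA):
    """Return the decision for a given p-value and threshold."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


print(decide(0.01))  # reject the null hypothesis
print(decide(0.5))   # fail to reject the null hypothesis
```

Note that a p-value exactly equal to the threshold does not count as "less than", so it falls on the "fail to reject" side.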
Okay, let’s move ahead, 🚆
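See the images: they list all the possible outcomes of tossing a coin twice along with their probabilities. There are four outcomes (heads-heads, heads-tails, tails-heads and tails-tails), and for a normal coin each one has a probability of 0.25.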
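If you would rather see the outcomes of two tosses as code, here is a minimal sketch that lists them. It assumes a normal (fair) coin, so each side has probability 1/2; the function name `two_toss_outcomes` is just something I made up for this example.

```python
from fractions import Fraction
from itertools import product


def two_toss_outcomes():
    """Map each ordered outcome of two fair coin tosses to its probability."""
    p_side = Fraction(1, 2)  # assumption: a normal coin, heads and tails equally likely
    outcomes = {}
    for tosses in product("HT", repeat=2):
        # The two tosses are independent, so their probabilities multiply.
        outcomes["".join(tosses)] = p_side * p_side
    return outcomes


outcomes = two_toss_outcomes()
for name, prob in outcomes.items():
    print(name, float(prob))
# HH 0.25
# HT 0.25
# TH 0.25
# TT 0.25
```

HT and TH are different orders of the same result (one heads and one tails), so together "one heads, one tails" has probability 0.5.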
The p-value is composed of three parts:
- The probability that random chance would result in the observation (say a)
- The probability of observing something else that is equally rare (say b)
- The probability of observing something rarer or more extreme (say c)
We include parts 2 and 3 because if other outcomes are equally rare (in this example, getting two tails in a row is just as rare as getting two heads) or even more extreme, then our observation is less special and rare than it first looks.
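Here a = 0.25 (two heads), b = 0.25 (two tails, which is equally rare) and c = 0 (nothing is rarer than two heads or two tails).
p-value for two heads = a + b + c
= 0.25 + 0.25 + 0
= 0.5
0.5 is not less than 0.05, hence we fail to reject the null hypothesis and conclude that our coin is no more than a normal coin.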
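To double-check the arithmetic, here is a rough Python sketch of the same three-part calculation. It assumes a fair coin and groups results by the number of heads (so heads-tails and tails-heads count as one result). The function names `heads_distribution` and `p_value_parts` are my own, not from any library.

```python
from fractions import Fraction
from math import comb


def heads_distribution(n_tosses):
    """Probability of each possible number of heads for a fair coin."""
    total = 2 ** n_tosses  # number of equally likely ordered outcomes
    return {k: Fraction(comb(n_tosses, k), total) for k in range(n_tosses + 1)}


def p_value_parts(n_tosses, observed_heads):
    """Return (a, b, c): the observation, equally rare results, rarer results."""
    dist = heads_distribution(n_tosses)
    a = dist[observed_heads]
    # Other results that are exactly as likely as the one we observed.
    b = sum(p for k, p in dist.items() if k != observed_heads and p == a)
    # Results that are strictly less likely than the one we observed.
    c = sum(p for p in dist.values() if p < a)
    return a, b, c


a, b, c = p_value_parts(2, 2)  # two tosses, two heads
p_value = a + b + c
print(float(a), float(b), float(c), float(p_value))  # 0.25 0.25 0.0 0.5
print(p_value < 0.05)  # False, so we fail to reject the null hypothesis
```

The same function works for other inputs too. For example, `p_value_parts(5, 5)` (five heads in a row) gives parts that add up to 0.0625, which is still not less than 0.05.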
👉 This example comes from StatQuest, which is where I learned it, and I thought I would share it with you all. This is how a community grows 😉.
I personally recommend watching the StatQuest videos; they have more good examples. 🧐
Please tell me if there is anything wrong in the article above.
Any suggestions are welcome. I will recheck it and amend it. 😇
Let’s connect and share our knowledge and experiences.🤝
Enjoy and Keep learning…