P values and hypothesis testing are common and important terms in statistics, but they can be confusing. In this post, I will explain what P values are and how to interpret them, using a real-life example.
If you are working in or learning data science, understanding P values is essential for accurate statistical analysis.
To understand the P value in statistics, you should first understand what hypothesis testing is. I highly recommend reading this post, where I explained hypothesis testing in detail.
What are P Values in Statistics?
The P in P value stands for probability. A P value is the probability of obtaining a result at least as extreme as the one observed in the sample, assuming the null hypothesis is true. Essentially, it gives us a line we can draw to decide whether we reject the null hypothesis or fail to reject it.
The P value is a number between 0 and 1. It is not the probability that the null hypothesis is true, and it is not a confidence level for the result. A low P value (close to zero) means that a result like the observed one would be unlikely if the null hypothesis were true, so we can reject the null hypothesis.
Understand P value with an example
Now let me explain what a p value is with a simple example. I will extend the same real-life example that I used in my previous post about hypothesis testing.
Let me briefly refresh the problem. Let’s say there is a candy manufacturing company. The candy machine of that company produces chocolate bars with an average weight of 5 grams.
Now a worker claims that the candy machine no longer produces chocolate bars weighing 5 grams each.
This is a big claim for a manufacturing company. But the problem is that you cannot blindly agree or disagree with the worker’s claim. We need to understand how strong the evidence is before we accept or reject that claim.
This is where p values (part of hypothesis testing) can help us make the right decision statistically.
Before calculating the p-value, we need to set up the hypothesis test. If you want to understand hypothesis testing with the same example, read this post. In this post, I am not going to explain hypothesis testing and the various other statistical terms.
We run the hypothesis test to see whether there is enough evidence to reject the null hypothesis.
Under the null hypothesis, the candy machine is still producing chocolate bars with an average weight of 5 g.
How to calculate p values in statistics
We calculate the p-value from the same test statistic that we calculate in hypothesis testing. To calculate the p value, we need some sample data. I will use the same sample data which I used in the hypothesis testing post. The sample data looks like below:
I will show you how to calculate p values manually. You can also do it using software such as Python, Excel, or R.
Manually calculate p-value
To calculate the p value manually, you first need to calculate the test statistic using either a T test or a Z test. Since the population standard deviation is unknown for this example and the sample is small, we need to use the T-test.
Check my previous post to see how we calculated the T score and degrees of freedom for this example. The calculated T score was -0.74074 and the degrees of freedom are 9.
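For reference, the usual one-sample T score formula is sketched below. The symbols are my own notation and are assumptions, not taken from the previous post: \bar{x} is the sample mean, \mu_0 is the mean under the null hypothesis (5 grams here), s is the sample standard deviation, and n is the sample size. With 9 degrees of freedom, the sample must contain n = 10 chocolate bars.

```latex
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad \text{df} = n - 1
```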
Once we get the T score and degrees of freedom, we need to look at the probability table for the T-test, also called the T distribution table, which I also explained in my previous post (hypothesis testing). There is no simple formula for the p-value that you can work out by hand, because it is an area under the T distribution curve. For a manual calculation we simply look it up in the T distribution table, while software computes that area directly.
T distribution table
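In the T distribution table, we look along the row for 9 degrees of freedom and read the corresponding p-value from either the one-tail row or the two-tail row. Our T score is -0.74074. The T distribution is symmetric, so we look up its absolute value, 0.74074. It falls between 0.703 and 0.883 in the above T table, and the nearest entry is 0.703.
Since our example is a one-tailed test, the corresponding p-value is approximately 0.25 (more precisely, it lies between 0.20 and 0.25).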
Now the question is: based on our calculated p value, should we reject the null hypothesis? The answer is no. We reject the null hypothesis only when the p-value is less than or equal to the significance level, which is commonly set at 0.05. Since 0.25 > 0.05, we fail to reject the null hypothesis.
That means there is not enough evidence to support the worker’s claim. Based on this sample, we have no reason to conclude that the machine has stopped making candy bars of 5 grams each. Note that this does not prove the machine is working properly; it only means the data are consistent with that.
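If you would rather compute the p-value than read it from a table, here is a minimal Python sketch using only the standard library. The function names t_pdf and one_tailed_p are my own, and the numerical integration (Simpson's rule) is just one simple way to get the tail area, not the method used in the previous post.

```python
import math


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def one_tailed_p(t_score, df, steps=1000):
    """Tail area beyond |t_score|, using Simpson's rule (steps must be even).

    The t distribution is symmetric around 0, so the area beyond |t| equals
    0.5 minus the area between 0 and |t|.
    """
    t = abs(t_score)
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2  # Simpson weights: 4 for odd, 2 for even points
        total += weight * t_pdf(i * h, df)
    return 0.5 - total * h / 3


t_score = -0.74074  # T score from this example
df = 9              # degrees of freedom from this example
alpha = 0.05        # significance level

p_value = one_tailed_p(t_score, df)
print(f"one-tailed p-value: {p_value:.3f}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```

The result should come out at roughly 0.24, which agrees with the table lookup. If you have SciPy installed, scipy.stats.t.sf(abs(t_score), df) should give the same tail area in one line.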
Note: If you have trouble following the above calculation, your understanding of hypothesis testing may need a refresh. In that case, I suggest reading this post to brush up on hypothesis testing.
Difference between one-tailed and two-tailed P value
One-tailed and two-tailed p-values differ in how the probability of the test statistic is calculated. This is the reason they have different interpretations.
A one-tailed p-value looks at only one end of a probability distribution, while a two-tailed p-value looks at both ends. A one-tailed test is more powerful for detecting an effect in the chosen direction, but it cannot detect an effect in the opposite direction, and the direction must be chosen before looking at the data. A two-tailed test is more conservative: for the same data, its p-value is twice the smaller one-tailed p-value. Our example is a one-tailed test.
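Written out for a T test, the three variants look roughly like this. The notation is my own: T is a random variable following the T distribution under the null hypothesis, and t_obs is the T score calculated from the sample.

```latex
p_{\text{left}} = P(T \le t_{\text{obs}}), \qquad
p_{\text{right}} = P(T \ge t_{\text{obs}}), \qquad
p_{\text{two}} = 2\,P\left(T \ge |t_{\text{obs}}|\right)
```

For our candy example, a one-tailed p-value of about 0.25 would correspond to a two-tailed p-value of about 0.50.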
While exploring what p-values and hypothesis testing are in statistics, you may have some questions in mind. Let me answer those in this FAQ section.
Difference between Rejection region and P value
In hypothesis testing, the rejection region and the p-value are two different ways to decide whether to reject or fail to reject the null hypothesis.
The rejection region is a fixed range of test statistic values, determined by the significance level. If the test statistic falls inside it, we reject the null hypothesis; otherwise we fail to reject it. The p-value, on the other hand, is a probability that measures the strength of evidence against the null hypothesis based on the observed data. Both approaches lead to the same decision at a given significance level. In practice, the p-value is the more commonly reported of the two.
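The two decision rules can be sketched side by side for the candy example. This is only an illustration: the function names are my own, and the critical value 1.833 is an assumption taken from a standard T table (9 degrees of freedom, one-tail 0.05), not a number from this post.

```python
def reject_by_rejection_region(t_score, t_critical):
    """One-tailed rule: reject H0 when the T score lies beyond the critical value.

    Compares magnitudes, which assumes the T score points in the tested direction.
    """
    return abs(t_score) >= t_critical


def reject_by_p_value(p_value, alpha=0.05):
    """Reject H0 when the p-value is at or below the significance level."""
    return p_value <= alpha


t_score = -0.74074   # T score from this example
p_value = 0.25       # one-tailed p-value read from the T table in this example
t_critical = 1.833   # assumed: standard T table, df = 9, one-tail 0.05

reject_region = reject_by_rejection_region(t_score, t_critical)
reject_p = reject_by_p_value(p_value)
print("rejection region says reject H0:", reject_region)
print("p-value says reject H0:", reject_p)
```

Both rules fail to reject the null hypothesis here, because 0.74074 is smaller than 1.833 and 0.25 is larger than 0.05.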
How does the sample size affect the P value?
The sample size can have a significant effect on the p-value in hypothesis testing. When a real effect exists, larger sample sizes tend to result in smaller p-values, while smaller sample sizes tend to result in larger p-values.
This is because as the sample size increases, the standard error decreases and the estimate becomes more precise, so the same observed difference produces a larger test statistic. As a result, the p-value becomes smaller, indicating stronger evidence against the null hypothesis. If the null hypothesis is actually true, a larger sample does not systematically make the p-value smaller.
In general, a larger sample size is preferred, as it produces more precise estimates and more reliable test results. Keep in mind that with a very large sample, even a tiny, practically unimportant difference can produce a small p-value.
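To see the effect of sample size, here is a small sketch that keeps the observed difference and the spread fixed and only changes n. All the numbers below are hypothetical and chosen for illustration; they are not the sample data from this post.

```python
import math


def t_statistic(sample_mean, null_mean, sample_sd, n):
    """One-sample T score: observed difference divided by the standard error."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - null_mean) / standard_error


# Hypothetical values, for illustration only
null_mean = 5.0     # mean weight under the null hypothesis, in grams
sample_mean = 4.9   # assumed observed sample mean
sample_sd = 0.4     # assumed sample standard deviation

sample_sizes = (10, 40, 160)
t_values = [t_statistic(sample_mean, null_mean, sample_sd, n) for n in sample_sizes]

for n, t in zip(sample_sizes, t_values):
    print(f"n = {n:>3}   T score = {t:.3f}")
```

Each time the sample size is multiplied by 4, the standard error halves and the T score doubles in magnitude. A T score further from zero corresponds to a smaller p-value, so the same 0.1 gram difference looks more and more significant as n grows.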
What is the significance level of a P value?
The significance level (often written as alpha) is the threshold that the p-value is compared against to decide whether to reject or fail to reject the null hypothesis. It is chosen before running the test.
The significance level is most commonly set at 0.05, meaning that if the p-value is less than or equal to 0.05, the null hypothesis is rejected, and if the p-value is greater than 0.05, the null hypothesis is not rejected.
Number of samples required to calculate p-value
There is no fixed number of samples. A larger sample size is generally preferred, as it provides more reliable results and increases the power of the test.
What does P-Value mean in Regression?
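In regression analysis, the p-value of each coefficient measures the statistical significance of the relationship between that predictor variable and the response variable. It tests the null hypothesis that the coefficient is zero, that is, that the predictor has no linear relationship with the response once the other predictors are accounted for.
A small p-value (e.g. < 0.05) indicates evidence of a relationship between the two variables, while a large p-value (e.g. > 0.05) means the data do not provide enough evidence of a relationship. A small p-value says nothing about how strong the relationship is; for that, look at the size of the coefficient.
P values can be used as one input for finding important variables (feature selection) in regression analysis.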
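For ordinary least squares regression, the p-value of a coefficient is usually obtained from a T score, much like in the candy example. A sketch in my own notation, assuming n observations, k predictors plus an intercept, and the standard regression assumptions:

```latex
t_j = \frac{\hat{\beta}_j}{\mathrm{SE}(\hat{\beta}_j)}, \qquad
p_j = 2\,P\left(T_{n-k-1} \ge |t_j|\right)
```

Here \hat{\beta}_j is the estimated coefficient of predictor j, \mathrm{SE}(\hat{\beta}_j) is its standard error, and T_{n-k-1} follows a T distribution with n - k - 1 degrees of freedom under the null hypothesis that the true coefficient is zero.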
What are some alternative methods to P values for statistical analysis?
There are several alternatives and complements to p-values, such as confidence intervals, Bayesian methods, effect sizes, resampling methods, and model selection criteria, which can provide additional information and help avoid some of the limitations of p-values.
Still, the p-value remains the most commonly reported measure in hypothesis testing.
Hi there, I’m Anindya Naskar, Data Science Engineer. I created this website to show you what I believe is the best possible way to get your start in the field of Data Science.