How to Calculate P-Values

Author: Peter Berry
Date Of Creation: 20 February 2021
Update Date: 15 May 2024

Content

A p-value is a statistical quantity that helps scientists decide whether their data support or contradict a hypothesis. P-values are used to determine whether an experimental result falls within the normal range of values for the events being observed. If the p-value of a data set is lower than a predetermined threshold (such as 0.05), scientists reject the "null hypothesis" of the experiment; in other words, they abandon the hypothesis that the experimental variables have no real effect on the results. Today, p-values are usually found in a reference table after first calculating a chi-squared value.

Steps

  1. Determine your experiment's expected results. Usually, when scientists run an experiment and observe the results, they have an idea beforehand of what a "normal" or "typical" result will look like. This can be based on the results of previous experiments, reliable sets of observational data, scientific literature, and/or other sources. For your experiment, define the expected results and express them as numbers.
    • Example: Suppose previous studies have shown that, nationwide, red cars receive speeding tickets more often than green cars, with an average ratio of 2:1 in favor of red cars. We want to know whether the police in our city show the same tendency by analyzing the speeding tickets they issue. If we take a random sample of 150 speeding tickets given to red or green cars in the city, we expect 100 to be for red cars and 50 for green cars if the city police issue tickets in the national ratio.
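The expected counts in this example follow directly from the ratio and the sample size. A minimal sketch in Python, assuming a hypothetical helper named `expected_counts` (the name and signature are illustrative, not part of any library):

```python
def expected_counts(total, ratio):
    """Split `total` observations according to an expected ratio.

    `expected_counts` is a hypothetical helper written for this example.
    """
    parts = sum(ratio)
    return [total * r / parts for r in ratio]


# 150 sampled tickets, expected 2:1 ratio of red to green cars
expected = expected_counts(150, (2, 1))
print(expected)  # [100.0, 50.0]
```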

  2. Determine your experiment's observed results. Now that you have the expected values, run the experiment and find the actual (or "observed") values. Again, express these results as numbers. If we manipulate some experimental condition and the observed results differ from the expected results, there are two possibilities: either the difference arose by chance, or our manipulation of the experimental variables caused it. The purpose of finding a p-value is essentially to determine whether the observed results differ from the expected results enough to reject the "null hypothesis", that is, the hypothesis that there is no relationship between the experimental variables and the observed results.
    • Example: Suppose that, in our city, we randomly select 150 speeding tickets given to red or green cars. We find 90 tickets for red cars and 60 for green cars. These differ from the expected results of 100 and 50, respectively. Did our manipulation of the experiment (in this case, changing the source of the data from national to local) cause this change in the results, or do the city police follow the national tendency, and we are simply observing random variation? A p-value will help us decide.

  3. Determine your experiment's degrees of freedom. Degrees of freedom are a measure of the amount of variability in the study, which is determined by the number of categories you are examining. The equation for degrees of freedom is: degrees of freedom = n - 1, where "n" is the number of categories or variables being analyzed in the experiment.
    • Example: Our experiment has two categories of results: one for red cars and one for green cars. So we have 2 - 1 = 1 degree of freedom. If we had compared red, blue, and green cars, we would have 2 degrees of freedom, and so on.
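The n - 1 rule can be written as a one-line function. This is only a sketch; `degrees_of_freedom` is a hypothetical name chosen for illustration:

```python
def degrees_of_freedom(num_categories):
    """Degrees of freedom for a test with `num_categories` groups (n - 1).

    Hypothetical helper for this example, not a library function.
    """
    if num_categories < 2:
        raise ValueError("need at least two categories to compare")
    return num_categories - 1


print(degrees_of_freedom(2))  # red vs. green cars -> 1
print(degrees_of_freedom(3))  # red, blue, and green cars -> 2
```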

  4. Compare the expected results to the observed results with chi-squared. Chi-squared (written "x²") is a numerical value that measures the difference between an experiment's expected and observed values. The equation for chi-squared is: x² = Σ((o-e)² / e), where "o" is the observed value and "e" is the expected value. Sum the results of this expression over all possible outcomes (see below).
    • Note that this equation includes a Σ (sigma) operator. In other words, you need to calculate ((o-e)² / e) for each possible outcome (here, a red car or a green car receiving a ticket) and add the results. So we compute ((o-e)² / e) twice: once for red cars and once for green cars.
    • Example: Plug the expected and observed values into the equation x² = Σ((o-e)² / e). The calculation is as follows:
      • x² = ((90-100)² / 100) + ((60-50)² / 50)
      • x² = ((-10)² / 100) + ((10)² / 50)
      • x² = (100/100) + (100/50) = 1 + 2 = 3.
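The same arithmetic can be checked in a few lines of Python. This sketch assumes the statistic is the sum of (o - e)² / e over all outcomes; the function name `chi_squared` is a hypothetical choice for this example:

```python
def chi_squared(observed, expected):
    """Sum of (o - e)^2 / e over all outcomes.

    Hypothetical helper for this example; `observed` and `expected`
    must list the outcomes in the same order.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Red cars: observed 90, expected 100. Green cars: observed 60, expected 50.
x2 = chi_squared([90, 60], [100, 50])
print(x2)  # 3.0
```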
  5. Choose a significance level. Now that we have the degrees of freedom and the chi-squared value for our experiment, the last thing to do before finding the p-value is to decide on a significance level. Basically, the significance level is a measure of how certain we want to be about our results: a low significance level corresponds to a low probability that the experimental results arose by chance, and vice versa. Significance levels are written as decimals (such as 0.01), corresponding to the percent chance that random sampling alone would produce a difference as large as the one observed (in this case, 1%).
    • By convention, scientists usually set the significance level for their experiments at 0.05, or 5 percent. This means that results meeting this significance level would arise from random sampling alone at most 5% of the time if the null hypothesis were true. For most experiments, reaching this threshold is considered "successful" evidence of a relationship between the two variables.
    • Example: For our red and green car experiment, let's follow scientific convention and set the significance level at 0.05.
  6. Use a chi-squared distribution table to approximate your p-value. Scientists and statisticians use large tables of chi-squared values to find the p-value for their experiments. These tables are generally set up with the vertical axis on the left corresponding to degrees of freedom and the horizontal axis on top corresponding to p-value. To use one, first find your degrees of freedom, then read that row from left to right until you find the first value larger than your chi-squared value. Look at the corresponding p-value at the top of the column: your p-value lies between this value and the next-largest one (the one in the column immediately to the left).
    • Chi-squared distribution tables are available from a variety of sources; they are easy to find online or in science and statistics textbooks. If you don't have one handy, use a free online table such as the one provided by medcalc.org.
    • Example: Our chi-squared value is 3, so let's use a chi-squared distribution table to find an approximate p-value. Since we know our experiment has only 1 degree of freedom, we start in the first row. Going from left to right along this row, we look for the first value higher than 3, our chi-squared value. The first one we encounter is 3.84. Looking at the top of that column, we see that the corresponding p-value is 0.05. This means our p-value is between 0.05 and 0.1 (the next-largest p-value in the table).
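The table lookup described in this step can be sketched in Python. The row below holds standard chi-squared critical values for 1 degree of freedom, rounded to three decimals; the names `CRITICAL_DF1` and `p_value_range` are hypothetical, and a real table will usually have more columns:

```python
# One table row for 1 degree of freedom: (p-value column, critical value),
# ordered left to right as in a typical printed table.
CRITICAL_DF1 = [
    (0.10, 2.706),
    (0.05, 3.841),
    (0.025, 5.024),
    (0.01, 6.635),
    (0.001, 10.828),
]


def p_value_range(x2, row=CRITICAL_DF1):
    """Return (low, high) bounds on the p-value from one table row.

    Hypothetical helper: reads the row left to right and stops at the
    first critical value larger than the chi-squared value `x2`.
    """
    high = 1.0  # no column to the left yet, so the p-value is at most 1
    for p, critical in row:
        if critical > x2:
            return (p, high)
        high = p
    return (0.0, high)  # x2 exceeds every value in the row


print(p_value_range(3.0))  # (0.05, 0.1)
```

For a chi-squared value of 3, the first larger entry is 3.841 in the 0.05 column, so the sketch reports a p-value between 0.05 and 0.1, matching the example.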
  7. Decide whether to reject or keep your null hypothesis. Now that you have found an approximate p-value for your experiment, you can decide whether or not to reject its null hypothesis (as a reminder, this is the hypothesis that the experimental variables you manipulated did not affect the results you observed). If your p-value is lower than your significance level, congratulations: you have shown that a relationship between the variables you manipulated and the results you observed is highly probable. If your p-value is higher than your significance level, you cannot say with confidence whether the results you observed were due to pure chance or to your manipulation of the experiment.
    • Example: Our p-value is between 0.05 and 0.1. It is definitely not smaller than 0.05, so, unfortunately, we cannot reject our null hypothesis. This means we did not reach the 95% threshold we set for asserting that the police in our city give tickets to red and green cars at a rate significantly different from the national average.
    • In other words, if the change of location (analyzing city data instead of national data) made no difference, random variation alone would still produce a deviation at least this large 5-10% of the time. Since we were looking for a probability below 5%, we cannot say with confidence that the police in our city are less likely to ticket red cars; there remains a small but non-negligible chance that they are not.
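The decision rule in this step amounts to a single comparison. A minimal sketch, assuming a hypothetical function `decide` and the conventional default significance level of 0.05:

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value with the significance level `alpha`.

    Hypothetical helper for this example.
    """
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


# The table only tells us the p-value lies between 0.05 and 0.1, so even
# the lower bound is not below the significance level.
print(decide(0.05))  # fail to reject the null hypothesis
print(decide(0.10))  # fail to reject the null hypothesis
```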

Advice

  • A pocket calculator will make the math much faster. Online calculators are also available.
  • You can also calculate p-values with computer programs, such as common spreadsheet software or more specialized statistical software.
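As one illustration of the software route, the exact p-value for a test with 1 degree of freedom can be computed with Python's standard library alone. This sketch relies on the fact that a chi-squared variable with 1 degree of freedom is the square of a standard normal variable, so its upper-tail probability equals erfc(√(x²/2)). The function name `p_value_df1` is hypothetical, and the shortcut does not apply to other degrees of freedom, where a statistics package would be the usual choice:

```python
import math


def p_value_df1(x2):
    """Exact p-value for a chi-squared statistic with 1 degree of freedom.

    Hypothetical helper; valid only for 1 degree of freedom.
    """
    if x2 < 0:
        raise ValueError("chi-squared values cannot be negative")
    return math.erfc(math.sqrt(x2 / 2))


p = p_value_df1(3.0)
print(round(p, 4))  # 0.0833
```

For the red and green car example (chi-squared value of 3), this gives a p-value of about 0.083, which falls inside the 0.05 to 0.1 range read from the table.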