3 Basics of Bayesian Statistics - Carnegie Mellon University

Suppose a woman believes she may be pregnant after a single sexual encounter, but she is unsure. So, she takes a pregnancy test that is known to be 90% accurate, meaning it gives positive results to positive cases 90% of the time, and the test produces a positive result. She would like to know the probability she is pregnant, given a positive test (p(preg|test +)); however, what she knows is the probability of obtaining a positive test result if she is pregnant (p(test +|preg)), and she knows the result of the test.

In a similar type of problem, suppose a 30-year-old man has a positive blood test for a prostate cancer marker (PSA). Assume this test is also approximately 90% accurate.

Once again, in this situation, the individual would like to know the probability that he has prostate cancer, given the positive test, but the information at hand is simply the probability of testing positive if he has prostate cancer, coupled with the knowledge that he tested positive.

Bayes' Theorem offers a way to reverse conditional probabilities and, hence, provides a way to answer these questions. In this chapter, I first show how Bayes' Theorem can be applied to answer these questions, but then I expand the discussion to show how the theorem can be applied to probability distributions to answer the type of questions that social scientists commonly ask. For that, I return to the polling data described in the previous chapter.

Bayes' Theorem for point probabilities

Bayes' original theorem applied to point probabilities.

The basic theorem states simply:

$$p(B \mid A) = \frac{p(A \mid B)\,p(B)}{p(A)}.$$

(Footnote 1, on the test's 90% accuracy: In fact, most pregnancy tests today have a higher accuracy rate, but the accuracy rate depends on the proper use of the test as well as other factors.)

In English, the theorem says that a conditional probability for event B given event A is equal to the conditional probability of event A given event B, multiplied by the marginal probability for event B and divided by the marginal probability for event A.

Proof: From the probability rules introduced in Chapter 2, we know that p(A,B) = p(A|B)p(B). Similarly, we can state that p(B,A) = p(B|A)p(A). Obviously, p(A,B) = p(B,A), so we can set the right sides of each of these equations equal to each other to obtain:

$$p(B \mid A)\,p(A) = p(A \mid B)\,p(B).$$

Dividing both sides by p(A) leaves us with the theorem as stated. The left side of the theorem is the conditional probability in which we are interested, whereas the right side consists of three components. p(A|B) is the conditional probability we are interested in reversing. p(B) is the unconditional (marginal) probability of the event of interest. Finally, p(A) is the marginal probability of event A. This quantity is computed as the sum of the conditional probabilities of A under all possible events B_i in the sample space, each weighted by the probability of B_i. In the pregnancy example, the sample space has two events: either the woman is pregnant or she is not. Stated mathematically for a discrete sample space:

$$p(A) = \sum_{B_i \in S_B} p(A \mid B_i)\,p(B_i).$$

Returning to the pregnancy example to make the theorem more concrete, suppose that, in addition to the 90% accuracy rate, we also know that the test gives false-positive results 50% of the time.

In other words, in cases in which a woman is not pregnant, she will test positive 50% of the time. Thus, there are two possible events B_i: B_1 = preg and B_2 = not preg. Additionally, given the accuracy and false-positive rates, we know the conditional probabilities of obtaining a positive test under these events: p(test +|preg) = .9 and p(test +|not preg) = .5. With this information, combined with some prior information concerning the probability of becoming pregnant from a single sexual encounter, Bayes' theorem provides a prescription for determining the probability of interest.

The prior information we need, p(B) ≡ p(preg), is the marginal probability of being pregnant, not knowing anything beyond the fact that the woman has had a single sexual encounter.

This information is considered prior information, because it is relevant information that exists prior to the test. We may know from previous research that, without any additional information (e.g., concerning date of last menstrual cycle), the probability of conception for any single sexual encounter is approximately 15%. (In a similar fashion, concerning the prostate cancer scenario, we may know that the prostate cancer incidence rate for 30-year-olds is .00001; see Exercises.) With this information, we can determine p(B|A) ≡ p(preg|test +) as:

$$p(\text{preg} \mid \text{test}+) = \frac{p(\text{test}+ \mid \text{preg})\,p(\text{preg})}{p(\text{test}+ \mid \text{preg})\,p(\text{preg}) + p(\text{test}+ \mid \text{not preg})\,p(\text{not preg})}.$$
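The formula above can also be evaluated numerically. The following is a minimal Python sketch, not part of the original text; the function name `posterior` and its parameter names are illustrative assumptions, and the inputs are the rates stated in the example (.90 accuracy, .50 false-positive rate, .15 prior).

```python
def posterior(prior, p_pos_given_true, p_pos_given_false):
    """Two-event Bayes' theorem: probability of the hypothesis given a positive test.

    prior             -- p(B), marginal probability of the hypothesis
    p_pos_given_true  -- p(test + | hypothesis true)
    p_pos_given_false -- p(test + | hypothesis false)
    """
    numerator = p_pos_given_true * prior
    # Marginal probability of a positive test, summed over both events.
    marginal = numerator + p_pos_given_false * (1.0 - prior)
    return numerator / marginal


# Values from the pregnancy example in the text.
p_preg = posterior(prior=0.15, p_pos_given_true=0.90, p_pos_given_false=0.50)
print(round(p_preg, 3))  # 0.241
```

The denominator is written out as a sum over the two events (pregnant, not pregnant) so that the code mirrors the expanded form of p(test +) in the formula.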

Filling in the known information yields:

$$p(\text{preg} \mid \text{test}+) = \frac{(.90)(.15)}{(.90)(.15) + (.50)(.85)} = \frac{.135}{.135 + .425} = .241.$$

Thus, the probability the woman is pregnant, given the positive test, is .241. Using Bayesian terminology, this probability is called a posterior probability, because it is the estimated probability of being pregnant obtained after observing the data (the positive test). The posterior probability is quite small, which is surprising, given a test with so-called 90% accuracy. However, a few things affect this probability. First is the relatively low probability of becoming pregnant from a single sexual encounter (.15). Second is the extremely high probability of a false-positive test (.50), especially given the high probability of not becoming pregnant from a single sexual encounter (p = .85) (see Exercises).

If the woman is aware of the test's limitations, she may choose to repeat the test. Now, she can use the updated probability of being pregnant (p = .241) as the new p(B); that is, the prior probability for being pregnant has now been updated to reflect the results of the first test. If she repeats the test and again observes a positive result, her new posterior probability of being pregnant is:

$$p(\text{preg} \mid \text{test}+) = \frac{(.90)(.241)}{(.90)(.241) + (.50)(.759)} = \frac{.217}{.217 + .380} = .364.$$

This result is still not very convincing evidence that she is pregnant, but if she repeats the test again and finds a positive result, her probability increases to .507 (for general interest, subsequent positive tests yield the following probabilities: test 4 = .649, test 5 = .769, test 6 = .857, test 7 = .915, test 8 = .951, test 9 = .972, test 10 = .984).

This process of repeating the test and recomputing the probability of interest is the basic process of concern in Bayesian statistics. From a Bayesian perspective, we begin with some prior probability for some event, and we update this prior probability with new information to obtain a posterior probability. The posterior probability can then be used as a prior probability in a subsequent analysis. From a Bayesian point of view, this is an appropriate strategy for conducting scientific research: We continue to gather data to evaluate a particular scientific hypothesis; we do not begin anew (ignorant) each time we attempt to evaluate a hypothesis, because previous research provides us with a priori information concerning the merit of the hypothesis.

Bayes' Theorem applied to probability distributions

Bayes' theorem, and indeed, its repeated application in cases such as the example above, is beyond mathematical dispute.
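The repeated application referred to here can be traced numerically. The sketch below (Python, not from the original text) feeds each posterior back in as the next prior, using the .90 and .50 rates and the .15 starting prior from the pregnancy example; the helper names `update` and `repeated_positive_tests` are illustrative assumptions.

```python
def update(prior, sensitivity=0.90, false_positive_rate=0.50):
    """One application of Bayes' theorem after a single positive test."""
    numerator = sensitivity * prior
    return numerator / (numerator + false_positive_rate * (1.0 - prior))


def repeated_positive_tests(prior, n_tests):
    """Return the posterior after each of n_tests consecutive positive results."""
    posteriors = []
    for _ in range(n_tests):
        prior = update(prior)  # the previous posterior becomes the new prior
        posteriors.append(prior)
    return posteriors


for i, p in enumerate(repeated_positive_tests(0.15, 10), start=1):
    print(f"test {i}: {p:.3f}")
```

The sketch carries full precision between steps rather than the rounded .241; to three decimals the results agree with the probabilities listed for tests 4 through 10.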

However, Bayesian statistics typically involves using probability distributions rather than point probabilities for the quantities in the theorem. In the pregnancy example, we assumed the prior probability for pregnancy was a known quantity of exactly .15. However, it is unreasonable to believe that this probability of .15 is in fact this precise. A cursory glance at various websites, for example, reveals a wide range for this probability, depending on a woman's age, the date of her last menstrual cycle, her use of contraception, etc. Perhaps even more importantly, even if these factors were not relevant in determining the prior probability for being pregnant, our knowledge of this prior probability is not likely to be perfect because it is simply derived from previous samples and is not a known and fixed population quantity (which is precisely why different sources may give different estimates of this prior probability!).
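To illustrate why the precision of the prior matters, the following Python sketch recomputes the posterior after a single positive test for several candidate priors. The grid of prior values is purely hypothetical, chosen for illustration and not taken from the text; only the .90 and .50 test rates and the .15 prior come from the pregnancy example. The function name `posterior` is likewise an assumption.

```python
def posterior(prior, p_pos_given_preg=0.90, p_pos_given_not_preg=0.50):
    """Posterior probability of pregnancy after one positive test."""
    numerator = p_pos_given_preg * prior
    return numerator / (numerator + p_pos_given_not_preg * (1.0 - prior))


# Hypothetical range of priors (illustrative values only, not from the text).
candidate_priors = [0.05, 0.10, 0.15, 0.20, 0.30]

for prior in candidate_priors:
    print(f"prior = {prior:.2f} -> posterior = {posterior(prior):.3f}")
```

Because the posterior shifts noticeably as the assumed prior changes, a single fixed value such as .15 hides real uncertainty, which is the motivation for describing the prior with a probability distribution instead.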

