
Probability Theory and Statistics

With a view towards the natural sciences

Lecture notes
Niels Richard Hansen
Department of Mathematical Sciences
University of Copenhagen
November 2010

Preface

The present lecture notes have been developed over the last couple of years for a course aimed primarily at the students taking a Master's in bioinformatics at the University of Copenhagen. There is an increasing demand for a general introductory statistics course at the Master's level at the university, and the course has also become a compulsory course for the Master's in eScience. Both educations emphasize a computational and data-oriented approach to science in particular the [...] aim of the notes is to combine the mathematical and theoretical underpinning of statistics and statistical data analysis with computational methodology and practical applications.

[...] probability theory as the foundation for doing statistics. Probability theory provides a framework in which it becomes possible to clearly formulate our statistical questions and to clearly express the assumptions upon which the answers rest.



Hopefully the notes pave the way for an understanding of the foundation of data analysis with a focus on the probabilistic model and the methodology that we can develop from this point of view. In a single course there is no hope that we can present all models and all relevant methods that the students will need in the future, and for this reason we develop general ideas so that new models and methods can be more easily approached by students after the course. On the other hand, we cannot develop the theory without a number of good examples to illustrate its use. Due to the history of the course most examples in the notes are biological in nature but span a range of different areas from molecular biology and biological sequence analysis over molecular evolution and genetics to toxicology and various assay [...] who take the course are expected to become users of statistical methodology in a subject matter field and potentially also developers of models and methodology in such a field.

It is therefore intentional that we focus on the fundamental principles and develop these principles that by nature are mathematical. Advanced mathematics is, however, kept out of the main text. Instead a number of math boxes can be found in the notes. Relevant, but mathematically more sophisticated, issues are treated in these math boxes. The main text does not depend on results developed in the math boxes, but the interested and capable reader may find them [...] formal mathematical prerequisites for reading the notes is a standard calculus course in addition to a few useful mathematical facts collected in an appendix. The reader who is not so accustomed to the symbolic language of mathematics may, however, find the material challenging to begin [...] fully benefit from the notes it is also necessary to obtain and install the statistical computing environment R.

It is evident that almost all applications of statistics today require the use of computers for computations and very often also simulations. The program R is a free, full-fledged programming language and should be regarded as such. Previous experience with programming is thus beneficial but not necessary. R is a language developed for statistical data analysis and it comes with a huge number of packages, which makes it a convenient framework for handling most standard statistical analyses, for implementing novel statistical procedures, for doing simulation studies, and last but not least it does a fairly good job at producing high quality [...] all have to crawl before we can walk, let alone run.

We begin the notes with the simplest models but develop a sustainable theory that can embrace the more advanced ones [...], but not least, I owe special thanks to Jessica Kasza for detailed comments on an earlier version of the notes and for correcting a number of grammatical [...]

[...] 2010 Niels Richard Hansen

Contents

1 [...]
    Notion of probabilities
    Statistics and statistical models

2 Probability [...]
    Introduction
    Sample spaces
    Probability measures
    Probability measures on discrete sets
    Descriptive methods
    Mean and variance
    Probability measures on the real line
    Descriptive methods
    Histograms and kernel density estimation
    Mean and variance
    Quantiles
    Conditional probabilities and independence
    Random variables
    Transformations of random variables
    Joint distributions, conditional distributions and independence
    Random variables and independence
    Random variables and conditional distributions
    Transformations of independent variables
    Simulations
    Local alignment - a case study
    Multivariate distributions
    Conditional distributions and conditional densities
    Descriptive methods
    Transition probabilities

3 Statistical models and [...]
    Statistical modeling
    Classical sampling distributions
    Statistical inference
    Parametric statistical models
    Estimators and estimates
    Maximum likelihood estimation
    Hypothesis testing
    Two sample t-test
    Likelihood ratio tests
    Multiple testing
    Confidence intervals
    Parameters of interest
    Regression
    Ordinary linear regression
    Non-linear regression
    Bootstrapping
    The empirical measure and non-parametric bootstrapping
    The percentile method

4 Mean and [...]
    Expectations
    The empirical mean
    More on expectations
    Variance
    Multivariate distributions
    Properties of the empirical approximations
    Monte Carlo integration
    Asymptotic theory
    MLE and asymptotic theory
    Entropy

A [...]
    Obtaining and running R
    Manuals, FAQs and online help
    The R language, functions and scripts
    Functions, expression evaluation, and objects
    Writing functions and scripts
    Graphics
    Packages
    Bioconductor
    Literature
    Other resources

B [...]
    Sets
    Combinatorics
    Limits and infinite sums
    Integration
    Gamma and beta integrals
    Multiple integrals

Notion of probabilities

Flipping coins and throwing dice are two commonly occurring examples in an introductory course on probability theory and statistics.

They represent archetypical experiments where the outcome is uncertain: no matter how many times we roll the dice we are unable to predict the outcome of the next [...] use probabilities to describe the uncertainty; a fair, classical dice has probability 1/6 for each side to turn [...] computations can to some extent be handled based on intuition, common sense and high school mathematics. In the popular dice game Yahtzee the probability of getting a Yahtzee (five of a kind) in a single throw is for instance

$$\frac{6}{6^5} = \frac{1}{6^4} \approx 0.00077.$$

The argument for this and many similar computations is based on the pseudo theorem that the probability for any event equals

$$\frac{\text{number of favourable outcomes}}{\text{number of possible outcomes}}.$$

[...] a Yahtzee consists of the six favorable outcomes with all five dice facing the same side upwards.
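The Yahtzee computation above can be checked by brute-force counting. The following is a minimal sketch in Python, used here only for illustration (the notes themselves work with R); the variable names are not from the notes. It lists every possible throw of five fair dice, counts those where all five faces agree, and forms the ratio of favourable to possible outcomes.

```python
from fractions import Fraction
from itertools import product

# Enumerate every outcome of throwing five six-sided dice (6**5 outcomes).
outcomes = list(product(range(1, 7), repeat=5))

# A Yahtzee is an outcome in which all five dice show the same face.
favourable = [o for o in outcomes if len(set(o)) == 1]

# Favourable over possible outcomes. This ratio is a probability only
# because all outcomes are assumed equally probable (fair dice).
p_yahtzee = Fraction(len(favourable), len(outcomes))

print(len(favourable), len(outcomes))  # 6 7776
print(p_yahtzee)                       # 1/1296
```

Exact fractions are used instead of floating point so that the result can be compared directly with the hand computation 6/6^5 = 1/6^4.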

We call the formula above a pseudo theorem because, as we will show in Section [...], it is only the correct way of assigning probabilities to events under a very special assumption about the probabilities describing our [...] special assumption is that all outcomes are equally probable, something we tend to believe if we don't know any better, or can see no way that one outcome should be more likely than [...], without some training most people will either get it wrong or have to give up if they try computing the probability of anything except the most elementary events, even when the pseudo theorem applies. There exist numerous tricky probability questions where intuition somehow breaks down and wrong conclusions can be drawn if one is not extremely careful.
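To see why the assumption of equally probable outcomes matters, consider a small sketch with a loaded die. The die and its probabilities are hypothetical and chosen only for illustration; they do not come from the notes. The sketch compares the counting formula with the probability obtained by adding up the probabilities of the individual outcomes in the event.

```python
from fractions import Fraction

# Hypothetical loaded die (an illustration, not from the notes):
# a six comes up half the time, the other five faces share the rest equally.
loaded = {face: Fraction(1, 10) for face in range(1, 6)}
loaded[6] = Fraction(1, 2)
assert sum(loaded.values()) == 1  # the probabilities must add up to one

event = {2, 4, 6}  # "the die shows an even number"

# Counting formula: number of favourable outcomes / number of possible outcomes.
p_counting = Fraction(len(event), len(loaded))

# Probability under the loaded die: sum the probabilities of the outcomes in the event.
p_actual = sum(loaded[face] for face in event)

print(p_counting)  # 1/2
print(p_actual)    # 7/10
```

For this die the two numbers differ, so counting outcomes gives the wrong answer; for a fair die, with probability 1/6 on each face, both computations would give 1/2.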

