
Crash Course on Basic Statistics

Marina Wahl, of New York at Stony Brook
November 6, 2013

Contents

1 Basic Probability
    Basic Definitions
    Probability of Events
    Bayes Theorem
2 Basic ...
    Types of Data
    Errors
    Reliability
    Validity
    Probability Distributions
    Population and Samples
    Bias
    Questions on Samples
    Central Tendency
3 The Normal Distribution
4 The Binomial Distribution
5 Confidence Intervals
6 Hypothesis Testing
7 The t-Test
8 Regression
9 Logistic Regression
10 Other Topics
11 Some Related Questions





Chapter 1: Basic Probability

Basic Definitions

Trials
- Probability is concerned with the outcome of trials.
- Trials are also called experiments or observations (multiple trials).
- A trial refers to an event whose outcome is unknown.

Sample Space (S)
- The set of all possible elementary outcomes of a trial.
- If the trial consists of flipping a coin twice, the sample space is $S = \{(h,h), (h,t), (t,h), (t,t)\}$.
- The probability of the sample space is always 1.

Events (E)
- An event is the specification of the outcome of a trial.
- An event can consist of a single outcome or a set of outcomes.
- The complement of an event is everything in the sample space that is not that event (not E, or $\sim E$).
- The probability of an event is always between 0 and 1.
- The probabilities of an event and its complement always sum to 1.

Several Events
- The union of several simple events creates a compound event that occurs if one or more of the events occur.
- The intersection of two or more simple events creates a compound event that occurs only if all the simple events occur.
- If events cannot occur together, they are mutually exclusive.
- If two trials are independent, the outcome of one trial does not influence the outcome of another.

Permutations
- Permutations are all the possible ways elements in a set can be arranged, where the order is important.
- The number of permutations of subsets of size k drawn from a set of size n is given by:
  $_nP_k = \frac{n!}{(n-k)!}$

Combinations
- Combinations are similar to permutations, with the difference that the order of elements is not important.
- The number of combinations of subsets of size k drawn from a set of size n is given by:
  $_nC_k = \frac{n!}{k!\,(n-k)!}$

Probability of Events
- If two events are independent, $P(E|F) = P(E)$. The probability of both E and F occurring is:
  $P(E \cap F) = P(E)\,P(F)$
- If two events are mutually exclusive, the probability of either E or F is:
  $P(E \cup F) = P(E) + P(F)$
- If the events are not mutually exclusive (you need to correct for the overlap):
  $P(E \cup F) = P(E) + P(F) - P(E \cap F)$, where $P(E \cap F) = P(E)\,P(F|E)$

Bayes Theorem

Bayes theorem for any two events:
$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{P(B|A)\,P(A)}{P(B|A)\,P(A) + P(B|\sim A)\,P(\sim A)}$
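The rules in this chapter can be checked by brute-force enumeration on the two-coin-flip sample space. The sketch below is illustrative only: the events `E` and `F`, the helper names `prob` and `bayes`, and the numbers passed to `bayes` are assumptions chosen for the example, not part of the original notes.

```python
from fractions import Fraction
from itertools import product
from math import comb, perm

# Sample space for flipping a coin twice: (h,h), (h,t), (t,h), (t,t).
sample_space = list(product("ht", repeat=2))


def prob(event):
    """Probability of an event (a set of outcomes), assuming equally likely outcomes."""
    return Fraction(len(event), len(sample_space))


# Hypothetical events chosen for illustration.
E = {o for o in sample_space if o[0] == "h"}  # first flip is heads
F = {o for o in sample_space if o[1] == "h"}  # second flip is heads

# An event and its complement have probabilities that sum to 1.
not_E = set(sample_space) - E
print(prob(E) + prob(not_E))                     # 1

# E and F are independent: P(E and F) = P(E) * P(F).
print(prob(E & F), prob(E) * prob(F))            # 1/4 1/4

# E and F are not mutually exclusive, so the overlap is subtracted.
print(prob(E | F), prob(E) + prob(F) - prob(E & F))  # 3/4 3/4

# Counting ordered and unordered subsets of size 2 drawn from 5 elements.
print(perm(5, 2), comb(5, 2))                    # 20 10


def bayes(p_b_given_a, p_a, p_b_given_not_a):
    """P(A|B) from P(B|A), P(A) and P(B|~A), using the expanded denominator."""
    numerator = p_b_given_a * p_a
    return numerator / (numerator + p_b_given_not_a * (1 - p_a))


# Made-up inputs: P(A) = 1/100, P(B|A) = 9/10, P(B|~A) = 5/100.
print(bayes(Fraction(9, 10), Fraction(1, 100), Fraction(5, 100)))  # 2/13
```

Using `Fraction` keeps every probability exact, so the equalities hold without floating-point tolerance.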

Frequentist:
- There are true, fixed parameters in a model (though they may be unknown at times).
- Data contain random errors which have a certain probability distribution (Gaussian, for example).
- Mathematical routines analyze the probability of getting certain data, given a particular model.

Bayesian:
- There are no true model parameters. Instead, all parameters are treated as random variables with probability distributions.
- Random errors in data have no probability distribution; rather, the model parameters are random with their own distributions.
- Mathematical routines analyze the probability of a model, given some data.

- The statistician makes a guess (prior distribution) and then updates that guess with the data.

Chapter 2: Basic ...

Types of Data

There are two types of measurements:
- Quantitative: discrete data have a finite number of possible values; continuous data have an infinite number of possible values.
- Categorical (nominal): the possible responses consist of a set of categories rather than numbers that measure an amount of something on a continuous scale.

Errors
- Random error: due to chance, with no particular pattern. It is assumed to cancel itself out over repeated measurements.
- Systematic error: has an observable pattern and is not due to chance, so its causes can often be identified.

Reliability

How consistent or repeatable measurements are:

- Multiple-occasions reliability (test-retest, temporal): how similarly a test performs over repeated administrations.
- Multiple-forms reliability (parallel-forms): how similarly different versions of a test perform in measuring the same entity.
- Internal consistency reliability: how well the items that make up an instrument (a test) reflect the same construct.

Validity

How well a test or rating scale measures what it is supposed to measure:
- Content validity: how well the process of measurement reflects the important content of the domain of interest.
- Concurrent validity: how well inferences drawn from a measurement can be used to predict some other behaviour that is measured at approximately the same time.
- Predictive validity: the ability to draw inferences about some event in the future.

Probability Distributions

- Statistical inference relies on making assumptions about the way data is distributed, or on transforming data to make it fit some known distribution.
- A probability distribution is defined by a formula that specifies what values can be taken by data points within the distribution and how common each value (or range) will be.

Population and Samples
- We rarely have access to the entire population of users. Instead we rely on a subset of the population to use as a proxy for the population.
- Sample statistics estimate unknown population parameters.
- You should select your sample randomly from the parent population, but in practice this can be very difficult due to:
  - issues establishing a truly random selection scheme,
  - problems getting the selected users to participate.
- ... is more important than ...

Nonprobability Sampling

- Subject to sampling bias. Conclusions are of limited usefulness in generalizing to a larger population:
  - Volunteer samples.
  - Convenience samples: collect information in the early stages of a study.
  - Quota sampling: the data collector is instructed to get responses from a certain number of subjects within specified categories.

Probability Sampling
- Every member of the population has a known probability of being selected for the sample.
- The simplest type is simple random sampling (SRS).
- Systematic sampling: you need a list of your population. You decide the size of the sample and then compute the number n, which dictates how you will select the sample:
  - Calculate n by dividing the size of the population by the number of subjects you want in the sample.

  - Useful when the population accrues over time and there is no predetermined list of population members.
  - One caution: make sure the data is not cyclic.
- Stratified sample: the population of interest is divided into non-overlapping groups, or strata, based on common characteristics.
- Cluster sample: the population is sampled by using pre-existing groups. It can be combined with the technique of sampling proportional to size.

Bias
- The sample needs to be a good representation of the study population.
- If the sample is biased, it is not representative of the study population, and conclusions drawn from the study sample might not apply to the study population.
- A statistic used to estimate a parameter is unbiased if the expected value of its sampling distribution is equal to the value of the parameter being estimated.
- Bias is a source of systematic error and enters studies in two primary ways:
  - During the selection and retention of the subjects of study.

  - In the way information is collected about the subjects.

Selection Bias
- Selection bias: occurs if some potential subjects are more likely than others to be selected for the study sample. The sample is selected in a way that systematically excludes part of the population.
- Volunteer bias: people who volunteer to be in studies are usually not representative of the population as a whole.
- Nonresponse bias: the other side of volunteer bias. Just as people who volunteer to take part in a study are likely to differ systematically from those who do not, people who decline to participate in a study when invited to do so very likely differ from those who consent to participate.
- Informative censoring: can create bias in any longitudinal study (a study in which subjects are followed over a period of time).
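Volunteer bias, described above, can be illustrated with a small simulation. This is a minimal sketch under loudly stated assumptions: the population scores, the rule linking a person's score to their chance of volunteering, and the function name `simulate_volunteer_bias` are all invented for the example and do not come from the notes.

```python
import random
import statistics


def simulate_volunteer_bias(pop_size=10_000, sample_size=500, seed=0):
    """Compare a simple random sample with a self-selected (volunteer) sample."""
    rng = random.Random(seed)

    # Hypothetical population: one numeric score per person (mean 50, sd 10).
    population = [rng.gauss(50, 10) for _ in range(pop_size)]

    # Simple random sample: every member has the same chance of selection.
    srs = rng.sample(population, sample_size)

    # Volunteer sample (assumed mechanism): the chance of volunteering rises
    # linearly with the score, from 0 at a score of 30 to 1 at a score of 70.
    volunteers = [
        x for x in population
        if rng.random() < min(1.0, max(0.0, (x - 30) / 40))
    ]
    volunteer_sample = rng.sample(volunteers, sample_size)

    return {
        "population_mean": statistics.fmean(population),
        "srs_mean": statistics.fmean(srs),
        "volunteer_mean": statistics.fmean(volunteer_sample),
    }


if __name__ == "__main__":
    for name, value in simulate_volunteer_bias().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the simple random sample mean stays close to the population mean, while the volunteer sample mean is shifted upward, because high scorers are systematically over-represented among volunteers.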

