Transcription of Introduction to Estimation
Introduction to Estimation
OPRE 6301

Statistical Inference

Statistical inference is the process by which we infer population properties from sample properties. There are two types of statistical inference: Estimation and Hypotheses Testing.

The concepts involved are actually very similar, which we will see in due course. Below, we provide a basic introduction to estimation.

Estimation

The objective of estimation is to approximate the value of a population parameter on the basis of a sample statistic. For example, the sample mean \(\bar{X}\) is used to estimate the population mean \(\mu\).

There are two types of estimators: Point Estimator and Interval Estimator.

Point Estimator

A point estimator draws inferences about a population by estimating the value of an unknown parameter using a single value.

Note that for a continuous variable, the probability of assuming any particular value is zero. Hence, we are only trying to generate a value that is close to the true value. Point estimators typically do not reflect the effects of larger sample sizes, while interval estimators do.

Interval Estimator

An interval estimator draws inferences about a population by estimating the value of an unknown parameter using an interval. Here, we try to construct an interval that covers the true population parameter with a specified probability.

As an example, suppose we are trying to estimate the mean summer income of students. Then, an interval estimate might say that the (unknown) mean income is between $380 and $420 with a stated probability.

Properties of Estimators

The desirability of an estimator is judged by its characteristics. Three important criteria are unbiasedness, consistency, and efficiency. Details follow.

Unbiasedness

An unbiased estimator of a population parameter is an estimator whose expected value is equal to that parameter. Formally, an estimator \(\hat{\theta}\) for parameter \(\theta\) is said to be unbiased if:

\[ E(\hat{\theta}) = \theta. \tag{1} \]

Example: The sample mean \(\bar{X}\) is an unbiased estimator for the population mean \(\mu\), since

\[ E(\bar{X}) = \mu. \]
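
The unbiasedness of the sample mean can also be checked numerically. The sketch below is only an illustration under assumed values: the normal population with mean 400 and standard deviation 50, the sample size, and the helper name `average_estimate` are all hypothetical and not part of the original notes. It draws many samples, computes the sample mean of each, and averages the results, which should land close to the population mean.

```python
import random
import statistics

def average_estimate(estimator, mu, sigma, n, trials, seed=0):
    """Average the value of an estimator over many simulated samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    estimates = []
    for _ in range(trials):
        # Assumed population: normal with mean mu and standard deviation sigma.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        estimates.append(estimator(sample))
    return statistics.fmean(estimates)

# Hypothetical population parameters, chosen only for illustration.
MU, SIGMA = 400.0, 50.0

avg_of_means = average_estimate(statistics.fmean, MU, SIGMA, n=10, trials=20000)
print(f"average of the sample means: {avg_of_means:.2f} (population mean: {MU})")
```

A simulation like this does not prove unbiasedness; it only shows that the long-run average of the sample mean behaves as equation (1) predicts.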

It is important to realize that other estimators for the population mean exist: the maximum value in a sample, the minimum value in a sample, the average of the maximum and the minimum values in a sample, and so on.

Being unbiased is a minimal requirement for an estimator. For example, the maximum value in a sample is not unbiased, and hence should not be used as an estimator for \(\mu\).

Consistency

An unbiased estimator is said to be consistent if the difference between the estimator and the target population parameter becomes smaller as we increase the sample size. Formally, an unbiased estimator \(\hat{\theta}\) for parameter \(\theta\) is said to be consistent if \(V(\hat{\theta})\) approaches zero as \(n \to \infty\).

Note that being unbiased is a precondition for an estimator to be consistent.

Example 1: The variance of the sample mean \(\bar{X}\) is \(\sigma^2/n\), which decreases to zero as we increase the sample size \(n\). Hence, the sample mean is a consistent estimator for \(\mu\).

Example 2: The variance of the average of two randomly selected values in a sample does not decrease to zero as we increase \(n\).
This variance in fact stays constant!

Efficiency

Suppose we are given two unbiased estimators for a parameter. Then, we say that the estimator with a smaller variance is more efficient.

Example 1: For a normally distributed population, it can be shown that the sample median is an unbiased estimator for \(\mu\). It can also be shown, however, that the sample median has a greater variance than that of the sample mean, for the same sample size. Hence, \(\bar{X}\) is a more efficient estimator than the sample median.

Example 2: Consider the following estimator. First, a random portion of a sample is discarded from an original sample; then, the mean of the retained values in the sample is taken as an estimate for \(\mu\). This estimator is unbiased, but is not as efficient as using the entire sample. The intuitive reasoning is that we are not fully utilizing available information, and hence the resulting estimator has a greater variance.

Estimating \(\mu\) When \(\sigma^2\) is Known
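
For constructing point estimates, the sample mean \(\bar{X}\) is the best estimator (according to our criteria above) for the population mean \(\mu\).

Suppose the variance \(\sigma^2\) of a population is known. How does one construct an interval estimate for \(\mu\)?

The key idea is that from the central limit theorem, we know that when \(n\) is sufficiently large, the standardized variable

\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \]

approximately follows the standard normal distribution. It is important to realize that this is true even though we do not know the value of \(\mu\). The value of \(\sigma\), however, is assumed to be given (this assumption, which could be unrealistic, will be relaxed later).

It follows that for a given \(\alpha\), we have

\[ P\left( \mu - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \bar{X} < \mu + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \right) = 1 - \alpha. \]

Since our unknown is actually \(\mu\), the above can be rearranged into:

\[ P\left( \bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \right) = 1 - \alpha. \]

That is, the probability for the interval

\[ \left( \bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\; \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \right) \tag{2} \]

to contain, or to cover, the unknown population mean \(\mu\) is \(1 - \alpha\); and we now have a so-called confidence interval for \(\mu\).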
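
The interval (2) and its coverage property can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function names `z_interval` and `coverage`, and the simulated normal population with mean 100 and standard deviation 15, are assumptions made for the example. The quantile \(z_{\alpha/2}\) is taken from the standard library's `statistics.NormalDist`.

```python
import math
import random
from statistics import NormalDist, fmean

def z_interval(sample, sigma, alpha=0.05):
    """Interval (2): x_bar -/+ z_{alpha/2} * sigma / sqrt(n), with sigma assumed known."""
    n = len(sample)
    x_bar = fmean(sample)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 quantile of N(0, 1)
    half_width = z * sigma / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

def coverage(mu, sigma, n, alpha, trials, seed=1):
    """Fraction of simulated intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lower, upper = z_interval(sample, sigma, alpha)
        if lower < mu < upper:
            hits += 1
    return hits / trials

# Hypothetical population, chosen only for illustration.
cov = coverage(mu=100.0, sigma=15.0, n=25, alpha=0.05, trials=10000)
print(f"empirical coverage: {cov:.3f} (nominal: 0.95)")
```

The empirical coverage should sit near \(1 - \alpha\); small deviations are expected because the simulation uses a finite number of trials.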

Note that the interval estimator (2) is constructed from \(\bar{X}\), \(z_{\alpha/2}\), \(\sigma\), and \(n\), all of which are known. The user-specified value \(1 - \alpha\) is called the confidence level or coverage probability. In summary, we have:

Lower Confidence Limit: \(\bar{X} - z_{\alpha/2}\,\sigma/\sqrt{n}\)
Upper Confidence Limit: \(\bar{X} + z_{\alpha/2}\,\sigma/\sqrt{n}\)
Confidence Level: \(1 - \alpha\)

Interpretation: If the interval estimator (2) is used repeatedly to estimate the mean \(\mu\) of a given population, then \(100(1 - \alpha)\%\) of the constructed intervals will cover \(\mu\).

The often-heard media statement "19 times out of 20" refers to a confidence level of 95%. Such a statement is good, since it emphasizes the fact that we are correct only 95% of the time.

Example: Demand during Lead Time

A computer company delivers computers directly to customers who order via the Internet. To reduce inventory cost, the company employs an inventory model. This model requires information about the mean demand during delivery lead time between a central manufacturing facility and local warehouses. Past experience indicates that lead-time demand is normally distributed with a standard deviation of 75 computers per lead time (which is also random).
Construct the 95% confidence interval for the mean demand. Demand data for a sample of 25 lead-time periods are given in an accompanying data file.

Solution: Since \(1 - \alpha = 0.95\), we have \(\alpha = 0.05\) and hence \(\alpha/2 = 0.025\), for which \(z_{0.025} = 1.96\). The sample mean \(\bar{X}\) is obtained from the given data file. The confidence interval is therefore (see (2))

\[ \left( \bar{X} - 1.96\,\frac{75}{\sqrt{25}},\; \bar{X} + 1.96\,\frac{75}{\sqrt{25}} \right) \]

or simply \((\bar{X} - 29.4,\; \bar{X} + 29.4)\).

Width of Confidence Interval

Suppose we are told with 95% confidence that the average starting salary of accountants is between $15,000 and $100,000. Clearly, this provides little information, despite the high confidence level.

Now, suppose instead: with 95% confidence, the average starting salary of accountants is between $42,000 and $45,000. This second statement of course offers more precise information. Thus, for a given \(\alpha\), the width of a confidence interval conveys the extent of precision of the estimate.

To reduce the width, or to increase precision, we can increase the sample size. In general, recall that the upper and lower confidence limits are:

\[ \bar{X} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}. \]

Hence, the width of the confidence interval is \(2 z_{\alpha/2}\,\sigma/\sqrt{n}\). It follows that precision depends on \(\alpha\), \(\sigma\), and \(n\).
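
The dependence of the width \(2 z_{\alpha/2}\,\sigma/\sqrt{n}\) on \(\alpha\), \(\sigma\), and \(n\) can be explored numerically. The following is a small sketch; the helper name `ci_width` is hypothetical, and the baseline values (\(\sigma = 75\), \(n = 25\), \(\alpha = 0.05\)) are borrowed from the lead-time example purely for illustration. Each comparison changes one input at a time.

```python
import math
from statistics import NormalDist

def ci_width(sigma, n, alpha):
    """Width of the known-sigma confidence interval: 2 * z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return 2 * z * sigma / math.sqrt(n)

# Baseline taken from the lead-time example: sigma = 75, n = 25, 95% confidence.
base = ci_width(sigma=75, n=25, alpha=0.05)

# Vary one input at a time and compare against the baseline.
smaller_alpha = ci_width(sigma=75, n=25, alpha=0.01)   # 99% confidence
larger_sigma = ci_width(sigma=150, n=25, alpha=0.05)   # doubled sigma
larger_n = ci_width(sigma=75, n=100, alpha=0.05)       # quadrupled n

print(f"baseline width:       {base:.2f}")
print(f"smaller alpha (0.01): {smaller_alpha:.2f}")
print(f"larger sigma (150):   {larger_sigma:.2f}")
print(f"larger n (100):       {larger_n:.2f}")
```

Because the width scales with \(1/\sqrt{n}\), quadrupling the sample size halves the width, whereas doubling \(\sigma\) doubles it.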

A smaller \(\alpha\) implies a wider interval. For example, increasing the confidence level from 90% (\(\alpha/2 = 5\%\)) to 95% (\(\alpha/2 = 2.5\%\)) widens the interval from \(2 z_{.05}\,\sigma/\sqrt{n}\) to \(2 z_{.025}\,\sigma/\sqrt{n}\).

[Figure: confidence intervals at the 90% and 95% confidence levels]

A larger \(\sigma\) implies a wider interval.

[Figure: 90% confidence intervals of width \(2 z_{.05}\,\sigma/\sqrt{n}\) for two values of \(\sigma\)]

A larger \(n\) implies a narrower interval.

[Figure: 90% confidence intervals for two sample sizes]

Selecting the Sample Size

To control the width of the confidence interval, we can choose a necessary sample size. Formally, suppose we wish to estimate the mean to within \(w\) units. This means that we wish to construct an interval estimate of the form \(\bar{X} \pm w\). Solving the equation

\[ w = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}, \]

we obtain

\[ n = \left( \frac{z_{\alpha/2}\,\sigma}{w} \right)^2, \]

the required sample size.

Example: Tree Diameters

A lumber company must estimate the mean diameter of trees in an area of forest to determine whether or not there is sufficient lumber to harvest. They need to estimate this to within 1 inch at a confidence level of 99%. Suppose the tree diameters are normally distributed with a standard deviation of 6 inches.
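What sample size is sufficient to guarantee this?

Solution: The required precision is \(\pm 1\) inch. That is, \(w = 1\). For \(\alpha = 0.01\), we have \(z_{\alpha/2} = z_{0.005} = 2.575\). Hence,

\[ n = \left( \frac{z_{\alpha/2}\,\sigma}{w} \right)^2 = \left( \frac{2.575 \times 6}{1} \right)^2 \approx 238.7. \]

Therefore, we need to sample at least 239 trees to achieve a 99% confidence interval of \(\bar{X} \pm 1\).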
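
The sample-size formula is easy to wrap in a small helper. The sketch below is illustrative only: the function name `required_sample_size` is hypothetical, and it uses the exact normal quantile from `statistics.NormalDist` rather than a rounded table value, so the intermediate figure differs slightly from a hand calculation. The result is rounded up because a sample size must be a whole number that is at least as large as the formula requires.

```python
import math
from statistics import NormalDist

def required_sample_size(sigma, w, alpha):
    """Smallest integer n satisfying z_{alpha/2} * sigma / sqrt(n) <= w."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil((z * sigma / w) ** 2)

# Tree-diameter example: sigma = 6 inches, within 1 inch, 99% confidence.
n_trees = required_sample_size(sigma=6, w=1, alpha=0.01)
print(f"required sample size: {n_trees}")
```

Since \(n\) grows with \((\sigma/w)^2\), halving the tolerance \(w\) roughly quadruples the required sample size.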