### The Truncated Normal Distribution

John Burkardt, Department of Scientific Computing, Florida State University, jburkardt/

October 2014

#### Abstract

The normal distribution is a common model of randomness. Unlike the uniform distribution, it proposes a most probable value which is also the mean, while other values occur with a probability that decreases in a regular way with distance from the mean. This behavior is mathematically very satisfying, and has an easily observed correspondence with many physical processes. One drawback of the normal distribution, however, is that it supplies a positive probability density to every value in the range $(-\infty, +\infty)$, although the actual probability of an extreme event will be very low. In many cases, it is desired to use the normal distribution to describe the random variation of a quantity that, for physical reasons, must be strictly positive.

A mathematically defensible way to preserve the main features of the normal distribution while avoiding extreme values involves the truncated normal distribution, in which the range of definition is made finite at one or both ends of the interval. It is the purpose of this report to describe the truncation process, to consider how certain basic statistical properties of the new distribution can be determined, to show how to efficiently sample the distribution, and how to construct an associated quadrature rule, or even a sparse grid quadrature rule for a problem with [...]

#### Contents

- The Standard Normal Distribution
  - Mathematical Definition
  - The Mean and Variance
  - The Cumulative Distribution Function
  - The Inverse Cumulative Distribution Function
  - Sampling the Normal Distribution

  - Moments of the Standard Normal
  - Central Moments and the Variance
  - Quadrature Rule Computation
  - Orthogonal Polynomial Family
  - The Golub Welsch Procedure
  - Quadrature Example
  - Product Rules
  - Sparse Grid Rules
- The General Normal Distribution
  - Mathematical Definition
  - The Mean and Variance
  - Mapping to and from the Standard Normal
  - The Cumulative Distribution Function
  - The Inverse Cumulative Distribution Function
  - Sampling the General Normal Distribution
  - Moments of the General Normal
  - Central Moments of the General Normal
  - Quadrature Rules, Product Rules, Sparse Grid Rules
- The Truncated Normal Distribution
  - Mathematical Definition
  - Effect of the Truncation Range
  - The Cumulative Density Function

  - The Inverse Cumulative Density Function
  - Sampling the Truncated Normal Distribution
  - The Mean
  - The Variance
  - Moments
  - Central Moments
  - Quadrature Rule Computation
  - Experiment with the Quadrature Rule
  - The Multidimensional Case
  - Experiment with the Product Rule
  - Definition of a Sparse Grid Rule
  - Implementation of a Sparse Grid Rule
  - Experiment with the Sparse Grid
- Software
- Conclusion

#### 1 The Standard Normal Distribution

##### Mathematical Definition

The standard normal distribution is a probability density function (PDF) defined over the interval $(-\infty, +\infty)$. The function is often symbolized as $\phi(0,1;x)$. It may be represented by the following formula:

$$\phi(0,1;x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$$

Like any PDF associated with a continuous variable, $\phi(0,1;x)$ may be interpreted to assert that the probability that an object $x$, randomly drawn from a group that obeys the standard normal distribution, will have a value that falls between the values $a$ and $b$ is:

$$\Pr(a \le x \le b) = \int_a^b \phi(0,1;x)\,dx$$

##### The Mean and Variance

The *mean* of a distribution $\rho(x)$, symbolized by $\mu$ or $\operatorname{mean}(\rho(\cdot))$, may be thought of as the average over all values in the range.

If we assume the range is $(a,b)$, then it is defined as the following weighted integral:

$$\operatorname{mean}(\rho(\cdot)) = \int_a^b x\,\rho(x)\,dx$$

Figure 1: The standard normal PDF

Because the standard normal distribution is symmetric about the origin, it is immediately obvious that $\operatorname{mean}(\phi(0,1;\cdot)) = 0$.

The *variance* of a distribution $\rho(x)$, symbolized by $\operatorname{var}(\rho(\cdot))$, is a measure of the average squared distance between a randomly selected item and the mean. Assuming the mean is known, the variance is defined as:

$$\operatorname{var}(\rho(\cdot)) = \int_a^b (x-\mu)^2\,\rho(x)\,dx$$

For the standard normal distribution, we have that $\operatorname{var}(\phi(0,1;\cdot)) = 1$.

Note that the *standard deviation* of any distribution, represented by $\operatorname{std}(\rho(\cdot))$, is simply the square root of the variance, so for the standard normal distribution, we also have that $\operatorname{std}(\phi(0,1;\cdot)) = 1$.

##### The Cumulative Distribution Function

Recall that any probability density function $\rho(x)$ can be used to evaluate the probability that a random value falls between given limits $a$ and $b$:

$$\Pr(a \le x \le b) = \int_a^b \rho(x)\,dx$$

Assuming that our values range over the interval $(-\infty, +\infty)$, we may define the function $F(\rho;b)$, the probability that a random value is less than or equal to $b$:

$$F(\rho;b) = \Pr(x \le b) = \int_{-\infty}^{b} \rho(x)\,dx$$

If it is possible to evaluate $F(\rho;b)$

or to tabulate it at regular intervals, we can use this function to compute the probability of any interval, since

$$\Pr(a \le x \le b) = \int_a^b \rho(x)\,dx = F(\rho;b) - F(\rho;a)$$

A function like $F(\rho;x)$ is known as the *cumulative distribution function* or CDF for the corresponding PDF $\rho(x)$.

Figure 2: The standard normal CDF

Figure 3: The error function ERF

In the case of the standard normal distribution, the CDF is denoted by $\Phi(0,1;x)$, and is defined by

$$\Phi(0,1;x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\,dt$$

There is no simple formula to evaluate the normal CDF. Instead, there are extensive tables and computational algorithms. One common approach is based on a relationship with the *error function*. The error function, symbolized by $\operatorname{erf}(x)$, is defined by

$$\operatorname{erf}(x) = \frac{1}{\sqrt{\pi}} \int_{-x}^{x} e^{-t^2}\,dt$$

Thus, the error function can be related to the CDF of the standard normal distribution:

$$\Phi(0,1;x) = \frac{1}{2}\left(1 + \operatorname{erf}\left(\frac{x}{\sqrt{2}}\right)\right)$$

so if an automatic procedure is available to evaluate $\operatorname{erf}(x)$, it is easy to evaluate $\Phi(0,1;x)$ as well.
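As a hedged illustration of the relationship between the normal CDF and the error function, the following Python sketch evaluates the standard normal CDF with the standard library's `math.erf` and uses it to compute an interval probability as a difference of CDF values. The function names `normal_pdf`, `normal_cdf`, and `interval_probability` are hypothetical and do not come from the report.

```python
import math


def normal_pdf(x: float) -> float:
    """Standard normal PDF: exp(-x^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x: float) -> float:
    """Standard normal CDF via the identity (1 + erf(x / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def interval_probability(a: float, b: float) -> float:
    """Pr(a <= x <= b), computed as the difference of two CDF values."""
    return normal_cdf(b) - normal_cdf(a)


if __name__ == "__main__":
    # Probability of landing within one standard deviation of the mean.
    print(interval_probability(-1.0, 1.0))
    # Probability of a value at or below the mean.
    print(normal_cdf(0.0))
```

Note that forming `1 + erf(...)` loses relative accuracy far out in the lower tail, so this sketch is best suited to moderate values of `x`.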

For instance, MATLAB has a built-in function `erf(x)` and Mathematica has `Erf[x]`.

Software for directly evaluating the standard normal CDF includes Algorithm AS 66 by David Hill [10].

Figure 4: The standard normal inverse CDF

##### The Inverse Cumulative Distribution Function

Because the standard normal PDF is everywhere positive and integrable, it follows that the CDF $\Phi(0,1;x)$ is a strictly monotone function on $(-\infty, +\infty)$ which takes on, exactly once, every value in the open interval $(0,1)$. This implies the existence of an inverse cumulative distribution function (iCDF), denoted by $\Phi^{-1}(0,1;p)$, defined on $(0,1)$ and returning values in $(-\infty, +\infty)$, such that

$$\Phi^{-1}(0,1;\Phi(0,1;x)) = x$$

$$\Phi(0,1;\Phi^{-1}(0,1;p)) = p$$

The inverse CDF allows us to start with a probability $0 < p < 1$, and return a cutoff value $\Phi^{-1}(p) = x$, such that the probability of a value that is less than or equal to $x$ is precisely $p$.

We will see in a moment how access to such a function allows us to appropriately sample the density. In statistics, the inverse CDF of the normal distribution is sometimes referred to as the percentage points of the distribution.

Because of the relationship between the normal CDF and the error function, the inverse error function can be used to evaluate the iCDF. In particular, we have:

$$\begin{aligned}
p &= \Phi(0,1;x) = \frac{1}{2}\left(1 + \operatorname{erf}\left(\frac{x}{\sqrt{2}}\right)\right) \\
2p - 1 &= \operatorname{erf}\left(\frac{x}{\sqrt{2}}\right) \\
\operatorname{erf}^{-1}(2p - 1) &= \frac{x}{\sqrt{2}} \\
\sqrt{2}\,\operatorname{erf}^{-1}(2p - 1) &= x \\
x = \Phi^{-1}(0,1;p) &= \sqrt{2}\,\operatorname{erf}^{-1}(2p - 1)
\end{aligned}$$

and thus, if we have access to an inverse error function, we can compute the inverse of the standard normal CDF as well.

Inverse error functions are available, for instance, in MATLAB as `erfinv()`, and in Mathematica as `InverseErf[]`.

Software to directly compute the inverse CDF of the standard normal distribution includes Applied Statistics Algorithm 111 by Beasley and Springer [1], Applied Statistics Algorithm 241 by Wichura [13], and the software package CDFLIB by Barry Brown, James Lovato, and Kathy Russell [2].
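The inversion described above can be sketched in Python under some stated assumptions. The Python standard library has no inverse error function, so this sketch swaps in a different technique: bisection on the erf-based CDF, which works because the CDF is strictly monotone. The names `normal_cdf` and `normal_icdf_bisect`, the bracket $[-40, 40]$, and the tolerance are illustrative choices, not part of the report. The standard library's `statistics.NormalDist().inv_cdf` serves only as an independent cross-check.

```python
import math
from statistics import NormalDist


def normal_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_icdf_bisect(p: float, lo: float = -40.0, hi: float = 40.0,
                       tol: float = 1e-12) -> float:
    """Approximate the inverse CDF by bisection on the monotone CDF.

    Assumes 0 < p < 1 and that the answer lies inside [lo, hi].
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in the open interval (0, 1)")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if normal_cdf(mid) < p:
            lo = mid   # the cutoff lies to the right of mid
        else:
            hi = mid   # the cutoff lies at or to the left of mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    p = 0.975
    x_bisect = normal_icdf_bisect(p)
    x_library = NormalDist().inv_cdf(p)
    print(x_bisect, x_library)
    # Round trip: applying the CDF to the cutoff should recover p.
    print(normal_cdf(x_bisect))
```

Bisection is slow compared with the dedicated algorithms cited in the text, but it needs nothing beyond a CDF evaluator, which makes it a convenient check on a faster routine.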

Figure 5: 200,000 sample values in 25 bins

##### Sampling the Normal Distribution

Sampling a distribution means to select one item from the range of legal values, using the PDF as the probability of selection. A histogram of the selected data should roughly approximate the shape of a graph of the PDF.

Assuming we have some function `rand()` which is a source of uniform random numbers in the range $(0,1)$, and that we have a means of evaluating $\Phi^{-1}(p)$, it is straightforward to sample the standard normal PDF as follows:

$$\begin{aligned}
p &= \texttt{rand}() \\
x &= \Phi^{-1}(0,1;p)
\end{aligned}$$

##### Moments of the Standard Normal

The $k$-th moment of a PDF $\rho(x)$, which may be denoted $\mu'_k(\rho(\cdot))$, is the weighted integral of $x^k$ over the range of the PDF:

$$\mu'_k(\rho(\cdot)) = \int_a^b x^k\,\rho(x)\,dx$$

In particular, $\mu'_0 = 1$ (because $\rho(\cdot)$ is a PDF) and $\mu'_1 = \operatorname{mean}(\rho(\cdot))$, the mean value of the distribution.

Because the standard normal PDF is symmetric about the origin, all the moments of odd index are zero. The general formula is

$$\mu'_k(\phi(0,1;\cdot)) = \begin{cases} 0 & \text{if } k \text{ is odd;} \\ (k-1)!! & \text{if } k \text{ is even.} \end{cases}$$

Here, for even $k$, the notation $(k-1)!!$ indicates the *double factorial* function, $(k-1)!! = (k-1)(k-3)\cdots 3 \cdot 1$.

##### Central Moments and the Variance

The $k$-th central moment of a PDF $\rho(x)$, which may be denoted $\mu_k(\rho(\cdot))$, is the weighted integral of the difference $(x-\mu)^k$ over the range of the PDF:

$$\mu_k(\rho(\cdot)) = \int_a^b (x-\mu)^k\,\rho(x)\,dx$$

In particular, $\mu_2(\rho(\cdot)) = \operatorname{var}(\rho(\cdot))$.

Because the standard normal distribution has zero mean, the central moments are the same as the moments, and so

$$\mu_k(\phi(0,1;\cdot)) = \begin{cases} 0 & \text{if } k \text{ is odd;} \\ (k-1)!! = (k-1)(k-3)\cdots 3 \cdot 1 & \text{if } k \text{ is even.} \end{cases}$$

In particular, we note that $\mu_2(\phi(0,1;\cdot)) = \operatorname{var}(\phi(0,1;\cdot)) = 1$.

##### Quadrature Rule Computation

We expect to encounter integrals of the form

$$I(f) = \int_{-\infty}^{+\infty} f(x)\,\phi(0,1;x)\,dx$$

and we wish to be able to approximate such integrals by using a *quadrature rule*.

A quadrature rule for the normal PDF $\phi(0,1;x)$ is a set of $n$ points $x_i$ and weights $w_i$ for which we can make the integral estimate:

$$\int_{-\infty}^{+\infty} f(x)\,\phi(0,1;x)\,dx = I(f) \approx Q(f) = \sum_{i=1}^{n} w_i\,f(x_i)$$

A quadrature rule is said to have *precision* $k$ if $I(x^j) = Q(x^j)$ for integers $0 \le j \le k$.
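The definition of precision can be checked numerically, because the exact values $I(x^j)$ are the moments given by the double factorial formula. The Python sketch below does this for one small rule. The rule itself is an assumption introduced for illustration: the well-known 3-point Gauss-Hermite rule for the standard normal weight, with points $-\sqrt{3}, 0, \sqrt{3}$ and weights $1/6, 2/3, 1/6$. It is not taken from this section, and all function names are hypothetical.

```python
import math


def double_factorial(n: int) -> int:
    """n!! = n (n-2) (n-4) ..., with the convention (-1)!! = 0!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def standard_normal_moment(k: int) -> int:
    """Exact k-th moment of the standard normal: 0 if k is odd, (k-1)!! if even."""
    return 0 if k % 2 == 1 else double_factorial(k - 1)


# Assumed example rule: 3-point Gauss-Hermite rule for the weight phi(0,1;x).
POINTS = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)


def quadrature(f) -> float:
    """Q(f) = sum of w_i * f(x_i) over the rule's points and weights."""
    return sum(w * f(x) for x, w in zip(POINTS, WEIGHTS))


def precision(max_degree: int = 10, tol: float = 1e-12) -> int:
    """Largest k with Q(x^j) equal to I(x^j), within tol, for all 0 <= j <= k."""
    k = -1
    for j in range(max_degree + 1):
        q = quadrature(lambda x, j=j: x ** j)
        if abs(q - standard_normal_moment(j)) > tol:
            break
        k = j
    return k


if __name__ == "__main__":
    for j in range(8):
        print(j, quadrature(lambda x, j=j: x ** j), standard_normal_moment(j))
    print("precision:", precision())
```

For this rule the monomials through degree 5 are integrated exactly, while degree 6 gives 9 instead of the exact moment 15, so the measured precision is 5, matching the $2n - 1$ expected of an $n$-point Gauss rule.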