
Statistical Data Analysis - Sherry Towers



Transcription of Statistical Data Analysis - Sherry Towers

Statistical Data Analysis

GLEN COWAN
University of Siegen

CLARENDON PRESS, OXFORD
1998

Oxford University Press, Great Clarendon Street, Oxford OX2 6DP

Oxford New York Athens Auckland Bangkok Bogota Bombay Buenos Aires Calcutta Cape Town Dar es Salaam Delhi Florence Hong Kong Istanbul Karachi Kuala Lumpur Madras Madrid Melbourne Mexico City Nairobi Paris Singapore Taipei Tokyo Toronto Warsaw and associated companies in Berlin Ibadan

Oxford is a registered trade mark of Oxford University Press

Published in the United States by Oxford University Press Inc., New York

© Glen Cowan, 1998

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press. Within the UK, exceptions are allowed in respect of any fair dealing for the purpose of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act, 1988, or in the case of reprographic reproduction in accordance with the terms of licences issued by the Copyright Licensing Agency.

Enquiries concerning reproduction outside those terms and in other countries should be sent to the Rights Department, Oxford University Press, at the address above.

This book is sold subject to the condition that it shall not, by way of trade or otherwise, be lent, re-sold, hired out, or otherwise circulated without the publisher's prior consent in any form of binding or cover other than that in which it is published and without a similar condition including this condition being imposed on the subsequent purchaser.

A catalogue record for this book is available from the British Library

Library of Congress Cataloging in Publication Data (Data available)

ISBN 0 19 850156 0 (Hbk)
ISBN 0 19 850155 2 (Pbk)

Typeset by the author
Printed in Great Britain by Bookcraft (Bath) Ltd, Midsomer Norton, Avon

Preface

The following book is a guide to the practical application of statistics in data analysis as typically encountered in the physical sciences, and in particular in high energy particle physics.

Students entering this field do not usually go through a formal course in probability and statistics, despite having been exposed to many other advanced mathematical techniques. Statistical methods are invariably needed, however, in order to extract meaningful information from experimental data. The book originally developed out of work with graduate students at the European Organization for Nuclear Research (CERN). It is primarily aimed at graduate or advanced undergraduate students in the physical sciences, especially those engaged in research or laboratory courses which involve data analysis. A number of the methods are widely used but less widely understood, and it is therefore hoped that more advanced researchers will also be able to profit from the material. Although most of the examples come from high energy particle physics, an attempt has been made to present the material in a reasonably general way so that the book can be useful to people in most branches of physics and astronomy.

It is assumed that the reader has an understanding of linear algebra, multivariable calculus and some knowledge of complex analysis. No prior knowledge of probability and statistics, however, is assumed. Roughly speaking, the present book is somewhat less theoretically oriented than that of Eadie et al. [Ead71], and somewhat more so than those of Lyons [Lyo86] and Barlow [Bar89]. The first part of the book, Chapters 1 through 8, covers basic concepts of probability and random variables, Monte Carlo techniques, statistical tests, and methods of parameter estimation. The concept of probability plays, of course, a fundamental role. In addition to its interpretation as a relative frequency as used in classical statistics, the Bayesian approach using subjective probability is discussed as well. Although the frequency interpretation tends to dominate in most of the commonly applied methods, it was felt that certain applications can be better handled with Bayesian statistics, and that a brief discussion of this approach was therefore justified.

The last three chapters are somewhat more advanced than those preceding. Chapter 9 covers interval estimation, including the setting of limits on parameters. The characteristic function is introduced in Chapter 10 and used to derive a number of results which are stated without proof earlier in the book. Finally, Chapter 11 covers the problem of unfolding, i.e. the correcting of distributions for effects of measurement errors. This topic in particular is somewhat specialized, but since it is not treated in many other books it was felt that a discussion of the concepts would be found useful. An attempt has been made to present the most important concepts and tools in a manageably short space. As a consequence, many results are given without proof and the reader is often referred to the literature for more detailed explanations.

It is thus considerably more compact than several other works on similar topics, e.g. those by Brandt [Bra92] and Frodesen et al. [Fro79]. Most chapters employ concepts introduced in previous ones. Since the book is relatively short, however, it is hoped that readers will look at least briefly at the earlier chapters before skipping to the topic needed. A possible exception is Chapter 4 on statistical tests; this could be skipped without a serious loss of continuity by those mainly interested in parameter estimation. The choice of and relative weights given to the various topics reflect the type of analysis usually encountered in particle physics. Here the data usually consist of a set of observed events, e.g. particle collisions or decays, as opposed to the data of a radio astronomer, who deals with a signal measured as a function of time.

The topic of time series analysis is therefore omitted, as is analysis of variance. The important topic of numerical minimization is not treated, since computer routines that perform this task are widely available in program libraries. At various points in the book, reference is made to the CERN program library (CERNLIB) [CER97], as this is the collection of computer software most accessible to particle physicists. The short tables of values included in the book have been computed using CERNLIB routines. Other useful sources of statistics software include the program libraries provided with the books by Press et al. [Pre92] and Brandt [Bra92]. Part of the material here was presented as a half-semester course at the University of Siegen in 1995. Given the topics added since then, most of the book could be covered in 30 one-hour lectures.

Although no exercises are included, an evolving set of problems and additional related material can be found on the book's World Wide Web site. The link to this site can be located via the catalogue of the Oxford University Press home page. The reader interested in practicing the techniques of this book is encouraged to implement the examples on a computer. By modifying the various parameters and the input data, one can gain experience with the methods presented. This is particularly instructive in conjunction with the Monte Carlo method (Chapter 3), which allows one to generate simulated data sets with known properties. These can then be used as input to test the various statistical techniques.

Thanks are due above all to Sönke Adlung of Oxford University Press for encouraging me to write this book as well as for his many suggestions on its content.

In addition I am grateful to Professors Sigmund Brandt and Claus Grupen of the University of Siegen for their support of this project and their feedback on the text. Significant improvements were suggested by Robert Cousins, as well as by many of my colleagues in the ALEPH collaboration, including Klaus Affholderbach, Paul Bright-Thomas, Volker Büscher, Günther Dissertori, Ian Knowles, Ramon Miquel, Ian Tomalin, Stefan Schael, Michael Schmelling and Steven Wasserbaech. Last but not least I would like to thank my wife Cheryl for her patient support.

Geneva, August 1997

Contents

Notation

1 Fundamental concepts
    Probability and random variables
    Interpretation of probability
        Probability as a relative frequency
        Subjective probability
    Probability density functions
    Functions of random variables
    Expectation values
    Error propagation
    Orthogonal transformation of random variables

2 Examples of probability functions
    Binomial and multinomial distributions
    Poisson distribution
    Uniform distribution
    Exponential distribution
    Gaussian distribution
    Log-normal distribution
    Chi-square distribution
    Cauchy (Breit-Wigner) distribution
    Landau distribution

3 The Monte Carlo method
    Uniformly distributed random numbers
    The transformation method
    The acceptance-rejection method
    Applications of the Monte Carlo method

4 Statistical tests
    Hypotheses, test statistics, significance level, power
    An example with particle selection
    Choice of the critical region using the Neyman-Pearson lemma
    Constructing a test statistic
        Linear test statistics, the Fisher discriminant function
        Nonlinear test statistics, neural networks
        Selection of input variables
    Goodness-of-fit tests
    The significance of an observed signal
    Pearson's χ² test

5 General concepts of parameter estimation
    Samples, estimators, bias
    Estimators for mean, variance, covariance

6 The method of maximum likelihood
    ML estimators
    Example of an ML estimator: an exponential distribution
    Example of ML estimators: μ and σ² of a Gaussian
    Variance of ML estimators: analytic method
    Variance of ML estimators: Monte Carlo method
    Variance of ML estimators: the RCF bound
    Variance of ML estimators.

