
An Introduction to Biostatistics Using R - Waveland



GLOVER & MITCHELL

AN INTRODUCTION TO BIOSTATISTICS USING R

© 2001-2015, Kevin Mitchell and Thomas Glover. This work is licensed under a license that allows you to redistribute this book in unmodified form for non-commercial purposes. No Derivatives: If you remix, transform, or build upon the material, you may not distribute the modified material. See the license for full details.

Que por el hilo se sacará el ovillo.
(By a small sample, we may judge of the whole piece.)
Miguel de Cervantes, Don Quixote

Contents

0. Introduction to R
1. Introduction to Data Analysis
2. Introduction to Probability
3. Probability Distributions
4. Sampling Distributions
6. One-Sample Tests of Hypothesis
7. Tests of Hypothesis Involving Two Samples
8. k-Sample Tests of Hypothesis: ANOVA
9. Two-Factor Analysis
10. Linear Regression and Correlation
11. Goodness of Fit Tests for Categorical Data
Index of Examples
Index of Terms and Commands

0. Introduction to R

We assume that you are reading this supplement to An Introduction to Biostatistics because your instructor has decided to use R as the statistical software for your course or because you are a very motivated student and want to learn both elementary statistics and R at the same time. This supplement does not provide screen shots of the program or lots of help installing R. This is better done by your instructor or TA, who can actually demonstrate the process.

What is R?

R is a language and environment for statistical computing and graphics. It is available as Free Software under the terms of the Free Software Foundation's GNU General Public License in source code form. It compiles and runs on a wide variety of operating systems such as Windows, MacOS, and most UNIX platforms and similar systems (including FreeBSD and Linux).

While there are many statistical software programs available, we have chosen to provide this supplement to An Introduction to Biostatistics using R because it is both powerful and free. To learn much more about R, visit the R-project homepage. Note: If you are reading this supplement either online or as a pdf on your own computer or tablet, just click on any link to be redirected to the appropriate web page.

Installation of R

If you don't already have R on your computer or have access to a copy of R through your institution, then you can download R for free from the R project's website. Follow the instructions. We assume your instructor or TA or a friend will help you with this. There are useful YouTube videos available, as well.

Getting Started

There are several great free online resources to help you learn more about R. Start with the R-project homepage. Here are a few others that are written for those just beginning to use R.

- John Verzani's 113-page Introduction to R is well worth reading.
- The web pages "Software Resources for R" are useful, especially the "Getting Started" link.
- A brief "R: A self-learn tutorial" is available online.
- A series of tutorials by Google Developers consisting of 21 short videos (total length, one hour and seven minutes) is available online. You may wish to watch the first few for tips on getting started. Return to the later videos as you gain more experience.
- Kelly Black's extensive online "R Tutorial" is available online.
- Another 100-page Introduction to R, by Germán Rodríguez, is available online.
- A very short Introduction to R and the RStudio graphical interface is also available. To keep things simple, we will not assume that you are using RStudio.

There are many other introductions to R available online and more being written all the time. Make use of them.

Starting R

Assuming that you have installed R and that you have started the program, you are faced with some text similar to

R version 3.1.1 (2014-07-10) -- "Sock it to Me"
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

>

The > is called the prompt. In the examples that follow, the prompt is not something you type. Rather, it is the symbol that R uses to indicate that it is waiting for you to type in the next command. If a command line is too long to fit on a single line, a + is automatically inserted by R to indicate the continuation of the prompt on the next line. We will remind you about this when it first occurs in these materials.

One can do basic arithmetic in R. For example, we can add 4 and 5 in the obvious way. Be sure to hit "Return" after typing.

> 4 + 5
## [1] 9

The lightly shaded background above and the distinctive typeface distinguish R input and output from the explanatory text surrounding it. Material following the prompt > is what we typed. Lines that begin with two hashtags ## are the output that results from an R command.

If you are looking at a color version of these materials, you will also see that the various types of R input are color coded.

The other basic mathematical operations work the same way: use - for subtraction, * for multiplication, and / for division.

R has many built-in functions. For example, to find the square root of 3, use the sqrt( ) function:

> sqrt(3)
## [1] 1.732051

Creating and Accessing Data Sets

Most statistical calculations require a data set containing several numbers. There are several ways to enter a data set. If the data consist of only a few numbers, the c( ) function can be used. This function combines or concatenates terms. Suppose the heights of ten male faculty were recorded in cm: 171, 177, 178, 175, 202, 180, 192, 182, 195, and 190. These data can be entered into a single object (or variable) called height. To do so, type

> height <- c(171, 177, 178, 175, 202, 180, 192, 182, 195, 190)
> height
## [1] 171 177 178 175 202 180 192 182 195 190

There are a few things to notice.

- All of the values were assigned to a single object called height. Try to make the names of your objects meaningful in the context of the problem or question. We could have simply called the data x, but if we looked back at it later we might not know to what x referred.
- The assignment operator is an arrow formed by typing <-. The arrow indicates that we are assigning the values 171, 177, 178, 175, 202, 180, 192, 182, 195, and 190 to the R object (or variable) height.
- The values of height did not automatically print out. However, typing the name of the object will cause its value(s) to be printed.
- The [1] indicates that the output is a vector (a sequence of numbers or other objects) and that the first value printed on the row is actually the first value of height.

Many basic functions can be applied directly to entire data sets. For example, to take the square root of each height use

> sqrt(height)
## [1] 13.07670 13.30413 13.34166 13.22876 14.21267 13.41641 13.85641 13.49074
## [9] 13.96424 13.78405

Note that when the values of a function are not stored in a variable, the result is immediately printed.

In the second line of the output above, the leading [9] indicates that the first entry on this line is the ninth element of sqrt(height).

There are 2.54 cm per inch, so to convert the heights to inches, use

> height/2.54 # height in inches
## [1] 67.32283 69.68504 70.07874 68.89764 79.52756 70.86614 75.59055 71.65354
## [9] 76.77165 74.80315

We will often put comments by the commands. These are indicated by a single hashtag #. They are a convenient way to make notes in your computations. Anything following a hashtag, for example # height in inches above, is ignored by R.

To compute the average or mean of the heights above, add all of the heights and divide this sum by the number of heights. In R use the sum( ) function to add all of the entries in a vector.

> sum(height)
## [1] 1842

To find the number of entries in a vector, use the length( ) function.

> length(height)
## [1] 10

So the mean height is

> meanHt <- sum(height)/length(height)
> meanHt
## [1] 184.2

Note that the value of the mean height has been put into the variable meanHt so that it can be re-used without having to recalculate the value.
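As a quick cross-check of the calculation above, the hand-computed mean can be compared with base R's built-in mean( ) function. This is only a minimal sketch: mean( ) has not been introduced in the text so far, and the name builtinMean is chosen here purely for illustration.

```r
# Heights (in cm) of the ten faculty members, as entered earlier
height <- c(171, 177, 178, 175, 202, 180, 192, 182, 195, 190)

# Mean computed "by hand": the sum of the entries divided by their count
meanHt <- sum(height) / length(height)

# Mean computed with base R's built-in function (builtinMean is an illustrative name)
builtinMean <- mean(height)

# all.equal() compares two numbers up to a small numerical tolerance
all.equal(meanHt, builtinMean)
```

Storing both results in variables makes it easy to reuse either one later without recomputing it.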

To square each of the individual heights use

> height^2
## [1] 29241 31329 31684 30625 40804 32400 36864 33124 38025 36100

Order of operations is important. Compare the sum of these squared heights to the square of the sum of the heights. Note the placement of the squaring operator.

> sum(height^2) # sum of the squared heights
## [1] 340196
> sum(height)^2 # square of the sum of the heights
## [1] 3392964
> (sum(height))^2 # square of the sum of the heights, again
## [1] 3392964

The two numbers are different. The extra set of parentheses in the third command above clarifies the order of operations: sum, then square.

More complicated calculations may be formed from basic calculations. For example, the corrected sum of squares is defined as

\sum_{i=1}^{n} (X_i - \bar{X})^2

Given that we calculated meanHt earlier, the corrected sum of squares for the height data can be calculated.

> sum((height-meanHt)^2)
## [1] 899.6

It is easy to carry out basic descriptive statistical operations on data using the many functions built into R, as illustrated below.

> median(height) # the median height
## [1] 181
> min(height) # the minimum height
## [1] 171
> max(height) # the maximum height
## [1] 202

Suppose that the arm spans of these same ten faculty were also measured (in the same order) and that the data were entered into R.

> span <- c(173, 182, 182, 178, 202, 188, 198, 185, 193, 186) # in cm
> span
## [1] 173 182 182 178 202 188 198 185 193 186

To determine the difference between the arm span and height for each person by subtracting height from span, use

> difference <- span - height
> difference
## [1] 2 5 4 3 0 8 6 3 -2 -4

It is often convenient to combine data sets that are related. To combine span, height, and difference into a single table, put the data into a so-called data frame.
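One way to build such a table is with base R's data.frame( ) function, sketched below. The name facultyData is an assumption made for this illustration, and the final line, which selects the rows with a negative difference, is an extra step added only to show how the combined table can be queried.

```r
# Heights and arm spans (in cm) of the same ten faculty members, in the same order
height <- c(171, 177, 178, 175, 202, 180, 192, 182, 195, 190)
span <- c(173, 182, 182, 178, 202, 188, 198, 185, 193, 186)
difference <- span - height

# Combine the three vectors into one table; each vector becomes a named column.
# facultyData is a hypothetical name chosen for this sketch.
facultyData <- data.frame(span, height, difference)

# Typing the name of the data frame prints the whole table
facultyData

# A single column is accessed with the $ operator
facultyData$difference

# Rows for faculty whose arm span is shorter than their height
facultyData[facultyData$difference < 0, ]
```

Keeping related measurements in one data frame ensures that each row always refers to the same person, which is easy to lose track of when the vectors are stored separately.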

