
Introduction to R - Statistics at UC Berkeley


Phil Spector, Statistical Computing Facility, Department of Statistics, University of California, Berkeley


Transcription of Introduction to R - Statistics at UC Berkeley

1 Some Basics

- There are three types of data in R: numeric, character and logical.
- R supports vectors, matrices, lists and data frames.
- Objects can be assigned values using an equal sign (=) or the special <- operator.
- R is highly vectorized: almost all operations work equally well on scalars and arrays.
- All the elements of a matrix or vector must be of the same type.
- Lists provide a very general way to hold a collection of arbitrary R objects.
- A data frame is a cross between a matrix and a list: columns (variables) of a data frame can be of different types, but they all must be the same length.

Typing the name of any object will display a printed representation of it. Alternatively, the print() function can be used to display the entire object. Element numbers are displayed in square brackets. Typing a function's name will display its argument list and definition, but sometimes it's not very enlightening.

The str() function shows the structure of an object.

If you don't assign an expression to an R object, R will display the results, but they are also stored in .Last.value.

Function calls require parentheses, even if there are no arguments. For example, type q() to quit R.
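As a minimal sketch of the basics above, the following creates one vector of each data type, uses vectorized arithmetic, and inspects a data frame with str(). The object names (n, s, b, stuff, df) are illustrative and do not come from the original notes.

```r
# One vector of each basic data type
n <- c(1.5, 2, 3)          # numeric
s <- c("a", "b", "c")      # character
b <- n > 1.8               # logical, computed element by element

# Vectorized arithmetic: no explicit loop is needed
doubled <- n * 2

# A list holds arbitrary objects; a data frame holds equal-length columns
stuff <- list(numbers = n, label = "demo", flags = b)
df <- data.frame(value = n, name = s)

# str() prints a compact summary of an object's structure
str(df)
```

Typing df or stuff alone at the prompt would print the whole object, as described above.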

Square brackets ([ ]) are used for subscripting, and can be applied to any subscriptable R object.

Several functions bring data into R:

- c() - allows direct entry of small vectors in programs.
- scan() - reads data from a file, a URL, or the keyboard into a vector.
  - Can be embedded in a call to matrix() or array().
  - Use the what= argument to read character data.
- read.table() - reads from a file or URL into a data frame.
  - sep= allows a field separator other than white space.
  - header= specifies if the first line of the file contains variable names.
  - Character to factor conversion can be controlled through an argument.
  - Related functions include read.csv() (comma-separated values), read.delim() (tab-separated values), and read.fwf() (fixed width formatted data).
- data() - reads preloaded data sets into R.

Where R stores your data

Each time you start R, it looks for a .RData file in the current directory. If it doesn't exist, one can be created when you save your work. This makes keeping projects separate easy: change to a different directory for each different project. When you end an R session, you will be asked whether or not you want to save the data. You can use the objects() function to list what objects exist in your local database, and the rm() function to remove ones you don't want.
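The input functions listed above can be tried without any external file by passing the data as a string through the text= argument, which scan(), read.table() and read.csv() all accept. This is only a sketch: the sample values and object names are made up for illustration.

```r
# scan() into a numeric vector (quiet = TRUE suppresses the "Read N items" message)
v <- scan(text = "3 1 4 1 5 9", quiet = TRUE)

# scan() embedded in a call to matrix()
m <- matrix(scan(text = "1 2 3 4 5 6", quiet = TRUE), ncol = 3, byrow = TRUE)

# what = "" tells scan() to read character data
words <- scan(text = "red green blue", what = "", quiet = TRUE)

# read.table() with a non-whitespace separator and a header line
df <- read.table(text = "id,score\n1,9.5\n2,7.25", sep = ",", header = TRUE)

# read.csv() applies the comma separator and header by default
df2 <- read.csv(text = "id,score\n1,9.5\n2,7.25")
```

With real data, the text= argument would be replaced by a file name or URL as the first argument.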

You can start R with the --save or --no-save option to avoid being prompted each time you exit R. You can also use the save.image() function to save your data whenever you want.

5 Getting Help

To view the manual page for any R function, use the help(functionname) command, which can be abbreviated by following a question mark (?) by the function name. The help.search("topic") command will often help you get started if you don't know the name of a function. The help.start() command will open a browser pointing to a variety of (locally stored) information about R, including a search engine and access to further documentation. Once the browser is open, all help requests will be displayed in the browser. Many functions have examples, available through the example() function; general demonstrations of R capabilities can be seen through the demo() function.

Libraries in R provide routines for a large variety of tasks. If something seems to be missing from R, it is most likely available in a library. You can list the installed libraries with library(), and view a brief description of a library using library(help=libraryname). Finally, you can load a library with the command library(libraryname). Many libraries are available through CRAN (the Comprehensive R Archive Network). You can install them with the install.packages() function, or through a menu item in some R interfaces; a personal library location can be used if you don't have administrative permissions.

Search Path

When you type a name into the R interpreter, it checks through several directories, known as the search path, to determine what object to use. You can view the search path with the command search().

To find the names of all the objects in a directory on the search path, type objects(pos=num), where num is the numerical position of the directory on the search path. You can add a database to the search path with the attach() function. To make objects from a previous session of R available, pass attach() the location of the saved .RData file. To refer to the elements of a data frame or list without having to retype the object name, pass the data frame or list to attach(). (You can temporarily avoid having to retype the object name by using the with() function.)

8 Sizes of Objects

The nchar() function returns the number of characters in a character string. For numeric data, it returns the number of characters in the printed representation of the number.

The length() function returns the number of elements in its argument. Note that, for a matrix, length() will return the total number of elements in the matrix, while for a data frame it will return the number of columns in the data frame.

For arrays, the dim() function returns a vector of the array's dimensions. For a matrix, it returns a vector of length two with the number of rows and number of columns. For convenience, the nrow() and ncol() functions can be used to get either dimension of a matrix directly.
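The size functions described above behave differently on strings, matrices and data frames, which a short sketch makes concrete. The objects m and df below are illustrative examples, not part of the original notes.

```r
nchar("statistics")        # 10 characters in the string
nchar(12345)               # 5 characters in the printed number

m  <- matrix(1:6, nrow = 2)                   # 2 x 3 matrix
df <- data.frame(a = 1:4, b = letters[1:4])   # 4 rows, 2 columns

length(m)    # 6: total number of elements
length(df)   # 2: number of columns
dim(m)       # 2 3
nrow(df)     # 4
ncol(m)      # 3
dim(1:10)    # NULL: a plain vector has no dimensions

# with() evaluates an expression using the columns of df directly
total <- with(df, sum(a))   # 10
```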

For non-arrays, dim() returns NULL.

The objects() function, called with no arguments, prints the objects in your local database, which is where the objects you create will be stored. The pos= argument allows you to look in other elements of your search path. The pattern= argument allows you to restrict the search to objects whose name matches a pattern. Setting the all.names= argument to TRUE will display object names which begin with a period, which would otherwise be hidden.

The apropos() function accepts a regular expression, and returns the names of objects anywhere in your search path which match it.

get() and assign()

Sometimes you need to retrieve an object from a specific database, temporarily overriding R's search path. The get() function accepts a character string naming an object to be retrieved, and a pos= argument, specifying either a position on the search path or the name of the search path element. Suppose I have an object named x in a database stored in rproject/.RData. I can attach the database and get the object as follows:

    > attach("rproject/.RData")
    > search()
     [1] ".GlobalEnv"           "file:rproject/.RData" "package:methods"
     [4] "package:stats"        "package:graphics"     "package:grDevices"
     [7] "package:utils"        "package:datasets"     "Autoloads"
    [10] "package:base"
    > get("x",2)

The assign() function similarly lets you store an object in a location of your choice.

The c() function attempts to combine objects in the most general way.

For example, if we combine a matrix and a vector, the result is a vector.

    > c(matrix(1:4,ncol=2),1:3)
    [1] 1 2 3 4 1 2 3

Note that the list() function preserves the identity of each of its elements:

    > list(matrix(1:4,ncol=2),1:3)
    [[1]]
         [,1] [,2]
    [1,]    1    3
    [2,]    2    4

    [[2]]
    [1] 1 2 3

12 Combining Objects (cont'd)

When the c() function is applied to lists, it will return a list:

    > c(list(matrix(1:4,ncol=2),1:3),list(1:5))
    [[1]]
         [,1] [,2]
    [1,]    1    3
    [2,]    2    4

    [[2]]
    [1] 1 2 3

    [[3]]
    [1] 1 2 3 4 5

To break down anything into its individual components, use the recursive=TRUE argument of c():

    > c(list(matrix(1:4,ncol=2),1:3),recursive=TRUE)
    [1] 1 2 3 4 1 2 3

The unlist() and unclass() functions may also be useful.

Subscripting in R provides one of the most effective ways to manipulate and select data from vectors, matrices, lists and data frames. R supports several types of subscripts:

- Empty subscripts - allow modification of an object while keeping its size and shape. x = 1 creates a new scalar, x, with a value of 1, while x[] = 1 changes each value of x to 1. Empty subscripts also allow referring to the i-th row of a data frame or matrix as matrix[i,] or the j-th column as matrix[,j].

- Positive numeric subscripts - work like most computer languages. The sequence operator (:) can be used to refer to contiguous portions of an object on both the right- and left-hand side of assignments; arrays can be used to refer to non-contiguous portions.

Subscripting (cont'd)

- Negative numeric subscripts - allow exclusion of selected elements.
- Zero subscripts - subscripts with a value of zero are ignored.
- Character subscripts - used as an alternative to numeric subscripts. Elements of R objects can be named: use names() for vectors or lists, and dimnames(), rownames() or colnames() for matrices and data frames. For lists and data frames, the notation object$name can also be used.
- Logical subscripts - powerful tool for subsetting and modifying data. A vector of logical subscripts, with the same dimensions as the object being subscripted, will operate on those elements for which the subscript is TRUE.

Note: A matrix indexed with a single subscript is treated as a vector made by stacking the columns of the matrix.

Subscripting Operations

Suppose x is a 5 x 3 matrix, with column names defined by dimnames(x) = list(NULL,c("one","two","three")).

- x[3,2] is the element in the 3rd row and 2nd column.
- x[,1] is the first column.
- x[3,] is the third row.
- x[3:5,c(1,3)] is a 3 x 2 matrix derived from the last three rows, and columns 1 and 3 of x.
- x[-c(1,3,5),] is a 2 x 3 matrix created by removing rows 1, 3 and 5.
- x[x[,1]>2,] is a matrix containing the rows of x for which the first column of x is greater than 2.
- x[,c("one","three")] is a 5 x 2 matrix with the first and third columns of x.

16 More on Subscripts

By default, when you extract a single column from a matrix, it becomes a simple vector, which may not be what you want; moreover, if the column was named, the name is lost. To prevent this from happening, you can pass the drop=FALSE argument to the subscript operator:

    > mx = matrix(c(1:4,4:7,5:8),ncol=3,byrow=TRUE,
    +             dimnames=list(NULL,c("one","two","three")))
    > mx[,3]
    [1] 3 5 5 8
    > mx[,3,drop=FALSE]
         three
    [1,]     3
    [2,]     5
    [3,]     5
    [4,]     8

17 [[ Subscripting Operator

A general principle in R is that subscripted objects retain the mode of their parent. For vectors and arrays, this rarely causes a problem, but for lists (and data frames treated like lists), R will often have problems working directly with such objects.

    > mylist = list(1:10,10:20,30:40)
    > mean(mylist[1])
    [1] NA
    Warning message:
    argument is not numeric or logical: returning NA in: mean.default(mylist[1])

For this purpose, R provides double brackets for subscripts, which extract the actual list element, not a list containing the element:

    > mean(mylist[[1]])
    [1] 5.5

For named lists, the problem can also be avoided using the $ notation.

Since a data frame is a cross between a matrix and a list, subscripting operations for data frames are slightly different than for either of those types.
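Before turning to data frames, the matrix and list subscripting rules above can be checked with a small sketch. It assumes a 5 x 3 matrix filled with the values 1 to 15 (the notes do not say what x contains), and all result names are illustrative.

```r
# Illustrative 5 x 3 matrix filled column by column with 1..15
x <- matrix(1:15, nrow = 5, dimnames = list(NULL, c("one", "two", "three")))

elem   <- x[3, 2]                     # single element: row 3, column 2
block  <- x[3:5, c(1, 3)]             # last three rows, columns 1 and 3
kept   <- x[-c(1, 3, 5), ]            # negative subscripts drop rows 1, 3 and 5
big    <- x[x[, 1] > 2, ]             # logical subscript: first column exceeds 2
onecol <- x[, "three", drop = FALSE]  # stays a 5 x 1 matrix, column name kept

# Empty subscript: overwrite every element but keep the shape
z <- x
z[] <- 0

# Single versus double brackets on a list
mylist <- list(1:10, 10:20, 30:40)
class(mylist[1])      # "list": still a list of length one
mean(mylist[[1]])     # 5.5: computed on the actual vector
```

Comparing dim(x[, 3]) with dim(onecol) shows the effect of drop=FALSE: the first is NULL, the second is 5 1.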

Double subscripts in a data frame behave exactly as with matrices. Single subscripts in a data frame refer to a data frame containing the selected columns. To extract the actual column(s), use double brackets or an empty first subscript. A dollar sign ($) can be used to separate a data frame name from a column name to extract the actual column. If the column name has special characters, it must be surrounded by quotes.

x[["name"]], x[,"name"] and x$name are all equivalent, but x["name"] is a data frame with one column. x[1,"name"], x[1,]$name or x[1,]["name"] all access name for the first observation.

If a two column matrix is used as a subscript of a matrix, its rows are interpreted as row and column numbers, for both accessing and assigning values.

    > a = matrix(c(1,1,1,2,2,3),ncol=2,byrow=TRUE)
    > a
         [,1] [,2]
    [1,]    1    1
    [2,]    1    2
    [3,]    2    3
    > x = matrix(0,nrow=4,ncol=3)
    > x[a] = c(10,20,30)
    > x
         [,1] [,2] [,3]
    [1,]   10   20    0
    [2,]    0    0   30
    [3,]    0    0    0
    [4,]    0    0    0

20 Type Conversion Functions

Occasionally it is necessary to treat an object as if it were of a different type. One of the most common cases is to treat a character value as if it were a number. The as.numeric() function takes care of this, and in general there are "as." functions for most types of objects encountered in R. A complete list can be seen with apropos('^as.'). These functions do not permanently change the type of their arguments. Other functions related to type conversion include round() and trunc() for numeric values, and paste() for character values.

Some useful functions for vectors:

- c() - combines values, vectors, and/or lists to create new objects.
- unique() - returns a vector containing one element for each unique value in the vector.
- duplicated() - returns a logical vector which tells if elements of a vector are duplicated with regard to previous ones.
- rev() - reverses the order of elements in a vector.
- sort() - sorts the elements in a vector.
- append() - appends or inserts elements in a vector.
- sum() - sum of the elements of a vector.
- min() - minimum value in a vector.
- max() - maximum value in a vector.

22 Missing Values

In R, missing values are represented by the special value NA. You can assign a missing value by setting a variable equal to NA, but you must use the is.na() function to test for a missing value. Missing values propagate through all calculations, so the presence of even a single missing value can result in a variety of problems. To work around this, use logical subscripting to easily extract non-missing values:

    > values = c(12,NA,19,15,12,17,14,NA,19)
    > values[!]
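The data frame subscripting, type conversion and missing-value handling described above can be sketched together in one short example. The data frame df and its contents are hypothetical, and the final step assumes the usual idiom of negating is.na() inside a logical subscript.

```r
# Illustrative data frame; the scores arrive as character strings
df <- data.frame(name = c("a", "b", "c"),
                 score = c("10", NA, "30"),
                 stringsAsFactors = FALSE)

# Three equivalent ways to extract the actual column
same <- identical(df[["score"]], df$score) &&
        identical(df[, "score"], df$score)

# Single brackets return a one-column data frame instead
one_col <- df["score"]

# as.numeric() returns a converted copy; df$score itself is unchanged
score_num <- as.numeric(df$score)

# NA propagates through calculations ...
mean(score_num)                          # NA

# ... so keep only the non-missing values with a logical subscript
present <- score_num[!is.na(score_num)]
mean(present)                            # 20
```

The logical vector !is.na(score_num) is TRUE exactly where a value is present, so the subscript drops the missing entries without changing score_num itself.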

