
R Data Import/Export

Version (2022-06-23)
R Core Team

This manual is for R, version (2022-06-23).

Copyright © 2000-2022 R Core Team

Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.

Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.

Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the R Core Team.



Table of Contents

Acknowledgements
1 Introduction
    Imports
        Encodings
    Export to text files
    XML
2 Spreadsheet-like data
    Variations on read.table
    Fixed-width-format files
    Data Interchange Format (DIF)
    Using scan directly
    Re-shaping data
    Flat contingency tables
3 Importing from other statistical systems
    EpiInfo, Minitab, S-PLUS, SAS, SPSS, Stata, Systat
    Octave
4 Relational databases
    Why use a database?
    Overview of RDBMSs
        Data types
    R interface packages
        Packages using DBI
        Package RODBC
5 Binary files
    Binary data formats
    dBase files (DBF)
6 Image files
7 Connections
    Types of connections
    Output to connections
    Input from connections
        Pushback
    Listing and manipulating connections
    Binary connections
        Special values
8 Network interfaces
    Reading from sockets
9 Reading Excel spreadsheets
Appendix A References
Function and variable index
Concept index

Acknowledgements

The relational databases part of this manual is based in part on an earlier manual by Douglas Bates and Saikat DebRoy. The principal author of this manual was Brian Ripley.

Many volunteers have contributed to the packages used here. The principal authors of the packages mentioned are

    DBI: David A. James
    dataframes2xls: Guido van Steen
    foreign: Thomas Lumley, Saikat DebRoy, Douglas Bates, Duncan Murdoch and Roger Bivand
    gdata: Gregory R. Warnes
    ncdf4: David Pierce
    rJava: Simon Urbanek
    RJDBC: Simon Urbanek
    RMySQL: David James and Saikat DebRoy
    RNetCDF: Pavel Michna
    RODBC: Michael Lapsley and Brian Ripley
    ROracle: David A. James
    RPostgreSQL: Sameer Kumar Prayaga and Tomoaki Nishiyama
    RSPerl: Duncan Temple Lang
    RSPython: Duncan Temple Lang
    RSQLite: David A. James
    SJava: John Chambers and Duncan Temple Lang
    WriteXLS: Marc Schwartz
    XLConnect: Mirai Solutions GmbH
    XML: Duncan Temple Lang

Brian Ripley is the author of the support for connections.

1 Introduction

Reading data into a statistical system for analysis and exporting the results to some other system for report writing can be frustrating tasks that can take far more time than the statistical analysis itself, even though most readers will find the latter far more appealing.

This manual describes the import and export facilities available either in R itself or via packages which are available from CRAN or elsewhere.

Unless otherwise stated, everything described in this manual is (at least in principle) available on all platforms running R.

In general, statistical systems like R are not particularly well suited to manipulations of large-scale data.

Some other systems are better than R at this, and part of the thrust of this manual is to suggest that rather than duplicating functionality in R we can make another system do the work! (For example Therneau & Grambsch (2000) commented that they preferred to do data manipulation in SAS and then use package survival in S for the analysis.) Database manipulation systems are often very suitable for manipulating and extracting data: several packages to interact with DBMSs are discussed here.

There are packages to allow functionality developed in languages such as Java, perl and python to be directly integrated with R code, making the use of facilities in these languages even more appropriate. (See the rJava package from CRAN.)

It is also worth remembering that R like S comes from the Unix tradition of small re-usable tools, and it can be rewarding to use tools such as awk and perl to manipulate data before import or after export.

The case study in Becker, Chambers & Wilks (1988, Chapter 9) is an example of this, where Unix tools were used to check and manipulate the data before input to S. The traditional Unix tools are now much more widely available, including for Windows.

This manual was first written in 2000, and the number and scope of R packages has increased a hundredfold since. For specialist data formats it is worth searching to see if a suitable package already exists.

Imports

The easiest form of data to import into R is a simple text file, and this will often be acceptable for problems of small or medium scale. The primary function to import from a text file is scan, and this underlies most of the more convenient functions discussed in Chapter 2 [Spreadsheet-like data].

However, all statistical consultants are familiar with being presented by a client with a memory stick (formerly, a floppy disc or CD-R) of data in some proprietary binary format, for example an Excel spreadsheet or an SPSS file.
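As a minimal sketch of the text-file import described above, the following reads the same small file first with scan and then with read.table, one of the more convenient functions built on it. The file contents and column names are invented for illustration.

```r
# Write a small whitespace-separated text file to a temporary location.
# The column names and values are hypothetical.
tmp <- tempfile(fileext = ".txt")
writeLines(c("id height weight",
             "1 1.72 68.5",
             "2 1.65 59.0",
             "3 1.80 77.2"), tmp)

# scan() reads the fields into a list of vectors: 'what' gives the type
# of each field and 'skip = 1' skips the header line.
fields <- scan(tmp, what = list(id = 0L, height = 0, weight = 0),
               skip = 1, quiet = TRUE)

# read.table() returns a data frame directly and can use the header line.
df <- read.table(tmp, header = TRUE)

unlink(tmp)
```

With scan the caller states the layout of each record explicitly, whereas read.table infers the column types, which is usually more convenient for rectangular data.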

Often the simplest thing to do is to use the originating application to export the data as a text file (and statistical consultants will have copies of the most common applications on their computers for that purpose). However, this is not always possible, and Chapter 3 [Importing from other statistical systems] discusses what facilities are available to access such files directly from R. For Excel spreadsheets, the available methods are summarized in Chapter 9 [Reading Excel spreadsheets].

In a few cases, data have been stored in a binary form for compactness and speed of access. One application of this that we have seen several times is imaging data, which is normally stored as a stream of bytes as represented in memory, possibly preceded by a header.
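A header followed by a stream of bytes can be read with readBin. The sketch below first writes such a file so that it is self-contained. The layout (two 4-byte little-endian integers giving rows and columns, then one unsigned byte per pixel in row-major order) is an assumption made for illustration, not a real image format.

```r
# Hypothetical layout, assumed for illustration only: a header of two
# 4-byte little-endian integers (rows, columns) followed by one unsigned
# byte per pixel.
tmp <- tempfile(fileext = ".bin")
con <- file(tmp, "wb")
writeBin(c(2L, 3L), con, size = 4, endian = "little")   # header
writeBin(as.raw(c(0, 50, 100, 150, 200, 255)), con)     # pixel bytes
close(con)

# Read the header first, then as many bytes as the header announces.
con <- file(tmp, "rb")
dims <- readBin(con, what = "integer", n = 2, size = 4, endian = "little")
pixels <- readBin(con, what = "integer", n = prod(dims), size = 1,
                  signed = FALSE)
close(con)

# Arrange the byte stream as a matrix, assuming row-major storage.
img <- matrix(pixels, nrow = dims[1], ncol = dims[2], byrow = TRUE)
unlink(tmp)
```

For a real format the header size, byte order and pixel width would have to be taken from that format's specification.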

Such data formats are discussed in Chapter 5 [Binary files] and in the section [Binary connections].

For much larger databases it is common to handle the data using a database management system (DBMS). There is once again the option of using the DBMS to extract a plain file, but for many such DBMSs the extraction operation can be done directly from an R package: see Chapter 4 [Relational databases]. Importing data via network connections is discussed in Chapter 8 [Network interfaces].

Encodings

Unless the file to be imported from is entirely in ASCII, it is usually necessary to know how it was encoded. For a text file, a good way to find out something about its structure is the file command-line tool (for Windows, included in Rtools).

This reports descriptions such as

    UTF-8 Unicode English text
    ISO-8859 English text
    Little-endian UTF-16 Unicode English character data, with CRLF line terminators
    UTF-8 Unicode text
    UTF-8 Unicode (with BOM) text

Modern Unix-alike systems, including macOS, are likely to produce UTF-8 files. Windows may produce what it calls 'Unicode' files (UCS-2LE or just possibly UTF-16LE). Otherwise most files will be in an 8-bit encoding unless from a Chinese/Japanese/Korean locale (which have a wide range of encodings in common use). It is not possible to automatically detect with certainty which 8-bit encoding is in use (although guesses may be possible and file may guess as it did in the example above), so you may simply have to ask the originator for some clues (e.g. 'Russian on Windows').

'BOMs' (Byte Order Marks) cause problems for Unicode files.
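Once the encoding is known, the bytes can be converted explicitly. The following is a minimal sketch, assuming a file known to be in Latin-1 (ISO-8859-1). It builds a tiny example file from raw bytes so that the result does not depend on the current locale, converts the content with iconv, and checks the leading bytes for a UTF-8 BOM.

```r
# Write the word "caf" plus e-acute as raw Latin-1 bytes (0xE9 is e-acute
# in ISO-8859-1), followed by a newline. The content is invented.
tmp <- tempfile(fileext = ".txt")
writeBin(as.raw(c(0x63, 0x61, 0x66, 0xE9, 0x0A)), tmp)

# Read the file back as raw bytes, much as one would inspect it with a
# hex editor.
bytes <- readBin(tmp, what = "raw", n = file.size(tmp))

# Drop the trailing newline and convert from the known source encoding.
y <- iconv(rawToChar(bytes[-length(bytes)]), from = "latin1", to = "UTF-8")

# A UTF-8 BOM, if present, would be the three leading bytes EF BB BF.
has_utf8_bom <- length(bytes) >= 3 &&
  identical(bytes[1:3], as.raw(c(0xEF, 0xBB, 0xBF)))

unlink(tmp)
```

In everyday use it is usually simpler to declare the encoding when reading, for example through the encoding argument of file or the fileEncoding argument of read.table. The raw-byte route is shown here only because it makes each step visible.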

In the Unix world BOMs are rarely used, whereas in the Windows world they almost always are for UCS-2/UTF-16 files, and often are for UTF-8 files. The file utility will not even recognize UCS-2 files without a BOM, but many other utilities will refuse to read files with a BOM and the IANA standards for UTF-16LE and UTF-16BE prohibit it. We have too often been reduced to looking at the file with the command-line utility od or a hex editor to work out its encoding.

Note that utf8 is not a valid encoding name (UTF-8 is), and macintosh is the most portable name for what is sometimes called 'Mac Roman' encoding.

Export to text files

Exporting results from R is usually a less contentious task, but there are still a number of pitfalls. There will be a target application in mind, and often a text file will be the most convenient interchange vehicle.
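As a minimal sketch of exporting to a text file, the following writes a small data frame as comma-separated text with write.table and reads it back as a check. The data and the choices of separator and missing-value representation are assumptions about a hypothetical target application.

```r
# A small data frame with one missing value; the contents are invented.
df <- data.frame(id = 1:3, group = c("a", "b", "c"),
                 score = c(1.5, NA, 3.25))
tmp <- tempfile(fileext = ".csv")

# Comma-separated output with a header row, no row names, quoted strings,
# and missing values written as empty fields.
write.table(df, tmp, sep = ",", row.names = FALSE, col.names = TRUE,
            na = "", quote = TRUE)

lines_out <- readLines(tmp)   # the text as the target application sees it
back <- read.csv(tmp)         # round-trip check
unlink(tmp)
```

Which separator, quoting rule and missing-value string to use depends on what the target application expects, so these arguments are the ones to adjust first.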

