R tutorials, read write excel sheet, tables, text, csv files

Biostatistics with R

Read and write table data in R

The data used for statistical analysis is generally arranged in a tabular format and saved as an Excel sheet (.xls file extension), a csv file (comma separated) or a text file (tab or space separated). These files can be read into R as data frames using library functions.

Reading an excel table into R

Consider an Excel file called "model_data.xls" containing table data.

We can read this into R as a data frame using the read.xls() function from the gdata library.

To use this function, first install the gdata library from the R prompt with the command,
> install.packages("gdata")
This is a one-time installation.

A call to the read.xls() function with the file name as parameter reads the specified sheet of the Excel file and returns the whole data as a data frame in R. See the code below:



## Reading data from excel sheet.

library(gdata)

dat = read.xls(xls="model_data.xls", sheet=1, blank.lines.skip=TRUE)

print(names(dat))

dat$sample1

In the above call, sheet=1 refers to the first sheet, and blank.lines.skip=TRUE skips empty rows in the data.

After this, 'dat' is treated like any other data frame in R. See help(read.xls) to learn about more features of this function.
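
As a hedged sketch of the same call in a script that may run on a machine without the package or the file, the read can be guarded and the result inspected with str() and head(). It assumes that "model_data.xls" sits in the working directory and that gdata, together with the Perl interpreter that read.xls() relies on, is installed. The guard and the inspection calls are illustrative additions, not part of the original example.

```r
# Sketch: guarded read of the Excel file, followed by a quick inspection.
# Assumption: gdata (and Perl, which read.xls() needs) is installed and
# "model_data.xls" is in the working directory.
dat <- NULL
if (requireNamespace("gdata", quietly = TRUE) && file.exists("model_data.xls")) {
  dat <- gdata::read.xls(xls = "model_data.xls", sheet = 1)
  str(dat)             # column names and types
  print(head(dat, 3))  # first three rows
} else {
  message("gdata or model_data.xls not available; skipping the read")
}
```

If the guard fails, dat stays NULL, so later code can test is.null(dat) before using it.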

Reading table data from a csv file

For reading data from a file with comma separated data, we use the read.csv() function in R.

We will now read the following table data stored in a csv file called "model_data.csv" using "read.csv()" function.


index,name,sample1,sample2,sample3,sample4,sample5
1,XXA-1,22.5,55.6,11.7,33.5,112.2
2,SSA-3,29.6,57.8,12.5,32.6,113.4
3,TTR-2,19.2,49.3,14.3,30.2,145.3
4,UUY-4,13.5,39.6,10.5,31.5,122.4
5,PEW-2,20.4,39.7,13.5,33.6,154.6
6,AYE-4,25.3,63.5,16.4,35.7,123.4
7,PPR-3,18.7,42.5,17.4,37.4,345.3
8,QSP-5,22.2,83.4,12.4,39.3,453.3
9,UGT-2,19.9,24.6,11.6,29.8,234.4
10,RRS-7,28.9,77.5,10.2,33.6,199.9

The code below shows how to read the above file into a data frame:


# Reading data from csv file
dat = read.csv(file="model_data.csv", header=TRUE)

print(names(dat))

dat$sample1

mean(dat$sample1)

The option header=TRUE tells the function that the first row of the data should be treated as column names.
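
The effect of header=TRUE can be seen in a small self-contained sketch. It writes a three-row excerpt of the table above to a temporary file (the temporary file and the variable names are assumptions made for illustration) and reads it twice, once with and once without the header option.

```r
# Write a three-row excerpt of the table to a temporary csv file
csv_file <- tempfile(fileext = ".csv")
writeLines(c(
  "index,name,sample1,sample2",
  "1,XXA-1,22.5,55.6",
  "2,SSA-3,29.6,57.8",
  "3,TTR-2,19.2,49.3"
), csv_file)

# header=TRUE: the first line supplies the column names
with_header <- read.csv(file = csv_file, header = TRUE)
# header=FALSE: the first line is read as data, columns are named V1, V2, ...
no_header <- read.csv(file = csv_file, header = FALSE)

print(names(with_header))   # index, name, sample1, sample2
print(nrow(with_header))    # 3 data rows
print(names(no_header))     # V1, V2, V3, V4
print(nrow(no_header))      # 4 rows, the header line is counted as data
print(mean(with_header$sample1))
```

With header=FALSE the header text ends up inside the columns, so the sample columns are no longer numeric and mean() cannot be applied to them directly.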

Reading data table from a tab or space separated text file

The data from a tab or space separated text file can be read into a data frame using the read.table() function of R. We will now read the following space separated data in a file called "model_data.txt":


index    name    sample1    sample2    sample3    sample4    sample5
1    XXA-1    22.5    55.6    11.7    33.5    112.2
2    SSA-3    29.6    57.8    12.5    32.6    113.4
3    TTR-2    19.2    49.3    14.3    30.2    145.3
4    UUY-4    13.5    39.6    10.5    31.5    122.4
5    PEW-2    20.4    39.7    13.5    33.6    154.6
6    AYE-4    25.3    63.5    16.4    35.7    123.4
7    PPR-3    18.7    42.5    17.4    37.4    345.3
8    QSP-5    22.2    83.4    12.4    39.3    453.3
9    UGT-2    19.9    24.6    11.6    29.8    234.4
10   RRS-7    28.9    77.5    10.2    33.6    199.9

The code below is self explanatory. By default, read.table() treats any run of white space (spaces or tabs) as the separator, so the sep parameter is not needed here. For a strictly tab separated file, we can pass the additional parameter sep="\t".


## Reading data from a space or tab separated text file
dat = read.table(file="model_data.txt", header=TRUE)

print(names(dat))

mean(dat$sample1)
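
The following sketch compares the default white space separator with an explicit sep="\t" on a tab separated file. The temporary file and the variable names are assumptions made for illustration; the rows are an excerpt of the table above.

```r
# Write a small tab separated excerpt of the table to a temporary file
tab_file <- tempfile(fileext = ".txt")
writeLines(c(
  "index\tname\tsample1",
  "1\tXXA-1\t22.5",
  "2\tSSA-3\t29.6",
  "3\tTTR-2\t19.2"
), tab_file)

# Default separator: any run of spaces or tabs
by_whitespace <- read.table(file = tab_file, header = TRUE)
# Explicit separator: exactly one tab between fields
by_tab <- read.table(file = tab_file, header = TRUE, sep = "\t")

# Both calls give the same data frame for this file
print(all.equal(by_whitespace, by_tab))
print(mean(by_tab$sample1))
```

The explicit sep="\t" matters when a field itself contains spaces, since the default would split such a field into several columns.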

Writing a data frame into a text file

To write a data frame in R into a text file, we use the write.table() function. We specify the separator explicitly using the sep argument, and the argument row.names=FALSE avoids row numbering in the written text file.
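
As a minimal sketch of what row.names=FALSE changes, the snippet below writes a small data frame twice into temporary files and prints the resulting lines. The data frame, the temporary files and the use of quote=FALSE (which drops the double quotes around text values) are assumptions made for illustration.

```r
# A small illustrative data frame
small <- data.frame(
  names = c("aaa", "bbb", "ccc"),
  col1 = c(1121, 1344, 2354),
  col2 = c(122.5, 324.6, 456.3)
)

file_with_rows <- tempfile(fileext = ".txt")
file_without_rows <- tempfile(fileext = ".txt")

# Default behaviour: row names (1, 2, 3) are written as an extra first field
write.table(small, file = file_with_rows, sep = "\t", quote = FALSE)
# row.names=FALSE: only the columns of the data frame are written
write.table(small, file = file_without_rows, sep = "\t",
            row.names = FALSE, quote = FALSE)

with_rows <- readLines(file_with_rows)
without_rows <- readLines(file_without_rows)
print(with_rows)
print(without_rows)
```

In the first file each data line starts with the row number while the header line has no entry for it, which is why row.names=FALSE is usually the safer choice for files that other programs will read.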

We will create a data frame and write it into a file. Code is here:



## Writing a data frame into text file
names = c("aaa","bbb","ccc","ddd","eee","fff")
col1 = c(1121, 1344, 2354, 4676, 2256, 4534)
col2 = c(122.5, 324.6, 456.3, 457.3, 845.1, 325.8)
col3 = c(45, 56, 67, 75, 34, 89)

datframe = data.frame(names, col1, col2, col3)

## writes the data frame into a tab separated text file, without row numbering
write.table(datframe, file="model_table.txt", sep="\t", row.names=FALSE)

Now if we open the file "model_table.txt", it will contain the following data:


"names"	"col1"	"col2"	"col3"
"aaa"	1121	122.5	45
"bbb"	1344	324.6	56
"ccc"	2354	456.3	67
"ddd"	4676	457.3	75
"eee"	2256	845.1	34
"fff"	4534	325.8	89