Nathalie Villa-Vialaneix - http://www.nathalievialaneix.eu
September 14-16th, 2015
Master TIDE, Université Paris 1
For import/export operations, R works with a working directory
getwd()
[1] "/home/nathalie/Private/Travail/Enseignements/masterTIDE"
that can be changed using
setwd("~/Rlesson3") # not run
or using the menu Session/Set working directory in RStudio.
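As a minimal sketch of the same idea, the working directory can be changed temporarily and then restored. The target directory used here (the session's temporary directory) is only an assumption chosen so that the example runs anywhere.

```r
# Minimal sketch: switch the working directory temporarily, then restore it.
old.wd <- getwd()       # remember the current working directory
tmp.dir <- tempdir()    # illustrative target: the session's temporary directory
setwd(tmp.dir)          # change the working directory
new.wd <- getwd()       # now points to the temporary directory
setwd(old.wd)           # go back to the initial directory
```

Storing the initial directory first avoids leaving the session in an unexpected location after the import/export step.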
Kinds of data that can be downloaded in R:
You can import files from:
fileURL <- "http://www.nathalievialaneix.eu/doc/csv/ex-data-tide.csv"
dir.create("data")
download.file(fileURL, destfile="data/ex-data.csv")
list.files("data")
[1] "ex-data.csv"
ls()
[1] "fileURL"
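Running the download commands a second time warns that the directory already exists and fetches the file again. A possible way to avoid this is sketched below. The helper download.once is hypothetical (it is not part of R or of this lesson) and only wraps dir.exists, file.exists and download.file.

```r
# Hypothetical helper (not part of the lesson): download a file only once.
download.once <- function(url, destfile) {
  dest.dir <- dirname(destfile)
  # create the destination directory only if it is missing
  if (!dir.exists(dest.dir)) dir.create(dest.dir, recursive=TRUE)
  # skip the download when the file is already there
  if (file.exists(destfile)) return(invisible(FALSE))
  download.file(url, destfile=destfile)
  invisible(TRUE)
}
# download.once(fileURL, "data/ex-data.csv") # not run
```

The function returns TRUE when a download was attempted and FALSE when the existing file was kept.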
df <- read.table("data/ex-data.csv", sep=";", header=TRUE)
summary(df[,1:3])
annee age nee.france
Min. :2007 Min. :17.00 Non:10
1st Qu.:2008 1st Qu.:18.00 Oui:89
Median :2009 Median :19.00
Mean :2010 Mean :19.05
3rd Qu.:2010 3rd Qu.:20.00
Max. :2013 Max. :26.00
Main arguments of read.table:
sep: column separator character (default: white space)
header (TRUE/FALSE): are column names contained in the first line? (default: FALSE)
dec: decimal separator character (default: a point; read.csv2 uses a comma)
quote: quoting characters (default: " and ')
row.names: a number giving the column which contains the row names (default: the file contains no column with row names)
na.strings: strings to be interpreted as NA (default: "NA"; blank fields are also treated as missing in numeric and logical columns)
stringsAsFactors: strings are imported as factors (TRUE, the default before R 4.0.0) or as characters (FALSE, the default since R 4.0.0)
read.csv (English standard format, comma separator) and read.csv2 (French standard format, semicolon separator) can be used to import CSV files:
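The effect of these arguments can be checked on a small example. The sketch below reads invented toy data (not the lesson's file) from a character vector through the text argument of read.table, with a semicolon separator, comma decimals and "?" as the missing value marker.

```r
# Toy data invented for illustration: semicolon separator, comma decimals,
# and "?" marking a missing value.
toy.lines <- c("name;height;city",
               "a;1,75;Paris",
               "b;?;Toulouse",
               "c;1,60;Paris")
toy <- read.table(text=toy.lines, sep=";", dec=",", header=TRUE,
                  na.strings="?", stringsAsFactors=FALSE)
str(toy)  # height is numeric with one NA, name and city are characters
```

Without dec="," the height column would be imported as strings, since "1,75" is not a valid number with a point as decimal separator.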
df <- read.csv2("data/ex-data.csv", stringsAsFactors=FALSE)
summary(df[,4:5])
cp.naissance sexe
Min. : 6600 Length:99
1st Qu.:11000 Class :character
Median :33000 Mode :character
Mean :42042
3rd Qu.:69000
Max. :98714
NA's :14
Files can be read as strings (and processed inside R).
cur.conn <- url(fileURL) # create the connection
df2 <- readLines(cur.conn, n=3)
lapply(df2, substr, start=1, stop=15) # first 15 characters
[[1]]
[1] "\"annee\";\"age\";\""
[[2]]
[1] "2007;19;\"Oui\";7"
[[3]]
[1] "2007;19;\"Oui\";1"
close(cur.conn) # close the connection
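Once the lines are available as strings, they can be processed with the usual string functions. The sketch below is one possible approach, using a text connection and invented lines that only mimic the layout of the file, so that it runs without network access.

```r
# Toy lines mimicking the layout of the file (the values are invented).
cur.conn <- textConnection(c("\"annee\";\"age\"", "2007;19", "2008;20"))
raw.lines <- readLines(cur.conn)
close(cur.conn)
# split each line on the separator: one character vector per line
fields <- strsplit(raw.lines, split=";", fixed=TRUE)
header <- gsub("\"", "", fields[[1]])            # drop the quoting characters
ages <- as.numeric(sapply(fields[-1], "[", 2))   # second field of each data line
```

For regular tabular files read.table remains simpler; this kind of manual processing is mostly useful for irregular files.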
write.table(df, file="data/export-data.txt")
write.csv2(df, file="data/export-data.csv",
row.names=FALSE)
write.table and write.csv2 take approximately the same options as read.table.
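A quick way to check an export is to read the file back and compare it with the original data. The sketch below does this on a small invented data frame written to a temporary file (both are assumptions made for illustration).

```r
# Toy data frame invented for illustration.
toy <- data.frame(id=1:3, value=c(1.5, NA, 2.25), label=c("a", "b", "c"),
                  stringsAsFactors=FALSE)
out.file <- tempfile(fileext=".csv")             # temporary export file
write.csv2(toy, file=out.file, row.names=FALSE)  # semicolon separator, comma decimals
back <- read.csv2(out.file, stringsAsFactors=FALSE)
all.equal(toy, back)                             # compare original and re-imported data
```

Using the matching pair write.csv2 / read.csv2 keeps the separator and decimal conventions consistent between export and import.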
If you want to save more complicated variables or several variables in a single file, you can use the RData format:
data(iris); ls()
[1] "cur.conn" "df" "df2" "fileURL" "iris"
save(df, iris, file="data/export-ws.rda")
RData files are loaded with:
rm(list=ls()); ls()
character(0)
load("data/export-ws.rda"); ls()
[1] "df" "iris"
load("data/export-ws.rda", verbose=TRUE)
Loading objects:
df
iris
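Note that load silently overwrites objects that have the same name in the workspace. One possible precaution, sketched below with invented objects and a temporary file, is to load the file into a separate environment through the envir argument.

```r
# Toy objects invented for illustration.
x <- 1:5
y <- "some text"
rda.file <- tempfile(fileext=".rda")
save(x, y, file=rda.file)
x <- "modified"                           # x changes after being saved
sandbox <- new.env()                      # separate environment for the loaded objects
loaded <- load(rda.file, envir=sandbox)   # load() returns the names of the restored objects
ls(sandbox)
sandbox$x                                 # saved version, the workspace x is untouched
```

The vector returned by load is also a convenient way to find out what a file contains.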
Other formats can be handled with dedicated packages:
xlsx package
XML package
jsonlite and RJSON packages
RMySQL package
rhdf5 (HDF5) and rhdfs (hadoop) packages
Using the file at http://www.nathalievialaneix.eu/doc/csv/co2.csv, import the corresponding data and answer the following questions:
What is the dimension of the data?
What are the variables included in the data? What are their types?
Make a contingency table of the variables Type and Treatment.
What is the median uptake value for each plant Type?