Nathalie Villa-Vialaneix - http://www.nathalievialaneix.eu
September 14-16th, 2015
Master TIDE, Université Paris 1
For import/export operations, R works with a working directory
getwd()
[1] "/home/nathalie/Private/Travail/Enseignements/masterTIDE"
that can be changed using
setwd("~/Rlesson3") # not run
or using the menu Session/Set working directory in RStudio.
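A minimal sketch of switching the working directory and restoring it afterwards. The names old.wd and lesson.dir are illustrative, and the directory is created under R's temporary folder (an assumption made so that the example does not touch your home directory).

```r
# Remember the current working directory so it can be restored later
old.wd <- getwd()

# Hypothetical example directory, created inside R's temporary folder
lesson.dir <- file.path(tempdir(), "Rlesson3")
dir.create(lesson.dir, showWarnings = FALSE)

setwd(lesson.dir)   # relative paths are now resolved from lesson.dir
in.lesson.dir <- basename(getwd()) == "Rlesson3"

setwd(old.wd)       # go back to the initial working directory
restored <- identical(getwd(), old.wd)
```

Building paths with file.path() rather than pasting strings keeps the code portable across operating systems.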
Many kinds of data can be imported into R.
Files can be imported from a URL, for instance:
fileURL <- "http://www.nathalievialaneix.eu/doc/csv/ex-data-tide.csv"
dir.create("data")
download.file(fileURL, destfile="data/ex-data.csv")
list.files("data")
[1] "ex-data.csv"
ls()
[1] "fileURL"
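When a script is run several times, it can be convenient to skip the download if the file is already there. The sketch below uses a hypothetical helper, download_once (not part of R), and demonstrates it on a local file:// URL, which is an assumption made so that the example runs without network access.

```r
# Hypothetical helper: download `url` to `destfile` only when the file is absent
download_once <- function(url, destfile) {
  dir.create(dirname(destfile), showWarnings = FALSE, recursive = TRUE)
  if (!file.exists(destfile)) {
    download.file(url, destfile = destfile, quiet = TRUE)
    return(TRUE)    # a download took place
  }
  FALSE             # the file was already there
}

# Demonstration on a local file:// URL (made-up content, no network needed)
src <- tempfile(fileext = ".csv")
writeLines(c("annee;age", "2007;19"), src)
src.url <- paste0("file://", normalizePath(src, winslash = "/"))

dest <- file.path(tempdir(), "data-demo", "ex-data.csv")
unlink(dest)        # start from a clean state

first  <- download_once(src.url, dest)   # downloads the file
second <- download_once(src.url, dest)   # skips: the file exists
```

With the course file, the same helper would be called as download_once(fileURL, "data/ex-data.csv").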
df <- read.table("data/ex-data.csv", sep=";", header=TRUE)
summary(df[,1:3])
annee age nee.france
Min. :2007 Min. :17.00 Non:10
1st Qu.:2008 1st Qu.:18.00 Oui:89
Median :2009 Median :19.00
Mean :2010 Mean :19.05
3rd Qu.:2010 3rd Qu.:20.00
Max. :2013 Max. :26.00
sep: column separator character (default: white space)
header (TRUE/FALSE): are column names contained in the first line? (default: FALSE)
dec: decimal separator character (default: ".")
quote: quoting characters (default: " and ')
row.names: a number giving the column which contains the row names (default: the file contains no column with row names)
na.strings: strings to be interpreted as NA (default: "NA"; blank fields are also read as NA in numeric and logical columns)
stringsAsFactors: strings are imported as factors (TRUE, the default before R 4.0.0) or as characters (FALSE, the default since R 4.0.0)
read.csv (English standard format, comma separator) and read.csv2 (French standard format, semicolon separator) can be used to import CSV files:
df <- read.csv2("data/ex-data.csv", stringsAsFactors=FALSE)
summary(df[,4:5])
cp.naissance sexe
Min. : 6600 Length:99
1st Qu.:11000 Class :character
Median :33000 Mode :character
Mean :42042
3rd Qu.:69000
Max. :98714
NA's :14
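The effect of the sep, dec, header and na.strings options can be checked on a few lines of text. This is only an illustrative sketch: the records below are made up, and the text argument of read.table is used so that no file is needed.

```r
# Three made-up records in French CSV style: ";" separator, "," decimal mark
txt <- c("nom;taille;ville",
         "Anna;1,62;Paris",
         "Marc;;Toulouse",
         "Lea;1,75;inconnu")

demo <- read.table(text = txt, sep = ";", dec = ",", header = TRUE,
                   na.strings = c("", "inconnu"),   # strings read as NA
                   stringsAsFactors = FALSE)

str(demo)              # taille is numeric, nom and ville are character
colSums(is.na(demo))   # number of missing values per column
```

Setting stringsAsFactors explicitly makes the result independent of the R version in use.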
Files can be read as strings (and processed inside R).
cur.conn <- url(fileURL) # open connection
df2 <- readLines(cur.conn, n=3)
lapply(df2, substr, start=1, stop=15) # first 15 characters
[[1]]
[1] "\"annee\";\"age\";\""
[[2]]
[1] "2007;19;\"Oui\";7"
[[3]]
[1] "2007;19;\"Oui\";1"
close(cur.conn) # close connection
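Once lines are read as strings, they can be processed with the usual string functions. The sketch below assumes a small local file with made-up content (laid out like the file above) instead of the URL, and splits each line on the ";" separator.

```r
# A few raw lines written to a temporary file (made-up values)
tmp <- tempfile(fileext = ".csv")
writeLines(c("\"annee\";\"age\"", "2007;19", "2008;21"), tmp)

con <- file(tmp, open = "r")        # open a connection to a local file
raw.lines <- readLines(con)
close(con)                          # always close what was opened

fields <- strsplit(raw.lines, ";")          # one character vector per line
col.names <- gsub("\"", "", fields[[1]])    # drop quotes from the header
ages <- as.numeric(sapply(fields[-1], `[`, 2))   # second field of each data line
```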
write.table(df, file="data/export-data.txt")
write.csv2(df, file="data/export-data.csv",
row.names=FALSE)
with approximately the same options as read.table.
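A quick way to check an export is to read the file back and compare it with the original. The data frame below is made up for the illustration, and the file is written to a temporary location.

```r
# Small made-up data frame with a decimal column and a missing value
d <- data.frame(id = 1:3, score = c(1.5, 2.25, NA))

out <- tempfile(fileext = ".csv")
write.csv2(d, file = out, row.names = FALSE)   # ";" separator, "," decimals

readLines(out)            # look at the raw file content
d.back <- read.csv2(out)  # read it back with the matching function

all.equal(d, d.back)      # TRUE when the round trip preserved the data
```

Using read.csv2 to read what write.csv2 wrote keeps the separator and decimal mark consistent on both sides.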
If you want to save more complicated variables or several variables in a single file, you can use the RData format:
data(iris); ls()
[1] "cur.conn" "df" "df2" "fileURL" "iris"
save(df, iris, file="data/export-ws.rda")
RData files are loaded with:
rm(list=ls()); ls()
character(0)
load("data/export-ws.rda"); ls()
[1] "df" "iris"
load("data/export-ws.rda", verbose=TRUE)
Loading objects:
df
iris
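Because load() restores objects under their saved names, it silently overwrites objects with the same name in the workspace. One possible safeguard, sketched here with made-up objects, is to load the file into a separate environment first.

```r
x <- 1:5
f <- tempfile(fileext = ".rda")
save(x, file = f)

x <- "modified"               # the object changes after being saved

e <- new.env()
loaded <- load(f, envir = e)  # load() returns the names of restored objects

loaded   # names of the objects found in the file
e$x      # the saved version, available inside the environment e
x        # the workspace version is left untouched
```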
xlsx package
XML package
jsonlite and RJSON packages
RMySQL package
rhdf5 (HDF5) and rhdfs (hadoop) packages
Using the file at http://www.nathalievialaneix.eu/doc/csv/co2.csv, import the corresponding data and answer the following questions:
What is the dimension of the data?
What are the variables included in the data? What are their types?
Make a contingency table of the variables Type and Treatment.
What is the median uptake value for each plant Type?
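The functions useful for these questions can be sketched on R's built-in CO2 dataset. This is an assumption: the built-in data is used as a stand-in, and the downloaded file may differ in content, separator or column names, so the import step itself is left to the reader.

```r
# ASSUMPTION: the built-in CO2 dataset stands in for the downloaded file
co2 <- as.data.frame(datasets::CO2)

dim(co2)                               # number of rows and columns
sapply(co2, function(v) class(v)[1])   # type of each variable
table(co2$Type, co2$Treatment)         # contingency table
tapply(co2$uptake, co2$Type, median)   # median uptake per plant Type
```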