For this workshop we will be discussing the dplyr and tidyr packages in R.
They are designed to help you to restructure, rearrange, aggregate or transform your data.
There are very nice online tutorials to help learn these packages further:
Monday, February 29, 2016
#install.packages(c("dplyr","tidyr"))
library(dplyr)
library(tidyr)

baseball <- read.csv("http://kmaurer.github.io/documents/baseball.csv")
head(baseball)
##          id year stint team lg  g  ab  r  h X2b X3b hr rbi sb cs bb so ibb
## 1 ansonca01 1871     1  RC1    25 120 29 39  11   3  0  16  6  2  2  1  NA
## 2 forceda01 1871     1  WS3    32 162 45 45   9   4  0  29  8  0  4  0  NA
## 3 mathebo01 1871     1  FW1    19  89 15 24   3   1  0  10  2  1  2  0  NA
## 4 startjo01 1871     1  NY2    33 161 35 58   5   1  1  34  4  2  3  0  NA
## 5 suttoez01 1871     1  CL1    29 128 35 45   3   7  3  23  3  1  1  0  NA
## 6 whitede01 1871     1  CL1    29 146 40 47   6   5  1  21  2  2  4  1  NA
##   hbp sh sf gidp
## 1  NA NA NA   NA
## 2  NA NA NA   NA
## 3  NA NA NA   NA
## 4  NA NA NA   NA
## 5  NA NA NA   NA
## 6  NA NA NA   NA
Leap into RStudio to build script from scratch!
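As a rough sketch of the kind of script one might build here, the following applies a few dplyr verbs to the six rows printed by head(baseball). The rows are re-typed by hand (keeping only some columns) so the example runs without the download. The object names (baseball_head, with_avg, regulars, team_totals), the batting-average column, and the 25-game cutoff are illustrative assumptions, not part of the workshop script.

```r
library(dplyr)

# The six rows shown by head(baseball), re-typed by hand with a subset of
# the columns, so no download is needed (hypothetical helper object).
baseball_head <- data.frame(
  id   = c("ansonca01", "forceda01", "mathebo01",
           "startjo01", "suttoez01", "whitede01"),
  year = 1871,
  team = c("RC1", "WS3", "FW1", "NY2", "CL1", "CL1"),
  g    = c(25, 32, 19, 33, 29, 29),
  ab   = c(120, 162, 89, 161, 128, 146),
  h    = c(39, 45, 24, 58, 45, 47),
  hr   = c(0, 0, 0, 1, 3, 1),
  stringsAsFactors = FALSE
)

# Transform: add a batting-average column (hits per at-bat).
with_avg <- baseball_head %>%
  mutate(avg = h / ab)

# Subset and rearrange: keep players with at least 25 games (an arbitrary
# cutoff chosen for illustration), highest average first.
regulars <- with_avg %>%
  filter(g >= 25) %>%
  arrange(desc(avg)) %>%
  select(id, team, g, avg)

# Aggregate: one row per team with a player count and total home runs.
team_totals <- baseball_head %>%
  group_by(team) %>%
  summarise(players = n(), total_hr = sum(hr))

print(regulars)
print(team_totals)
```

The same pipeline should work on the full baseball data frame, since it only relies on the columns id, team, g, ab, h and hr that appear in the printed header.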
data(french_fries, package="reshape2")
head(french_fries)
##    time treatment subject rep potato buttery grassy rancid painty
## 61    1         1       3   1    2.9     0.0    0.0    0.0    5.5
## 25    1         1       3   2   14.0     0.0    0.0    1.1    0.0
## 62    1         1      10   1   11.0     6.4    0.0    0.0    0.0
## 26    1         1      10   2    9.9     5.9    2.9    2.2    0.0
## 63    1         1      15   1    1.2     0.1    0.0    1.1    5.1
## 27    1         1      15   2    8.8     3.0    3.6    1.5    2.3
Leap back over to RStudio to continue this example!
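A minimal sketch of how tidyr could restructure this data, using only the six rows printed by head(french_fries), re-typed by hand so that reshape2 is not required. It assumes tidyr 1.0.0 or later for pivot_longer() and pivot_wider() (older scripts would use gather() and spread() instead). The identifier columns are stored as plain numbers here for simplicity, and the names ff_head, ff_long, scale_means and ff_reps are hypothetical.

```r
library(dplyr)
library(tidyr)

# The six rows shown by head(french_fries), re-typed by hand
# (hypothetical helper object; ID columns kept as plain numbers).
ff_head <- data.frame(
  time      = 1,
  treatment = 1,
  subject   = c(3, 3, 10, 10, 15, 15),
  rep       = c(1, 2, 1, 2, 1, 2),
  potato    = c(2.9, 14.0, 11.0, 9.9, 1.2, 8.8),
  buttery   = c(0.0, 0.0, 6.4, 5.9, 0.1, 3.0),
  grassy    = c(0.0, 0.0, 0.0, 2.9, 0.0, 3.6),
  rancid    = c(0.0, 1.1, 0.0, 2.2, 1.1, 1.5),
  painty    = c(5.5, 0.0, 0.0, 0.0, 5.1, 2.3)
)

# Restructure wide -> long: one row per subject, replicate and flavor scale.
ff_long <- ff_head %>%
  pivot_longer(cols = potato:painty,
               names_to = "scale", values_to = "score")

# Aggregate the long form: mean score for each flavor scale.
scale_means <- ff_long %>%
  group_by(scale) %>%
  summarise(mean_score = mean(score))

# Restructure long -> wide: put the two replicates side by side
# in columns rep_1 and rep_2.
ff_reps <- ff_long %>%
  pivot_wider(names_from = rep, values_from = score,
              names_prefix = "rep_")

print(ff_long)
print(scale_means)
print(ff_reps)
```

The long form is usually the convenient shape for grouping and plotting, while the wide-by-replicate form makes it easy to compare rep_1 against rep_2 directly.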
ggplot2 is a visualization package in R that supports many plot types and structures.
It is built on the idea of "the grammar of graphics": a plot is assembled from data, aesthetic mappings, and geometric layers.
#install.packages(c("ggplot2"))
library(ggplot2)
The diamonds data set ships with ggplot2 and becomes available once the package is loaded.
head(diamonds)
##   carat       cut color clarity depth table price    x    y    z
## 1  0.23     Ideal     E     SI2  61.5    55   326 3.95 3.98 2.43
## 2  0.21   Premium     E     SI1  59.8    61   326 3.89 3.84 2.31
## 3  0.23      Good     E     VS1  56.9    65   327 4.05 4.07 2.31
## 4  0.29   Premium     I     VS2  62.4    58   334 4.20 4.23 2.63
## 5  0.31      Good     J     SI2  63.3    58   335 4.34 4.35 2.75
## 6  0.24 Very Good     J    VVS2  62.8    57   336 3.94 3.96 2.48
ggplot() + geom_point(aes(x=carat,y=price,color=cut), data=diamonds)
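One way to see the grammar-of-graphics idea at work is to build the same scatterplot in separate pieces: data and aesthetic mappings first, then a geometric layer, then optional facets and labels. The sketch below is one possible variation, not the workshop's own code. The transparency value, the choice to facet by clarity, and the axis labels are illustrative assumptions.

```r
library(ggplot2)

# Data and aesthetic mappings are declared once in ggplot(),
# so every layer added afterwards inherits them.
p <- ggplot(diamonds, aes(x = carat, y = price, color = cut)) +
  # Geometric layer: points, made semi-transparent to reduce overplotting.
  geom_point(alpha = 0.3) +
  # Facets: one panel per clarity grade.
  facet_wrap(~ clarity) +
  # Labels: readable axis and legend titles.
  labs(x = "Carat", y = "Price (US dollars)", color = "Cut")

print(p)
```

Because each piece is added with `+`, a layer can be swapped or removed without touching the rest of the plot specification.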
Back to RStudio once again for exploration!