**Getting started with R**

**R** is a flexible and powerful open-source language with extensive statistical and graphing capabilities. Its syntax is compact and reasonably easy to pick up. The large and fast-growing community around R has contributed to its value both as a programming language and as a data analysis environment.

**1) Install R and RStudio (IDE)**

**2) Install packages:**

`install.packages("package name")`
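A package has to be installed once and then loaded in every session that uses it. The sketch below illustrates that pattern; `ensure_package` is a hypothetical helper written for this example, not a base R function, and it is called here with `stats` (which ships with R) so that nothing is downloaded.

```r
# Hypothetical helper: install a package only when it is not already available,
# then report whether it can be loaded.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)   # downloads from the configured CRAN mirror
  }
  requireNamespace(pkg, quietly = TRUE)
}

available <- ensure_package("stats")  # "stats" ships with R, so no download happens
library(stats)                        # attach the package for the current session
print(available)
```

For a package that is not yet installed, the same call would trigger `install.packages()` and therefore needs a network connection.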

### 3) RStudio Overview

RStudio is one of the most popular R code editors, and it runs on Windows, macOS, and Linux. Its window is divided into four panes:

- Script pane – where you write and save your scripts
- Console pane – where the code is executed
- Environment/history pane – displays the variables and functions created within the current session, along with the command history
- Helper pane – contains multiple tabs to install and display packages, view plots, and locate files within the workspace

### 4) The Workspace

The workspace is your current R working environment and includes any user-defined objects (vectors, matrices, data frames, lists, functions).
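As a rough illustration of how the workspace can be inspected, the following sketch creates two objects, lists them, and removes one. The object names `x` and `square` are made up for this example.

```r
x <- c(1, 2, 3)                 # a user-defined vector
square <- function(v) v^2       # a user-defined function

print(ls())                     # names of the objects currently in the workspace
print(exists("square"))         # TRUE: the function is defined

rm(x)                           # remove one object from the workspace
print(exists("x"))              # FALSE: x is gone

print(getwd())                  # the working directory used for relative file paths
```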

### 5) Entering Commands

R is a command-line-driven program. The user enters commands at the prompt (**>** by default), and each command is executed one at a time.
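A few commands of the kind you might type at the prompt are sketched below. The variable names are arbitrary; the comments describe what the console does with each line.

```r
1 + 2            # an expression is evaluated and its value printed: [1] 3
total <- 5 * 4   # an assignment runs silently
total            # typing a name prints its value: [1] 20
a <- 2; b <- 3   # a semicolon separates two commands on one line
a * b            # [1] 6
```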

**6) Data Types in R**

A vector is an ordered sequence of values that all share the same mode (numeric, character, logical, and so on); even a single value is a vector of length one. A factor is a categorical variable. An array is a table with k dimensions, a matrix being a particular case of an array with k = 2. Note that the elements of an array or of a matrix are all of the same mode. A data frame is a table composed of one or several vectors and/or factors, all of the same length but possibly of different modes.
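The structures described above can be sketched in a few lines. All object names and values here are invented for illustration.

```r
v  <- c(1.5, 2.0, 3.5)                      # numeric vector
f  <- factor(c("low", "high", "low"))       # factor (categorical variable)
m  <- matrix(1:6, nrow = 2)                 # matrix: an array with 2 dimensions (2 x 3)
a  <- array(1:24, dim = c(2, 3, 4))         # array with k = 3 dimensions
df <- data.frame(id = 1:3, score = v, group = f)  # columns of equal length, different modes

str(df)          # compact summary of the data frame's columns
print(dim(m))    # dimensions of the matrix
print(dim(a))    # dimensions of the array
```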

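**7) Variable assignment (`<-` or `=`)**

`variable <- 10`

Extracting elements: `[` can be used to extract one or more elements from vectors, lists, or data frames, and it returns an object of the same kind as the original. `[[` extracts a single element, and `$` extracts a single element by name from a list or data frame.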
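The difference between the extraction operators is easiest to see side by side. The sketch below uses made-up objects (`scores`, `person`, `df`).

```r
scores <- c(a = 10, b = 20, c = 30)          # named numeric vector
print(scores[2:3])                            # `[` keeps the vector structure (and names)
print(scores[["b"]])                          # `[[` returns the single value 20

person <- list(name = "Ann", marks = c(90, 85))
print(person["marks"])                        # `[` on a list returns a (shorter) list
print(person[["marks"]])                      # `[[` returns the element itself
print(person$name)                            # `$` extracts one element by name

df <- data.frame(x = 1:3, y = c("p", "q", "r"))
print(df[df$x > 1, ])                         # `[` with a row condition returns a data frame
print(df$y)                                   # `$` returns one column as a vector
```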

**8) Getting Help**

Once R is installed, there is a comprehensive built-in help system. At the program’s command prompt you can use any of the following:

`help("data.frame")`

`?data.frame`

`?getwd`

`?"$"`
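When you do not know the exact name of a function, a couple of other base tools may help. The sketch below uses `apropos()` to find visible objects whose names match a pattern and `args()` to show a function's arguments; the search pattern is just an example.

```r
matches <- apropos("^read\\.")   # names of visible objects starting with "read."
print(matches)

print(args(read.csv))            # the argument list of a function, without opening its help page
```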

### 9) Books

## Important Packages

### To load data

RMySQL, RPostgreSQL, RSQLite – to read in data from a database.

XLConnect, xlsx – to read and write Microsoft Excel files from R.

foreign – to read a SAS/SPSS data set into R.

R can handle plain text files – no package required. Just use the functions read.csv, read.table, and read.fwf.
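As a small sketch of the plain-text route, the example below writes a tiny CSV to a temporary file and reads it back with `read.csv` and `read.table`. The file contents and column names are invented for illustration.

```r
# Create a small CSV file in a temporary location
tmp <- tempfile(fileext = ".csv")
writeLines(c("name,age", "Ann,31", "Bob,27"), tmp)

# read.csv assumes a header row and comma separators
people <- read.csv(tmp, stringsAsFactors = FALSE)
str(people)

# read.table is the general form; header and separator are stated explicitly
people2 <- read.table(tmp, header = TRUE, sep = ",")
print(mean(people2$age))

unlink(tmp)   # clean up the temporary file
```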

### To manipulate data

dplyr – The go-to package for fast data manipulation.

tidyr – Tools for changing the layout of your data sets.

stringr – Easy to learn tools for regular expressions and character strings.

lubridate – Tools that make working with dates and times easier.
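To give a feel for dplyr, here is a minimal sketch that filters, groups, and summarises the built-in `mtcars` data set. It assumes dplyr is already installed, so the code is wrapped in a check and does nothing otherwise; the result name `mpg_by_cyl` is arbitrary.

```r
if (requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)

  mpg_by_cyl <- mtcars %>%
    filter(hp > 100) %>%                          # keep cars with more than 100 horsepower
    group_by(cyl) %>%                             # one group per cylinder count
    summarise(mean_mpg = mean(mpg), n = n()) %>%  # average mpg and group size
    arrange(desc(mean_mpg))                       # highest average first

  print(mpg_by_cyl)
}
```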

### To visualize data

ggplot2 – R’s famous package for making beautiful graphics.

ggvis – Interactive, web-based graphics built with the grammar of graphics.

rgl – Interactive 3D visualizations with R

googleVis – Lets you use Google Chart tools to visualize data in R.
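A minimal ggplot2 sketch, assuming the package is installed, might look like the following. It builds a scatter plot from the built-in `mtcars` data; the axis labels and the object name `p` are choices made for this example.

```r
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # Map weight and fuel economy to the axes and cylinder count to colour
  p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
    geom_point() +
    labs(x = "Weight (1000 lbs)", y = "Miles per gallon", colour = "Cylinders")
}
```

Typing `p` at the console (or calling `print(p)`) draws the plot in the plots tab.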

### To model data

car – car’s Anova function is popular for making type II and type III Anova tables.

mgcv – Generalized Additive Models

lme4/nlme – Linear and nonlinear mixed-effects models

randomForest – Random forest methods from machine learning

multcomp – Tools for multiple comparison testing

vcd – Visualization tools and tests for categorical data

glmnet – Lasso and elastic-net regression methods with cross validation

survival – Tools for survival analysis

caret – Tools for training regression and classification models
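As one illustration from this list, the sketch below fits Kaplan-Meier survival curves with the survival package, using the `lung` data set that ships with it. It assumes survival is installed (it is usually bundled with R as a recommended package) and is skipped otherwise.

```r
if (requireNamespace("survival", quietly = TRUE)) {
  library(survival)

  # Kaplan-Meier estimate of survival time, one curve per sex
  fit <- survfit(Surv(time, status) ~ sex, data = lung)
  print(fit)
}
```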

### To report results

shiny – Easily make interactive web apps with R.

R Markdown – A workflow for reproducible reporting that combines text, code, and output in one document.

### For Spatial data

sp, maptools – Tools for loading and using spatial data including shapefiles.

maps – Easy to use map polygons for plots.

ggmap – Download street maps straight from Google Maps and use them as a background in your ggplots.

### For Time Series and Financial data

zoo – Provides the most popular format for saving time series objects in R.

xts – Very flexible tools for manipulating time series data sets.

quantmod – Tools for downloading financial data, plotting common charts, and doing technical analysis.

### To write high performance R code

Rcpp – Write R functions that call C++ code for lightning-fast speed.

data.table – An alternative way to organize data sets for very, very fast operations.

parallel – Use parallel processing in R to speed up your code or to crunch large data sets.

### To work with the web

XML – Read and create XML documents with R

jsonlite – Read and create JSON data tables with R

httr – A set of useful tools for working with HTTP connections
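For example, jsonlite can turn a JSON array of objects into a data frame and back. The sketch below assumes jsonlite is installed and uses a made-up JSON string.

```r
if (requireNamespace("jsonlite", quietly = TRUE)) {
  json_text <- '[{"city":"Oslo","temp":4},{"city":"Cairo","temp":29}]'

  weather <- jsonlite::fromJSON(json_text)   # array of objects becomes a data frame
  print(weather)

  round_trip <- jsonlite::toJSON(weather)    # data frame back to a JSON string
  print(round_trip)
}
```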

**Start your first data science case study in R here**
