R is a programming language and software environment that is widely used for statistical computing and graphics. It was developed by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand in the mid-1990s. R is free and open-source software, which means that anyone can download, use, and modify it.
R has become an essential tool for data analysis and statistical modeling, and is widely used in academia, government, and industry. Its popularity is due in part to its powerful capabilities for data visualization and manipulation, as well as its ability to interface with other programming languages and software.
To help you get started with R, we’ve put together a cheat sheet that covers many of the basic functions and packages you’ll need to know. The cheat sheet is divided into several sections: basics, data structures, data input and output, data manipulation, control structures, functions, graphics, statistics, machine learning, and other useful packages.
Cheat Sheet
Basics

| Function/Operator | Description |
| --- | --- |
| `#` | Start a comment; the rest of the line is ignored |
| `print(x)` or `cat(x)` | Print a value (`cat()` concatenates its arguments and prints them without quotes or index prefixes) |
| `c(x1, x2, ...)` | Combine elements into a vector |
| `length(x)` | Get the length of a vector |
| `seq(from, to, by)` | Generate a sequence of numbers |
| `rep(x, times)` | Repeat elements of a vector |
| `sort(x)` | Sort a vector in ascending order |
| `rev(x)` | Reverse a vector |
| `unique(x)` | Get unique elements of a vector |
| `which(x)` | Get the indices of the TRUE values in a logical vector |
| `sum(x)` | Sum the elements of a vector |
| `mean(x)` | Calculate the mean of a vector |
| `sd(x)` | Calculate the sample standard deviation of a vector |
| `var(x)` | Calculate the sample variance of a vector |
| `max(x)` | Get the maximum value of a vector |
| `min(x)` | Get the minimum value of a vector |
| `range(x)` | Get the minimum and maximum of a vector |
| `quantile(x, probs)` | Calculate quantiles of a vector |
| `sample(x, size, replace)` | Randomly sample from a vector |
| `runif(n, min, max)` | Generate n uniformly distributed random numbers between min and max |
| `set.seed(x)` | Set the random seed for reproducibility |

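
A few of the functions listed above can be tried on a small vector. This is only a minimal sketch; the vector `x` and its values are made up for illustration.

```r
# Build a small numeric vector and inspect it
x <- c(5, 3, 8, 3, 1)

length(x)        # 5
sort(x)          # 1 3 3 5 8
rev(x)           # 1 3 8 3 5
unique(x)        # 5 3 8 1
which(x > 3)     # 1 3  (positions of the TRUE values)

sum(x)           # 20
mean(x)          # 4
range(x)         # 1 8  (minimum and maximum)

# Sequences and repetition
seq(1, 10, by = 3)            # 1 4 7 10
rep(c("a", "b"), times = 2)   # "a" "b" "a" "b"

# Random draws are reproducible once the seed is fixed
set.seed(42)
first <- sample(x, size = 3, replace = FALSE)
set.seed(42)
second <- sample(x, size = 3, replace = FALSE)
identical(first, second)      # TRUE
```

Resetting the seed before each call to `sample()` is what makes the two draws match; without it, the second draw would generally differ.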
Data Structures
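
| Function/Operator | Description |
| --- | --- |
| `vector()` | Create an empty vector (a logical vector of length 0 by default) |
| `list()` | Create an empty list |
| `matrix(data, nrow, ncol)` | Create a matrix; `matrix(NA, nrow, ncol)` gives one filled with NA |
| `array(data, dim)` | Create an array; `array(NA, dim)` gives one filled with NA |
| `data.frame()` | Create an empty data frame |
| `cbind(x, y)` | Combine vectors, matrices or data frames by column |
| `rbind(x, y)` | Combine vectors, matrices or data frames by row |
| `names(x)` or `colnames(x)` or `rownames(x)` | Get or set the names of a vector, matrix or data frame |
| `dim(x)` | Get or set the dimensions of a matrix or array |
| `nrow(x)` | Get the number of rows of a matrix or data frame |
| `ncol(x)` | Get the number of columns of a matrix or data frame |
| `length(x)` | Get the length of a vector or list (for a data frame, the number of columns) |
| `str(x)` | Display the structure of an object |
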
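
The sketch below shows how some of these constructors and inspection functions behave together. The object names (`m`, `a`, `b`, `df`) and their values are illustrative assumptions, not part of the cheat sheet.

```r
# A matrix pre-filled with NA; the first argument of matrix() is the data
m <- matrix(NA, nrow = 2, ncol = 3)
dim(m)                  # 2 3

# cbind() and rbind() build matrices from vectors
a <- c(1, 2, 3)
b <- c(4, 5, 6)
by_col <- cbind(a, b)   # 3 rows, 2 columns
by_row <- rbind(a, b)   # 2 rows, 3 columns
colnames(by_col)        # "a" "b"

# A data frame is a list of equal-length columns
df <- data.frame(id = 1:3, score = c(90, 85, 70))
nrow(df)                # 3
ncol(df)                # 2
length(df)              # 2, because length() counts columns for a data frame
names(df)               # "id" "score"
str(df)                 # compact summary of the column types
```

Note that `matrix(2, 3)` would not create a 2-by-3 matrix: the first positional argument is the data, so naming `nrow` and `ncol` avoids the surprise.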
Data Input/Output
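
| Function/Operator | Description |
| --- | --- |
| `read.csv("file.csv")` | Read a CSV file into a data frame |
| `read.table("file.txt")` | Read a delimited text file (whitespace-delimited by default; use `sep = "\t"` or `read.delim()` for tab-delimited files) |
| `readLines("file.txt")` | Read the lines of a text file into a character vector |
| `write.csv(x, "file.csv")` | Write a data frame to a CSV file |
| `write.table(x, "file.txt")` | Write a matrix or data frame to a text file |
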
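
A round trip through a file is a quick way to see these functions in action. The sketch below writes to R's temporary directory; the data frame and the file names `example.csv` and `example.txt` are hypothetical.

```r
# Hypothetical example data; the files live in a temporary directory
df <- data.frame(id = 1:3, name = c("a", "b", "c"))
csv_path <- file.path(tempdir(), "example.csv")

# row.names = FALSE keeps the row numbers out of the file
write.csv(df, csv_path, row.names = FALSE)

# Read it back as a data frame
df_back <- read.csv(csv_path)

# readLines() returns the raw text, one element per line
raw <- readLines(csv_path)
length(raw)   # 4: one header line plus three data rows

# For a tab-delimited file, state the separator explicitly
tsv_path <- file.path(tempdir(), "example.txt")
write.table(df, tsv_path, sep = "\t", row.names = FALSE, quote = FALSE)
tsv_back <- read.table(tsv_path, sep = "\t", header = TRUE)
```

`read.table()` assumes no header line unless `header = TRUE` is given, whereas `read.csv()` expects one by default.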
Data Manipulation

| Function/Operator | Description |
| --- | --- |
| `subset(x, subset)` | Extract a subset of a data frame |
| `select(x, col1, col2, ...)` | Select columns of a data frame (dplyr) |
| `mutate(x, newcol = f(col))` | Create a new column in a data frame (dplyr) |
| `arrange(x, col1, col2, ...)` | Sort a data frame by column (dplyr) |
| `filter(x, condition)` | Keep the rows of a data frame that satisfy a condition (dplyr) |
| `group_by(x, col)` | Group a data frame by one or more columns (dplyr) |
| `summarize(x, newcol = f(col))` | Summarize a data frame by group (dplyr) |
| `join(x, y, by, type)` | Join two data frames (plyr) |
| `merge(x, y, by, all = FALSE)` | Merge two data frames (base R); `all`, `all.x` and `all.y` control which unmatched rows are kept |
| `reshape(x, idvar, timevar, direction)` | Reshape a data frame from wide to long or vice versa |
| `melt(x, id.vars, measure.vars)` | Reshape a data frame from wide to long (reshape2) |
| `dcast(x, formula)` | Reshape a data frame from long to wide (reshape2) |
| `tidyr::pivot_longer()` | Reshape a data frame from wide to long |
| `tidyr::pivot_wider()` | Reshape a data frame from long to wide |
| `aggregate(x, by, FUN)` | Compute summary statistics for each group |
| `tapply(x, INDEX, FUN)` | Apply a function to subsets of a vector defined by a grouping factor |
| `sapply(x, FUN)` | Apply a function to each element of a vector or list and simplify the result to a vector or matrix where possible |
| `lapply(x, FUN)` | Apply a function to each element of a vector or list and return a list |
| `mapply(FUN, ...)` | Apply a function to multiple vectors or lists in parallel |
| `%>%` | Pipe operator (magrittr) for chaining multiple functions |

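
The base R entries above need no extra packages, so they make a convenient first example. The following is a sketch on two made-up data frames (`sales` and `prices` are hypothetical); the dplyr, plyr, reshape2 and tidyr functions are deliberately left out so that it runs on a plain R installation.

```r
sales <- data.frame(
  region = c("north", "south", "north", "south"),
  units  = c(10, 20, 30, 40)
)
prices <- data.frame(
  region = c("north", "south"),
  price  = c(2, 3)
)

# Keep the rows that satisfy a condition
big <- subset(sales, units > 15)
nrow(big)   # 3

# merge() joins on the shared column; all = FALSE (the default) is an inner join
joined <- merge(sales, prices, by = "region")
joined$revenue <- joined$units * joined$price

# Summary statistic per group, returned as a data frame
totals <- aggregate(sales$units, by = list(region = sales$region), FUN = sum)

# The same grouping with tapply() returns a named vector
by_region <- tapply(sales$units, sales$region, sum)   # north 40, south 60

# lapply() always returns a list; sapply() simplifies to a vector when it can
squares_list <- lapply(1:3, function(i) i^2)
squares_vec  <- sapply(1:3, function(i) i^2)          # 1 4 9
```

Whether to use `aggregate()` or `tapply()` is mostly a question of the shape you want back: a data frame in the first case, a named vector in the second.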
Control Structures

| Function/Operator | Description |
| --- | --- |
| `if (condition) {expr1} else {expr2}` | If-else statement |
| `for (i in seq_along(x)) {expr}` | For loop over the indices of x |
| `while (condition) {expr}` | While loop |
| `repeat {expr}` | Repeat loop (runs until `break` is called) |
| `break` | Exit a loop |
| `next` | Skip to the next iteration of a loop |
| `ifelse(condition, expr1, expr2)` | Vectorized if-else, evaluated element by element |
| `switch(expr, case1, case2, ...)` | Switch statement |
| `tryCatch(expr, error = function(e) {expr2})` | Handle errors raised while evaluating expr |

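
The constructs above are easiest to compare side by side. The sketch below uses made-up values, and the helper `describe()` is a hypothetical function written only for this example.

```r
x <- c(4, -2, 7)

# for loop with next: skip negative values
total <- 0
for (i in seq_along(x)) {
  if (x[i] < 0) next
  total <- total + x[i]
}
total   # 11

# while loop: count the doublings needed to reach at least 100
n <- 1
steps <- 0
while (n < 100) {
  n <- n * 2
  steps <- steps + 1
}
steps   # 7

# ifelse() works element by element
signs <- ifelse(x > 0, "positive", "non-positive")

# switch() picks a branch by name
describe <- function(kind) {
  switch(kind,
         mean   = "arithmetic average",
         median = "middle value",
         "unknown")   # the unnamed last argument is the default
}

# tryCatch() turns an error into a value
result <- tryCatch(
  stop("something went wrong"),
  error = function(e) conditionMessage(e)
)
result   # "something went wrong"
```

`if` expects a single TRUE or FALSE, so `ifelse()` is the one to reach for when the condition is a whole vector.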
Functions

| Function/Operator | Description |
| --- | --- |
| `function(arg1, arg2, ...) {expr}` | Define a function |
| `return(x)` | Return a value from a function |
| `formals(fun)` | Get the formal arguments of a function |
| `body(fun)` | Get the body of a function |
| `environment(fun)` | Get the environment of a function |
| `source("file.R")` | Run the R code in a file (for example, a file of function definitions) |

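
As a small illustration of defining and inspecting a function, consider the sketch below. The function `scale_by()` and its `factor` argument are hypothetical names chosen for the example.

```r
# A function with a default argument
scale_by <- function(x, factor = 2) {
  return(x * factor)
}

scale_by(1:3)                # 2 4 6
scale_by(1:3, factor = 10)   # 10 20 30

names(formals(scale_by))     # "x" "factor"
body(scale_by)               # the code inside the braces
environment(scale_by)        # the environment where the function was defined
```

The explicit `return()` is optional here: an R function returns the value of its last evaluated expression.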
Graphics

| Function/Operator | Description |
| --- | --- |
| `plot(x, y)` | Create a scatter plot |
| `hist(x)` | Create a histogram |
| `barplot(x)` | Create a bar plot |
| `boxplot(x)` | Create a box plot |
| `pie(x)` | Create a pie chart |
| `plot(y ~ x)` | Create a scatter plot of y against x with formula syntax |
| `plot(x, type = "l")` | Create a line plot |
| `abline(a, b)` | Add a straight line with intercept a and slope b to a plot |
| `points(x, y)` | Add points to a plot |
| `lines(x, y)` | Add lines to a plot |
| `text(x, y, labels)` | Add text to a plot |
| `par(mfrow = c(nrows, ncols))` | Set up a multi-panel plot |
| `layout(matrix(1:nplots, ncol = ncols))` | Set up a multi-panel plot with custom arrangement |
| `plot.new()` | Start a new, empty plot frame on the current graphics device |
| `dev.off()` | Close the current graphics device |
| `plot(x, y, col = "red")` | Set the color of points in a plot |
| `plot(x, y, pch = 19)` | Set the shape of points in a plot |
| `plot(x, y, type = "l", lty = 2)` | Set the line type in a plot |
| `plot(x, y, type = "l", lwd = 2)` | Set the line width in a plot |
| `title(main = "Main Title", sub = "Subtitle")` | Add a title and subtitle to a plot |
| `title(xlab = "x-axis label")` | Add an x-axis label to a base plot (ggplot2 plots use `xlab()` instead) |
| `title(ylab = "y-axis label")` | Add a y-axis label to a base plot (ggplot2 plots use `ylab()` instead) |
| `axis(side, at, labels)` | Add an axis to a plot |
| `legend(x, y, legend, ...)` | Add a legend to a plot |
| `ggplot(data, aes(x, y)) + geom_*(...) + ...` | Create a plot using the ggplot2 package |

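
Several of the base graphics calls above are typically combined into one figure. The sketch below draws two panels into a temporary PDF so that it also runs in a non-interactive session; the file name `demo-plots.pdf` and the plotted values are assumptions made for the example, and `mtcars` is a data set that ships with R.

```r
# Write to a temporary PDF so the example also runs non-interactively
out <- file.path(tempdir(), "demo-plots.pdf")
pdf(out)

par(mfrow = c(1, 2))   # two panels side by side

x <- 1:10
y <- x^2

# Left panel: scatter plot with an added line, reference line and legend
plot(x, y, col = "blue", pch = 19, main = "Scatter", xlab = "x", ylab = "y")
lines(x, y, lty = 2)
abline(h = 50, col = "grey")
legend("topleft", legend = "y = x^2", pch = 19, col = "blue")

# Right panel: histogram of fuel economy from the built-in mtcars data
hist(mtcars$mpg, main = "Histogram", xlab = "Miles per gallon")

dev.off()   # close the device so the file is written
```

In an interactive session the `pdf()` and `dev.off()` lines can be dropped, and the plots appear in the default graphics window instead.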
Statistics

| Function/Operator | Description |
| --- | --- |
| `mean(x)` | Compute the mean of a vector |
| `median(x)` | Compute the median of a vector |
| `sd(x)` | Compute the sample standard deviation of a vector |
| `var(x)` | Compute the sample variance of a vector |
| `cor(x, y)` | Compute the correlation between two vectors |
| `cov(x, y)` | Compute the covariance between two vectors |
| `t.test(x, y)` | Perform a two-sample t-test |
| `chisq.test(x, y)` | Perform a chi-squared test |
| `anova(fit)` | Compute an analysis of variance table for a fitted model |
| `lm(y ~ x, data)` | Fit a linear regression model |
| `glm(y ~ x, data, family)` | Fit a generalized linear model |
| `aov(y ~ x, data)` | Fit an analysis of variance model |
| `summary(fit)` | Summarize the results of a model fit |

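
A typical workflow fits a model and then inspects it with `summary()` and `anova()`. The sketch below uses the `mtcars` data set that ships with R; the choice of variables (`mpg`, `wt`, `am`) is only for illustration, and `coef()` is a base R helper that does not appear in the list above.

```r
# mtcars ships with R: fuel economy (mpg) against weight (wt)
fit <- lm(mpg ~ wt, data = mtcars)
coef(fit)        # intercept and slope; the slope is negative
summary(fit)     # coefficients, R-squared, residual summary
anova(fit)       # ANOVA table for the fitted model

cor(mtcars$mpg, mtcars$wt)   # negative correlation

# Two-sample t-test: mpg for automatic (am = 0) vs manual (am = 1) cars
tt <- t.test(mpg ~ am, data = mtcars)
tt$p.value

# Logistic regression with glm(): transmission type as a function of weight
logit <- glm(am ~ wt, data = mtcars, family = binomial)
```

The formula interface (`response ~ predictor`) is shared by `lm()`, `glm()`, `aov()` and `t.test()`, which makes it easy to switch between them on the same data frame.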
Machine Learning

| Function/Operator | Description |
| --- | --- |
| `caret::train(formula, data, method, trControl)` | Train a machine learning model using the caret package |
| `predict(object, newdata)` | Use a trained caret model to make predictions (`predict()` is a base R generic; caret supplies the method for its model objects) |
| `glmnet::cv.glmnet(x, y, alpha, lambda)` | Fit a regularized generalized linear model, choosing lambda by cross-validation |
| `randomForest::randomForest(x, y)` | Fit a random forest model |
| `xgboost::xgboost(data, label)` | Fit an extreme gradient boosting model (argument names differ between package versions) |
| `keras::fit(model, x, y)` | Fit a neural network model using the keras package |
| `predict(model, x)` | Use a trained keras neural network to make predictions |
| `parsnip::fit(spec, formula, data)` | Fit a model specification using parsnip, the modeling package within tidymodels |
| `predict(object, new_data)` | Use a fitted tidymodels model to make predictions |

Other Useful Packages

| Package | Description |
| --- | --- |
| tidyr | A package for data tidying and reshaping |
| ggplot2 | A package for creating graphics using the grammar of graphics |
| stringr | A package for working with strings |
| lubridate | A package for working with dates and times |
| readr | A package for reading rectangular data (like CSV files) |
| readxl | A package for reading Excel files |
| magrittr | A package for writing more readable code using the pipe operator `%>%` |
| data.table | A package for fast manipulation of large data sets |
| caret | A package for training and evaluating machine learning models |
| tidymodels | A collection of packages for building and evaluating machine learning models using a tidy approach |
| purrr | A package for functional programming with functions and vectors |
| plyr | A package for split-apply-combine operations on data frames and lists |
| reshape2 | A package for data reshaping and melting |
| forcats | A package for working with categorical variables (factors) |
| tibble | A package providing tibbles, a modern take on data frames |
| caretEnsemble | A package for creating ensembles of machine learning models |
| tidyverse | A collection of packages (including dplyr, tidyr, ggplot2, and others) for data manipulation and visualization |
| shiny | A package for creating interactive web applications |
| knitr | A package for creating dynamic reports and documents |
| rmarkdown | A package for creating dynamic documents that can include text, code, and graphics |
| devtools | A package for developing and sharing R packages |
| roxygen2 | A package for creating documentation for R packages |
| testthat | A package for testing R code |
| profvis | A package for profiling R code |
| ggthemes | A package of additional themes and scales for ggplot2 graphics |
| ggpubr | A package for creating publication-ready graphics |
| plotly | A package for creating interactive graphics |
| leaflet | A package for creating interactive maps |
| shinydashboard | A package for creating dashboards using Shiny |
| shinythemes | A package for customizing the appearance of Shiny apps |