R is a programming language and software environment for statistical computing and graphics. Ross Ihaka and Robert Gentleman created it at the University of Auckland, New Zealand, in the early 1990s. R is free and open-source software: anyone can download, use, and modify it.

R has become an essential tool for data analysis and statistical modeling, and is widely used in academia, government, and industry. Its popularity is due in part to its powerful capabilities for data visualization and manipulation, as well as its ability to interface with other programming languages and software.

To help you get started with R, we've put together a cheat sheet covering many of the basic functions and packages you'll need. It is divided into sections: basics, data structures, data input and output, data manipulation, control structures, functions, graphics, statistics, machine learning, and other useful packages.

### Cheat Sheet

#### Basics

| Function/Operator | Description |
| --- | --- |
| `#` | Comment |
| `print(x)` or `cat(x)` | Print a variable |
| `c(x1, x2, ...)` | Combine elements into a vector |
| `length(x)` | Get the length of a vector |
| `seq(from, to, by)` | Generate a sequence of numbers |
| `rep(x, times)` | Repeat elements of a vector |
| `sort(x)` | Sort a vector in ascending order |
| `rev(x)` | Reverse a vector |
| `unique(x)` | Get unique elements of a vector |
| `which(x)` | Get the indices of TRUE values |
| `sum(x)` | Sum the elements of a vector |
| `mean(x)` | Calculate the mean of a vector |
| `sd(x)` | Calculate the standard deviation of a vector |
| `var(x)` | Calculate the variance of a vector |
| `max(x)` | Get the maximum value of a vector |
| `min(x)` | Get the minimum value of a vector |
| `range(x)` | Get the minimum and maximum of a vector |
| `quantile(x, probs)` | Calculate quantiles of a vector |
| `sample(x, size, replace)` | Randomly sample from a vector |
| `runif(n, min, max)` | Generate n uniform random numbers between min and max |
| `set.seed(x)` | Set the random seed for reproducibility |
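
As a minimal sketch of how the basic vector functions above fit together, the following uses a small made-up vector `x` (the values are illustrative only):

```r
# Build a small numeric vector and inspect it.
x <- c(4, 8, 15, 16, 23, 42, 8)

length(x)                 # number of elements
sort(x)                   # ascending order
rev(x)                    # reversed order
unique(x)                 # duplicates removed
over_15 <- which(x > 15)  # indices where the condition is TRUE

# Summary statistics.
total  <- sum(x)
avg    <- mean(x)
spread <- range(x)        # minimum and maximum together

# Sequences and repetition.
evens   <- seq(2, 10, by = 2)
pattern <- rep(c("a", "b"), times = 3)

# Reproducible random sampling: the same seed gives the same draw.
set.seed(42)
draw1 <- sample(x, size = 3, replace = FALSE)
set.seed(42)
draw2 <- sample(x, size = 3, replace = FALSE)
identical(draw1, draw2)   # TRUE
```

Resetting the seed before each call is what makes the two draws match; without it, each `sample()` call would generally return a different result.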

#### Data Structures

| Function/Operator | Description |
| --- | --- |
| `vector()` | Create an empty vector |
| `list()` | Create an empty list |
| `matrix(NA, nrow, ncol)` | Create a matrix filled with NA |
| `array(NA, dim)` | Create an array filled with NA with the given dimensions |
| `data.frame()` | Create an empty data frame |
| `cbind(x, y)` | Combine two vectors (or matrices, or data frames) by column |
| `rbind(x, y)` | Combine two vectors (or matrices, or data frames) by row |
| `names(x)` or `colnames(x)` or `rownames(x)` | Get or set the names of a vector, matrix or data frame |
| `dim(x)` | Get or set the dimensions of a matrix or array |
| `nrow(x)` | Get the number of rows of a matrix or data frame |
| `ncol(x)` | Get the number of columns of a matrix or data frame |
| `length(x)` | Get the length of a vector or list; for a data frame, the number of columns |
| `str(x)` | Display the structure of an object |
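
The sketch below shows a few of these constructors and inspectors in use. The object names (`m`, `a`, `b`, `df`) and values are made up for illustration:

```r
# A 2 x 3 matrix filled with NA, then populated column by column.
m <- matrix(NA, nrow = 2, ncol = 3)
m[] <- 1:6
dim(m)                    # rows, then columns

# cbind() and rbind() build matrices from vectors.
a <- c(1, 2, 3)
b <- c(10, 20, 30)
by_col <- cbind(a, b)     # 3 rows, 2 columns
by_row <- rbind(a, b)     # 2 rows, 3 columns

# A data frame holds columns of different types.
df <- data.frame(id = 1:3, name = c("x", "y", "z"))
names(df)                 # column names
nrow(df)
ncol(df)
length(df)                # for a data frame, length() counts columns
str(df)                   # compact overview of types and values
```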

#### Data Input/Output

| Function/Operator | Description |
| --- | --- |
| `read.csv("file.csv")` | Read a CSV file into a data frame |
| `read.table("file.txt")` | Read a whitespace-delimited text file (use `sep = "\t"` or `read.delim()` for tab-delimited files) |
| `readLines("file.txt")` | Read the lines of a text file |
| `write.csv(x, "file.csv")` | Write a data frame to a CSV file |
| `write.table(x, "file.txt")` | Write a matrix or data frame to a text file |
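
A round trip through temporary files is one way to see these functions working together. This is a sketch only: the data frame and the temporary paths are made up, and `row.names = FALSE` is a choice made here so the files contain just the data columns.

```r
# A small data frame to write out and read back.
df <- data.frame(id = 1:3, score = c(90.5, 82, 77.25))

# CSV round trip.
csv_path <- tempfile(fileext = ".csv")
write.csv(df, csv_path, row.names = FALSE)   # omit the row-name column
df_back <- read.csv(csv_path)

# Tab-delimited round trip: set sep explicitly on both sides.
tsv_path <- tempfile(fileext = ".txt")
write.table(df, tsv_path, sep = "\t", row.names = FALSE, quote = FALSE)
df_tab <- read.table(tsv_path, header = TRUE, sep = "\t")

# readLines() returns the raw text, one element per line.
raw_lines <- readLines(csv_path)
```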

#### Data Manipulation

| Function/Operator | Description |
| --- | --- |
| `subset(x, subset)` | Extract the rows of a data frame that match a condition |
| `select(x, col1, col2, ...)` | Select columns of a data frame (dplyr) |
| `mutate(x, newcol = f(col))` | Create a new column in a data frame (dplyr) |
| `arrange(x, col1, col2, ...)` | Sort a data frame by column (dplyr) |
| `filter(x, condition)` | Keep the rows of a data frame that match a condition (dplyr) |
| `group_by(x, col)` | Group a data frame by one or more columns (dplyr) |
| `summarize(x, newcol = f(col))` | Summarize a data frame by group (dplyr) |
| `join(x, y, by, type)` | Join two data frames (plyr; dplyr provides `left_join()`, `inner_join()` and related functions instead) |
| `merge(x, y, by, all)` | Merge two data frames (base R); `all = TRUE` keeps unmatched rows |
| `reshape(x, idvar, timevar, direction)` | Reshape a data frame from wide to long or vice versa (base R) |
| `melt(x, id.vars, measure.vars)` | Reshape a data frame from wide to long (reshape2) |
| `dcast(x, formula)` | Reshape a data frame from long to wide (reshape2) |
| `tidyr::pivot_longer()` | Reshape a data frame from wide to long |
| `tidyr::pivot_wider()` | Reshape a data frame from long to wide |
| `aggregate(x, by, FUN)` | Compute summary statistics for each group |
| `tapply(x, INDEX, FUN)` | Apply a function to subsets of a vector defined by a grouping factor |
| `sapply(x, FUN)` | Apply a function to each element of a vector or list, simplifying the result to a vector or matrix where possible |
| `lapply(x, FUN)` | Apply a function to each element of a vector or list, returning a list |
| `mapply(FUN, ...)` | Apply a function to multiple vectors or lists in parallel |
| `%>%` | Pipe operator for chaining multiple functions (magrittr, also available through dplyr) |
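
The following sketch sticks to the base R functions from the table so it runs without extra packages. The `sales` and `managers` data frames are invented for illustration.

```r
sales <- data.frame(
  region = c("north", "south", "north", "south"),
  units  = c(10, 20, 30, 40)
)
managers <- data.frame(
  region  = c("north", "south"),
  manager = c("Ana", "Ben")
)

# Keep rows matching a condition.
big <- subset(sales, units > 15)

# Join on the shared 'region' column.
joined <- merge(sales, managers, by = "region")

# Group-wise totals, two ways.
totals_df  <- aggregate(sales$units, by = list(region = sales$region), FUN = sum)
totals_vec <- tapply(sales$units, sales$region, sum)

# lapply() always returns a list; sapply() simplifies to a vector when it can.
squares_list <- lapply(1:3, function(i) i^2)
squares_vec  <- sapply(1:3, function(i) i^2)
```

`aggregate()` returns a data frame (here with the totals in a column named `x`), while `tapply()` returns a named array, so the choice usually depends on what the next step expects.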

#### Control Structures

| Function/Operator | Description |
| --- | --- |
| `if (condition) {expr1} else {expr2}` | If-else statement |
| `for (i in seq_along(x)) {expr}` | For loop |
| `while (condition) {expr}` | While loop |
| `repeat {expr}` | Repeat loop |
| `break` | Exit a loop |
| `next` | Skip an iteration in a loop |
| `ifelse(condition, expr1, expr2)` | Vectorized if-else statement |
| `switch(expr, case1, case2, ...)` | Switch statement |
| `tryCatch(expr, error = function(e) {expr2})` | Handle exceptions |
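
A short sketch combining these control structures. The helper names `describe` and `safe_log` are hypothetical, introduced only for this example.

```r
# Classify numbers with a for loop and if/else.
x <- c(-2, 0, 5)
labels <- character(length(x))
for (i in seq_along(x)) {
  if (x[i] < 0) {
    labels[i] <- "negative"
  } else if (x[i] == 0) {
    labels[i] <- "zero"
  } else {
    labels[i] <- "positive"
  }
}

# The vectorized form for a two-way split.
signs <- ifelse(x < 0, "neg", "non-neg")

# while with next and break: add up odd numbers until the total passes 10.
total <- 0
n <- 0
while (TRUE) {
  n <- n + 1
  if (n %% 2 == 0) next      # skip even numbers
  total <- total + n
  if (total > 10) break      # stop once the threshold is crossed
}

# switch() picks a branch by name; the unnamed last argument is the default.
describe <- function(kind) {
  switch(kind, mean = "average", median = "middle value", "unknown")
}

# tryCatch() turns an error into a fallback value.
safe_log <- function(v) {
  tryCatch(log(v), error = function(e) NA_real_)
}
```

For example, `describe("median")` returns `"middle value"`, and `safe_log("a")` returns `NA` instead of stopping with an error.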

#### Functions

| Function/Operator | Description |
| --- | --- |
| `function(arg1, arg2, ...) {expr}` | Define a function |
| `return(x)` | Return a value from a function |
| `formals(fun)` | Get the formal arguments of a function |
| `body(fun)` | Get the body of a function |
| `environment(fun)` | Get the environment of a function |
| `source("file.R")` | Source a file with function definitions |
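
As an illustration, the sketch below defines a function, inspects it, and loads another definition with `source()`. The names `scale_by` and `cube` and the temporary script are assumptions made for this example.

```r
# Define a function with a default argument.
scale_by <- function(x, factor = 2) {
  return(x * factor)
}

doubled <- scale_by(1:3)               # uses the default factor
scaled  <- scale_by(1:3, factor = 10)  # overrides it

# Inspect the pieces of the function.
arg_names <- names(formals(scale_by))  # "x" "factor"
body(scale_by)
environment(scale_by)

# source() runs an R script; here it loads a definition from a temp file.
script <- tempfile(fileext = ".R")
writeLines("cube <- function(x) x^3", script)
source(script)
cube(2)
```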

#### Graphics

| Function/Operator | Description |
| --- | --- |
| `plot(x, y)` | Create a scatter plot |
| `hist(x)` | Create a histogram |
| `barplot(x)` | Create a bar plot |
| `boxplot(x)` | Create a box plot |
| `pie(x)` | Create a pie chart |
| `plot(y ~ x)` | Create a scatter plot of y against x with formula syntax |
| `plot(x, type = "l")` | Create a line plot |
| `abline(a, b)` | Add a straight line with intercept a and slope b to a plot |
| `points(x, y)` | Add points to a plot |
| `lines(x, y)` | Add lines to a plot |
| `text(x, y, labels)` | Add text to a plot |
| `par(mfrow = c(nrows, ncols))` | Set up a multi-panel plot |
| `layout(matrix(1:nplots, ncol = ncols))` | Set up a multi-panel plot with custom arrangement |
| `plot.new()` | Start a new, empty plot frame |
| `dev.off()` | Close the current graphics device |
| `plot(x, y, col = "red")` | Set the color of points in a plot |
| `plot(x, y, pch = 19)` | Set the shape of points in a plot |
| `plot(x, y, type = "l", lty = 2)` | Set the line type in a plot |
| `plot(x, y, type = "l", lwd = 2)` | Set the line width in a plot |
| `title(main = "Main Title", sub = "Subtitle")` | Add a title and subtitle to a plot |
| `xlab("x-axis label")` | Set the x-axis label of a ggplot2 plot (in base graphics, pass `xlab = "..."` to `plot()`) |
| `ylab("y-axis label")` | Set the y-axis label of a ggplot2 plot (in base graphics, pass `ylab = "..."` to `plot()`) |
| `axis(side, at, labels)` | Add an axis to a plot |
| `legend(x, y, legend, ...)` | Add a legend to a plot |
| `ggplot(data, aes(x, y)) + geom_*(...) + ...` | Create a plot using the ggplot2 package |
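
The base graphics calls above are typically layered: one high-level call such as `plot()` draws the frame, and low-level calls such as `lines()` and `legend()` add to it. The sketch below assumes made-up data and writes to a temporary PDF so it runs without a display; in an interactive session the `pdf()` and `dev.off()` lines can be dropped.

```r
# Draw to a temporary PDF so the sketch runs without a display.
out <- tempfile(fileext = ".pdf")
pdf(out)

par(mfrow = c(1, 2))                 # two panels side by side

x <- 1:10
y <- x^2

# Panel 1: scatter plot with styling, a line overlay, and a legend.
plot(x, y, col = "blue", pch = 19,
     main = "Squares", xlab = "x", ylab = "x squared")
lines(x, y, lty = 2, lwd = 2)        # dashed, thicker connecting line
abline(a = 0, b = 10, col = "red")   # intercept 0, slope 10
legend("topleft", legend = c("data", "y = 10x"),
       col = c("blue", "red"), pch = c(19, NA), lty = c(NA, 1))

# Panel 2: histogram of the same values.
hist(y, main = "Distribution of y")

dev.off()                            # close the device and write the file
```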

#### Statistics

| Function/Operator | Description |
| --- | --- |
| `mean(x)` | Compute the mean of a vector |
| `median(x)` | Compute the median of a vector |
| `sd(x)` | Compute the standard deviation of a vector |
| `var(x)` | Compute the variance of a vector |
| `cor(x, y)` | Compute the correlation between two vectors |
| `cov(x, y)` | Compute the covariance between two vectors |
| `t.test(x, y)` | Perform a t-test |
| `chisq.test(x, y)` | Perform a chi-squared test |
| `anova(fit)` | Compute an analysis of variance table for a fitted model |
| `lm(y ~ x, data)` | Fit a linear regression model |
| `glm(y ~ x, data, family)` | Fit a generalized linear model |
| `aov(y ~ x, data)` | Fit an analysis of variance model |
| `summary(fit)` | Summarize the results of a model fit |
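
A typical workflow fits a model and then passes the fit to `summary()` or `anova()`. The sketch below uses `mtcars`, a data set that ships with R; the choice of variables (`mpg`, `wt`, `am`) is an assumption made for illustration.

```r
# Linear regression: fuel economy (mpg) as a function of weight (wt).
fit <- lm(mpg ~ wt, data = mtcars)
summary(fit)                 # coefficients, R-squared, residual summary
anova(fit)                   # analysis of variance table for the fit
slope <- coef(fit)[["wt"]]

# Correlation between the same two variables.
r <- cor(mtcars$mpg, mtcars$wt)

# Two-sample t-test: automatic (am == 0) versus manual (am == 1) cars.
tt <- t.test(mtcars$mpg[mtcars$am == 0], mtcars$mpg[mtcars$am == 1])
tt$p.value

# Generalized linear model: logistic regression of transmission on weight.
logit <- glm(am ~ wt, data = mtcars, family = binomial)
summary(logit)
```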

#### Machine Learning

| Function/Operator | Description |
| --- | --- |
| `caret::train(formula, data, method, trControl)` | Train a machine learning model using the caret package |
| `predict(object, newdata)` | Use a model trained with caret to make predictions (`predict()` is a base R generic; caret supplies the method for its model objects) |
| `glmnet::cv.glmnet(x, y, alpha, lambda)` | Fit a regularized linear model using cross-validation |
| `randomForest::randomForest(x, y)` | Fit a random forest model |
| `xgboost::xgboost(data, label)` | Fit an extreme gradient boosting model |
| `keras::fit(model, x, y)` | Fit a neural network model using the keras package |
| `predict(model, x)` | Use a trained keras neural network to make predictions |
| `parsnip::fit(object, formula, data)` | Fit a model specification using parsnip, part of tidymodels |
| `predict(object, new_data)` | Use a model fitted with tidymodels to make predictions (note the `new_data` argument name) |

#### Other Useful Packages

| Package | Description |
| --- | --- |
| tidyr | A package for data tidying and reshaping |
| ggplot2 | A package for creating graphics using the grammar of graphics |
| stringr | A package for working with strings |
| lubridate | A package for working with dates and times |
| readr | A package for reading rectangular data (like CSV files) |
| readxl | A package for reading Excel files |
| magrittr | A package for writing more readable code using the pipe operator `%>%` |
| data.table | A package for fast manipulation of large data sets |
| caret | A package for training and evaluating machine learning models |
| tidymodels | A collection of packages for building and evaluating machine learning models using a tidy approach |
| purrr | A package for functional programming with functions and vectors |
| plyr | A package for split-apply-combine operations on data frames and lists (largely superseded by dplyr) |
| reshape2 | A package for data reshaping and melting |
| forcats | A package for working with categorical variables (factors) |
| tibble | A package providing a modern take on data frames |
| caretEnsemble | A package for creating ensembles of machine learning models |
| tidyverse | A collection of packages (including dplyr, tidyr, ggplot2, and others) for data manipulation and visualization |
| shiny | A package for creating interactive web applications |
| knitr | A package for creating dynamic reports and documents |
| rmarkdown | A package for creating dynamic documents that can include text, code, and graphics |
| devtools | A package for developing and sharing R packages |
| roxygen2 | A package for creating documentation for R packages |
| testthat | A package for testing R code |
| profvis | A package for profiling R code |
| ggthemes | A package of extra themes, scales, and geoms for ggplot2 graphics |
| ggpubr | A package for creating publication-ready graphics |
| plotly | A package for creating interactive graphics |
| leaflet | A package for creating interactive maps |
| shinydashboard | A package for creating dashboards using Shiny |
| shinythemes | A package for customizing the appearance of Shiny apps |
