Hey everyone! Let me start by introducing the tidyverse. The tidyverse is a collection of R packages designed to make data manipulation and visualization easier and more consistent. Its development is led by Hadley Wickham, a prominent figure in the R community, and it is widely used by data scientists and analysts.
The tidyverse packages share a common philosophy built on tidy data principles. Tidy data is a structured format that makes data easier to work with, analyze, and visualize: each variable gets its own column, each observation gets its own row, and each value gets its own cell.
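To make the idea of tidy data concrete, here is a minimal sketch that reshapes a small "wide" table into tidy form with tidyr's pivot_longer(). The table and its column names (country, year, cases) are made up for illustration.

```r
library(tidyr)

# Hypothetical "wide" table: one column per year, so the variable "year"
# is hidden in the column names instead of having its own column.
wide <- data.frame(
  country = c("A", "B"),
  `2019` = c(10, 20),
  `2020` = c(12, 25),
  check.names = FALSE
)

# Reshape so that each row is one observation (a country in a year)
# and each column is one variable (country, year, cases).
tidy <- pivot_longer(
  wide,
  cols = c(`2019`, `2020`),
  names_to = "year",
  values_to = "cases"
)

print(tidy)
```

The result has four rows, one per country and year, which is the shape most tidyverse functions expect.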
The core packages, the ones loaded by library(tidyverse), are ggplot2, dplyr, tidyr, readr, purrr, tibble, stringr, and forcats, joined by lubridate as of tidyverse 2.0.0. Between them they cover filtering, sorting, grouping, and summarizing data, as well as reshaping data, handling missing values, and working with strings, factors, and dates and times.
In addition to these core packages, the tidyverse includes several other packages that provide specialized functionality and have to be loaded on their own. For example, the readxl package reads Excel files, while the haven package reads SPSS, Stata, and SAS files.
This cheat sheet provides an overview of the most commonly used tidyverse functions. Enjoy, and let me know if you have any questions!
Data Manipulation
Function | Description |
---|---|
filter() | Select rows based on conditions |
select() | Select columns by name |
mutate() | Create new columns |
arrange() | Sort rows by variables |
group_by() | Group data by variables |
summarize() | Summarize data by groups |
distinct() | Remove duplicate rows |
slice() | Select rows by position |
rename() | Rename columns |
transmute() | Create new columns and drop all others (superseded by mutate() with .keep = "none") |
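As a rough illustration of how these verbs chain together, the sketch below runs a small pipeline on the built-in mtcars dataset. The choice of columns, the threshold of 100 horsepower, and the name hp_per_wt are arbitrary assumptions for the example.

```r
library(dplyr)

result <- mtcars %>%
  filter(hp > 100) %>%                 # keep rows that meet a condition
  select(mpg, cyl, hp, wt) %>%         # keep only the columns we need
  mutate(hp_per_wt = hp / wt) %>%      # add a derived column
  group_by(cyl) %>%                    # one group per cylinder count
  summarize(
    n_cars = n(),                      # number of cars in each group
    mean_mpg = mean(mpg),              # average fuel economy per group
    .groups = "drop"
  ) %>%
  arrange(desc(mean_mpg))              # sort groups, best mileage first

print(result)
```

Each verb takes a data frame and returns a data frame, which is what lets the steps be piped one into the next.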
Data Visualization
Function | Description |
---|---|
ggplot() | Create a new ggplot object |
aes() | Define aesthetic mappings |
geom_*() | Add geometric objects to a plot |
facet_*() | Create small multiples |
scale_*() | Adjust scales and legends |
theme_*() | Customize plot appearance |
labs() | Add plot labels |
coord_*() | Adjust coordinate systems |
annotate() | Add annotations to a plot |
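The pieces in this table are combined with +, layer by layer. The following is a minimal sketch using the built-in mtcars dataset; the particular aesthetics, palette, and labels are illustrative choices, not recommendations.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +                              # geometric object: one point per car
  facet_wrap(~ gear) +                        # small multiples, one panel per gear count
  scale_colour_brewer(palette = "Dark2") +    # adjust the colour scale and its legend
  labs(
    title = "Fuel economy by weight",
    x = "Weight (1000 lbs)",
    y = "Miles per gallon",
    colour = "Cylinders"
  ) +
  theme_minimal()                             # customize overall appearance

print(p)  # draws the plot on the active graphics device
```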
Data Import and Export
Function | Description |
---|---|
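read_csv() | Read a comma-separated (CSV) file |
readxl::read_excel() | Read an Excel file (.xls or .xlsx); readxl must be loaded separately |
read_delim() | Read a file with a chosen delimiter |
read_table() | Read a whitespace-separated file |
readr::read_*() | Other readers, such as read_tsv() and read_fwf() |
write_csv() | Write a CSV file |
writexl::write_xlsx() | Write an Excel file (readxl only reads; writexl is a separate package outside the tidyverse) |
write_delim() | Write a file with a chosen delimiter |
readr::write_*() | Other writers, such as write_tsv() and write_rds() |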
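Here is a small round-trip sketch with readr: write a data frame to a CSV file and read it back. The data and the temporary file path are made up for the example.

```r
library(readr)

# Hypothetical data to save
scores <- data.frame(
  name = c("Ana", "Ben"),
  score = c(90.5, 82)
)

# Write to a temporary file so the example leaves nothing behind
path <- tempfile(fileext = ".csv")
write_csv(scores, path)

# Read it back; show_col_types = FALSE silences the column type message
scores_back <- read_csv(path, show_col_types = FALSE)

print(scores_back)
```

read_csv() returns a tibble and guesses the column types from the file contents, so name comes back as character and score as double.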
String Manipulation
Function | Description |
---|---|
str_detect() | Check if a string contains a pattern |
str_replace() | Replace the first match of a pattern (str_replace_all() replaces every match) |
str_split() | Split a string into pieces |
str_sub() | Extract a substring |
str_trim() | Remove leading and trailing whitespace |
str_to_lower() | Convert a string to lowercase |
str_to_upper() | Convert a string to uppercase |
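A quick sketch of these stringr functions on a made-up character vector. Every function is vectorized, so it works on all elements at once.

```r
library(stringr)

# Hypothetical messy input
desserts <- c("  Apple pie ", "banana split", "Cherry tart")

clean   <- str_trim(desserts)                   # "Apple pie" "banana split" "Cherry tart"
has_an  <- str_detect(clean, "an")              # FALSE TRUE FALSE
first_a <- str_replace(clean, "a", "A")         # only the first lowercase "a" in each string changes
prefix  <- str_to_upper(str_sub(clean, 1, 3))   # "APP" "BAN" "CHE"
lower   <- str_to_lower(clean)                  # everything in lowercase
words   <- str_split(clean, " ")                # a list with one character vector per string

print(first_a)
print(words)
```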
Data Types (base R)
Function | Description |
---|---|
as.character() | Convert to character |
as.numeric() | Convert to numeric |
as.integer() | Convert to integer |
as.factor() | Convert to factor |
as.Date() | Convert to date |
as.POSIXct() | Convert to POSIXct |
as.list() | Convert to list |
as.data.frame() | Convert to data frame |
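These conversion functions come from base R, so they work without loading any package. A short sketch with made-up values:

```r
raw <- c("1", "2.5", "abc")

# "abc" cannot be parsed, so it becomes NA (R normally warns about this)
nums <- suppressWarnings(as.numeric(raw))

whole <- as.integer("7")
day   <- as.Date("2024-03-15")                                # ISO format parses by default
stamp <- as.POSIXct("2024-03-15 14:30:00", tz = "UTC")        # date-time with an explicit time zone
level <- as.factor(c("low", "high", "low"))                   # levels are sorted: "high" "low"
back  <- as.character(level)                                  # factor labels back to plain text
items <- as.list(1:3)                                         # a list with three elements
df    <- as.data.frame(list(id = 1:2, label = c("a", "b")))   # a named list becomes columns

print(nums)
print(levels(level))
```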
References
Tidyverse documentation: https://www.tidyverse.org/