--- title: "Introduction to bulkreadr" output: rmarkdown::html_vignette description: > Start here if this is your first time using `bulkreadr`. You'll learn how to use functions like `read_excel_workbook()` and `read_excel_files_from_dir()` for importing data from Excel and `read_gsheets()` for Google Sheets, allowing for data importation from multiple sheets. For handling CSV files, `read_csv_files_from_dir()` reads all CSV files from a specified directory. author: "Ezekiel Ogundepo and Ernest Fokoué" vignette: > %\VignetteIndexEntry{Introduction to bulkreadr package} %\VignetteEncoding{UTF-8} %\VignetteEngine{knitr::rmarkdown} editor_options: chunk_output_type: console --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, message = FALSE, warning = FALSE, comment = "#>", fig.path = "man/figures/", out.width = "100%") options(tibble.print_min = 5, tibble.print_max = 5) options(rmarkdown.html_vignette.check_title = FALSE) ``` ## About the package `bulkreadr` is an R package designed to simplify and streamline the process of reading and processing large volumes of data. With a collection of functions tailored for bulk data operations, the package allows users to efficiently read multiple sheets from Microsoft Excel/Google Sheets workbooks and multiple CSV files from a directory. It returns the data as organized data frames, making it convenient for further analysis and manipulation. Whether dealing with extensive data sets or batch processing tasks, "bulkreadr" empowers users to effortlessly handle data in bulk, saving time and effort in data preparation workflows. ## Installation You can install `bulkreadr` package from [CRAN](https://cran.r-project.org/) with: ```{r eval=FALSE} install.packages("bulkreadr") ``` or the development version from [GitHub](https://github.com/gbganalyst/bulkreadr) with ```{r eval=FALSE} if(!require("devtools")){ install.packages("devtools") } devtools::install_github("gbganalyst/bulkreadr") ``` ## How to load the package Now that you have installed `bulkreadr` package, you can simply load it by using: ```{r pkgload} library(bulkreadr) ``` ## Functions in bulkreadr package This section provides a concise overview of the different functions available in the `bulkreadr` package for importing bulk data in R. ## Note > For the majority of functions within this package, we will utilize data stored in the system file by the `bulkreadr`, which can be accessed using the `system.file()` function. If you wish to utilize your own data stored in your local directory, please ensure that you have set the appropriate file path prior to using any functions provided by the bulkreadr package. ## read_excel_workbook() `read_excel_workbook()` reads all the data from the sheets of an Excel workbook and return an appended dataframe. ```{r example1} # path to the xls/xlsx file. path <- system.file("extdata", "Diamonds.xlsx", package = "bulkreadr") # read the sheets read_excel_workbook(path = path) ``` ## read_excel_files_from_dir() `read_excel_files_from_dir()` reads all Excel workbooks in the `"~/data"` directory and returns an appended dataframe. ```{r example1a} # path to the directory containing the xls/xlsx files. directory <- system.file("xlsxfolder", package = "bulkreadr") # import the workbooks read_excel_files_from_dir(dir_path = directory) ``` ## read_csv_files_from_dir() `read_csv_files_from_dir()` reads all csv files from the `"~/data"` directory and returns an appended dataframe. The resulting dataframe will be in the same order as the CSV files in the directory. ```{r example2} # path to the directory containing the CSV files. directory <- system.file("csvfolder", package = "bulkreadr") # import the csv files read_csv_files_from_dir(dir_path = directory) ``` ## read_gsheets() The `read_gsheets()` function imports data from multiple sheets in a Google Sheets spreadsheet and appends the resulting dataframes from each sheet together to create a single dataframe. This function is a powerful tool for data analysis, as it allows you to easily combine data from multiple sheets into a single dataset. ```{r, include=FALSE} googlesheets4::gs4_deauth() ``` ```{r example3, message=FALSE, warning=FALSE} # Google Sheet ID or the link to the sheet sheet_id <- "1izO0mHu3L9AMySQUXGDn9GPs1n-VwGFSEoAKGhqVQh0" # read all the sheets read_gsheets(ss = sheet_id) ```