Title: | String Manipulation Package for Those Familiar with 'Microsoft Excel' |
---|---|
Description: | The goal of 'forstringr' is to enable complex string manipulation in R, especially for those more familiar with the LEFT(), RIGHT(), and MID() functions in Microsoft Excel. The package combines the power of 'stringr' with other manipulation packages such as 'dplyr' and 'tidyr'. |
Authors: | Ezekiel Ogundepo [aut, cre], Olubukunola Oyedele [ctb], Fatimo Adebanjo [ctb] |
Maintainer: | Ezekiel Ogundepo <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.0.0 |
Built: | 2025-01-08 04:58:20 UTC |
Source: | https://github.com/gbganalyst/forstringr |
This survey data was collected using a Google form to demonstrate how the str_rm_whitespace_df()
function in the forstringr package could be used to eliminate whitespace.
community_data
A data frame with 32 rows and 8 variables:
Form submission date
First name of the respondent
The gender of the respondent
State or province living
Whether or not the respondent has a degree
The year of graduation from a college
Whether respondent used R for data science or not
The data science community the respondent is associated with
Ezekiel and Esther developed the Google form that was used to collect the data. By clicking the following link, you may also add to the data:
https://docs.google.com/forms/d/e/1FAIpQLSeAhIBaze-pTHghyIKDZEx5kDuke0oYv0YPqg4gtGKijHSaUg/viewform
length_omit_na() counts only the non-missing elements of a vector.
length_omit_na(x)
x |
Input vector. Either a vector, or something coercible to one. |
An integer
length() counts all the elements in a vector, including those that are missing (NAs).
ethnicity <- c("Hausa", NA, "Yoruba", "Igbo", NA, "Fulani", "Kanuri", "Others")
length_omit_na(ethnicity)
length(ethnicity)
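The difference between the two counts can be sketched in base R without the package. The helper name `count_non_missing()` is hypothetical and is not part of forstringr; the package's own implementation may differ.

```r
# Hypothetical base R sketch; not the forstringr implementation.
count_non_missing <- function(x) {
  # is.na() flags the missing elements; summing the negation counts the rest
  sum(!is.na(x))
}

ethnicity <- c("Hausa", NA, "Yoruba", "Igbo", NA, "Fulani", "Kanuri", "Others")
count_non_missing(ethnicity)  # 6
length(ethnicity)             # 8
```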
A dataset containing the list of top ten billionaires in Nigeria.
richest_in_nigeria
A data frame with 10 rows and 5 variables:
rank from 1 to 10
full name of the billionaires
net worth in billion dollars
the current age of billionaires
the origin of the billionaires' entire body of wealth
https://rnn.ng/richest-men-in-nigeria/
str_englue() helps you solve the labeling problem during plotting. Any value wrapped in { } is inserted into the string, and it also understands embracing, {{ }}, which automatically inserts a given variable name.
str_englue(x, env, error_call, error_arg)
x |
A string to interpolate with glue operators. |
env |
User environment where the interpolation data lives in case you're wrapping |
error_call |
The execution environment of a currently running function, e.g. |
error_arg |
An argument name as a string. This argument will be mentioned in error messages as the input that is at the origin of a problem. |
library(ggplot2)

histogram_plot <- function(df, var, binwidth) {
  ggplot(df, aes(x = {{ var }})) +
    geom_histogram(binwidth = binwidth) +
    labs(title = str_englue("A histogram of {{var}} with binwidth {binwidth}"))
}

histogram_plot(iris, Sepal.Length, binwidth = 0.1)
str_extract_part() extracts the part of a string that lies before or after a given pattern. Vectorised over string and pattern.
str_extract_part(string, pattern, before = TRUE)
string |
A character vector. |
pattern |
Pattern to look for. |
before |
The position in the string to extract from. If TRUE, the extract will occur before the pattern; if FALSE, it will happen after the pattern. |
A subset of the input vector.
str_split_extract(), which splits up a string into pieces and extracts the results using a specified index position.
weekdays <- c(
  "Monday_1", "Tuesday_2", "Wednesday_3", "Thursday_4",
  "Friday_5", "Saturday_6", "Sunday_7"
)
str_extract_part(weekdays, before = TRUE, pattern = "_")
str_extract_part(c("$159", "$587", "$897"), before = FALSE, pattern = "$")
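The before/after behaviour described above can be approximated in base R. This is only a sketch under stated assumptions: the helper name `extract_part_sketch()` is hypothetical, the pattern is a single string matched literally, only its first occurrence is used, and returning NA when nothing matches is a choice made here rather than documented package behaviour.

```r
# Hypothetical base R sketch; not the forstringr implementation.
# Assumptions: `pattern` is one literal string (fixed = TRUE) and only
# its first occurrence in each element of `string` is considered.
extract_part_sketch <- function(string, pattern, before = TRUE) {
  # Position of the first match in each string, or -1 when there is none
  pos <- as.integer(regexpr(pattern, string, fixed = TRUE))
  out <- if (before) {
    # Everything up to, but not including, the match
    substr(string, 1, pos - 1)
  } else {
    # Everything after the end of the match
    substr(string, pos + nchar(pattern), nchar(string))
  }
  # Sketch-only choice: report NA where the pattern was not found
  out[is.na(pos) | pos < 0] <- NA_character_
  out
}

extract_part_sketch(c("Monday_1", "Tuesday_2"), "_")         # "Monday" "Tuesday"
extract_part_sketch(c("$159", "$587"), "$", before = FALSE)  # "159" "587"
```

Matching literally sidesteps the question of how regular-expression metacharacters such as "$" should be treated, which the sketch does not attempt to answer for the package itself.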
Given a character vector, str_left() returns the left side of a string.
str_left(string, n = 1)
string |
The character from which the left portion will be returned. |
n |
Optional. The number of characters to return from the left side of string. |
A character vector.
str_right(), which extracts characters from the right, and str_mid(), which returns a segment of character strings.
str_left("Nigeria")
str_left("Nigeria", n = 3)
str_left(c("Female", "Male", "Male", "Female"))
str_mid() returns a specific number of characters from a text string, starting at the position you specify, based on the number of characters you specify.
str_mid(string, start, n)
string |
The text string containing the characters you want to extract. |
start |
The position of the first character you want to extract in the text. The first character in text has |
n |
The number of characters to extract. |
A character vector.
str_left(), which extracts characters from the left, and str_right(), which extracts characters from the right.
str_mid("Super Eagle", 7, 5)
str_mid("Oyo Ibadan", 5, 6)
Given a character vector, str_right() returns the right side of a string.
str_right(string, n = 1)
string |
The character from which the right portion will be returned. |
n |
Optional. The number of characters to return from the right side of string. |
A character vector.
str_left(), which extracts characters from the left, and str_mid(), which returns a segment of character strings.
str_right("Sale Price")
str_right("Sale Price", n = 5)
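For readers coming from Excel, the three helpers str_left(), str_mid(), and str_right() correspond to simple index arithmetic on top of base R's substr(). The sketch below illustrates that arithmetic only; the names `left_sketch()`, `mid_sketch()`, and `right_sketch()` are hypothetical, and the package's actual implementation may differ.

```r
# Hypothetical base R sketches; not the forstringr implementation.

# LEFT(text, n): the first n characters
left_sketch <- function(string, n = 1) {
  substr(string, 1, n)
}

# MID(text, start, n): n characters beginning at 1-based position `start`
mid_sketch <- function(string, start, n) {
  substr(string, start, start + n - 1)
}

# RIGHT(text, n): the last n characters
right_sketch <- function(string, n = 1) {
  len <- nchar(string)
  substr(string, len - n + 1, len)
}

left_sketch("Nigeria", n = 3)        # "Nig"
mid_sketch("Super Eagle", 7, 5)      # "Eagle"
right_sketch("Sale Price", n = 5)    # "Price"
```

Because substr() is vectorised, each sketch also accepts a character vector, for example `left_sketch(c("Female", "Male"))`.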
str_rm_whitespace_df() removes leading and trailing white space and collapses multiple consecutive white spaces in the non-numerical variables of a data frame.
str_rm_whitespace_df(df)
df |
A data frame or data frame extension (e.g. a tibble) with leading or trailing spaces. |
A clean data frame with no leading or trailing spaces.
richest_in_nigeria
str_rm_whitespace_df(richest_in_nigeria)
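The clean-up described above, trimming both ends and collapsing internal runs of white space, can be sketched in base R as follows. The helper name `rm_whitespace_sketch()` and the toy data frame are hypothetical; the sketch touches character columns only, which is an assumption about what "non-numerical" covers here, and the package's implementation may differ.

```r
# Hypothetical base R sketch; not the forstringr implementation.
# Assumption: only character columns are cleaned; other types are untouched.
rm_whitespace_sketch <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  if (any(is_chr)) {
    df[is_chr] <- lapply(df[is_chr], function(col) {
      # trimws() drops leading/trailing white space;
      # gsub() then collapses internal runs to a single space
      gsub("\\s+", " ", trimws(col))
    })
  }
  df
}

# Toy data with made-up values, for illustration only
messy <- data.frame(
  name = c("  Ada   Obi ", "Bola  Ade"),
  rank = 1:2,
  stringsAsFactors = FALSE
)
rm_whitespace_sketch(messy)$name  # "Ada Obi" "Bola Ade"
```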
Split up a string into pieces and extract the results using a specific index position. Mathematically, you can interpret it as follows:
Given a character string, S, extract the element at a given position, k, from the result of splitting S by a given pattern, m.
str_split_extract(string, pattern, position)
string |
Input vector. Either a character vector, or something coercible to one. |
pattern |
Pattern to look for. This may also contain a regular expression. |
position |
Index position to return from the character vector. |
A character vector.
code <- c("HS-IB-EDE", "OG-OYO-CAS-0121", "NY-ILR-NIG-036")
str_split_extract(code, "-", 1)
str_split_extract(code, "-", 4)
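The "split S by m, then take element k" reading can be written out directly in base R. The helper name `split_extract_sketch()` is hypothetical, and returning NA when a string has fewer than k pieces is simply what base R indexing does in this sketch, not a statement about the package's behaviour.

```r
# Hypothetical base R sketch; not the forstringr implementation.
split_extract_sketch <- function(string, pattern, position) {
  # strsplit() returns one character vector of pieces per input string
  pieces <- strsplit(string, pattern)
  # Take the piece at `position`; indexing past the end yields NA
  vapply(pieces, function(p) p[position], character(1))
}

code <- c("HS-IB-EDE", "OG-OYO-CAS-0121", "NY-ILR-NIG-036")
split_extract_sketch(code, "-", 1)  # "HS" "OG" "NY"
split_extract_sketch(code, "-", 4)  # NA "0121" "036"
```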
str_title_case() converts a string to title case, capitalizing only the first letter of each word while ignoring articles, prepositions, and conjunctions.
str_title_case(string)
string |
Input vector. Either a character vector, or something coercible to one. |
Please note that str_title_case() is different from stringr::str_to_title(), which capitalizes the first letter of every word, including articles, prepositions, and conjunctions.
A character vector the same length as the string and in title case.
words <- "the quick brown fox jumps over a lazy dog"
str_title_case(words)
stringr::str_to_title(words)

words <- "A journey through the history of music"
str_title_case(words)
stringr::str_to_title(words)