Title: | Wrapper Functions Collection Used in Data Pipelines |
---|---|
Description: | The goal of this package is to provide wrapper functions for data cleaning and cleansing processes. These functions help with messages and user interaction, keep track of information in pipelines, and help with the wrangling, munging, assessment and visualization of data frame-like material. |
Authors: | Guillaume Fabre [aut, cre], Maelstrom-Research [fnd] |
Maintainer: | Guillaume Fabre <[email protected]> |
License: | GPL-3 |
Version: | 2.1.0 |
Built: | 2024-11-18 05:09:50 UTC |
Source: | https://github.com/guifabre/fabr |
Adds an index, possibly by group, as the first column of a data frame or a tibble. The column is named 'index' by default, but another name can be given. If a column with that name already exists, the new column can be forced to be created and replace the existing one.
add_index(tbl, name_index = "index", start = 1, .force = FALSE)
tbl |
tibble or data frame |
name_index |
A character string of the name of the column. |
start |
integer indicating first index number. 1 by default. |
.force |
TRUE or FALSE. Indicates whether the column is created even if a column with that name already exists. FALSE by default. |
A tibble or a data frame containing one extra first column 'index' or any given name.
{
  ##### Example 1 -------------------------------------------------------------
  # add an index for the tibble
  add_index(iris, "my_index")

  ##### Example 2 -------------------------------------------------------------
  # add an index for the grouped tibble
  library(tidyr)
  library(dplyr)

  my_tbl <- tibble(iris) %>% group_by(Species) %>% slice(1:3)
  add_index(my_tbl, "my_index")
}
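The grouped-index idea described for add_index() can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: `add_index_sketch` and its `group` argument are hypothetical names (the documented function works on grouped tibbles instead).

```r
# Hypothetical helper (not fabR's code): prepend an index column,
# optionally restarting the count within each group.
add_index_sketch <- function(df, name_index = "index", start = 1L,
                             group = NULL, .force = FALSE) {
  if (name_index %in% names(df)) {
    if (!.force) stop("column '", name_index, "' already exists")
    df[[name_index]] <- NULL  # drop the old column so it can be replaced
  }
  n <- nrow(df)
  idx <- if (is.null(group)) {
    seq_len(n)
  } else {
    # ave() applies seq_along() within each level of the grouping column
    ave(seq_len(n), df[[group]], FUN = seq_along)
  }
  idx <- as.integer(idx + start - 1L)
  # put the new column first, as the documentation describes
  cbind(setNames(data.frame(idx), name_index), df)
}

small <- iris[c(1:2, 51:52, 101:102), ]
indexed <- add_index_sketch(small, "my_index", group = "Species")
indexed$my_index
```

Here the index restarts for each species because the counting is done per group rather than over the whole table.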
Create or test for objects of type "logical", and the basic logical constants. This function is a wrapper of the function as.logical() and evaluates whether the object to be coerced can be interpreted as a boolean. Any object among:
NA, NA_integer_, NA_Date_, (...),
0, 0L, F, FALSE, false, FaLsE, (...),
1, 1L, T, TRUE, true, TrUe, (...),
will be converted to NA, FALSE and TRUE respectively. Any other value will return an error.
as_any_boolean(x)
x |
Object to be coerced or tested. Can be a vector. |
A logical object of the same size.
{
  library(dplyr)

  as_any_boolean("TRUE")
  as_any_boolean(c("1"))
  as_any_boolean(0L)
  try(as_any_boolean(c('foo')))
  as_any_boolean(c(0,1L,0,TRUE,"t","F","FALSE"))
  tibble(values = c(0,1L,0,TRUE,"t","F","FALSE")) %>%
    mutate(bool_values = as_any_boolean(values))
}
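The coercion rule described for as_any_boolean() (accept the usual spellings of true, false and missing values, fail on anything else) could look roughly like the following in base R. `as_boolean_sketch` is a hypothetical name, and the set of accepted spellings is an assumption taken from the examples above.

```r
# Hypothetical helper (not fabR's code): strict, case-insensitive coercion
as_boolean_sketch <- function(x) {
  missing <- is.na(x)
  txt <- toupper(trimws(as.character(x)))
  out <- rep(NA, length(x))
  out[!missing & txt %in% c("TRUE", "T", "1")] <- TRUE
  out[!missing & txt %in% c("FALSE", "F", "0")] <- FALSE
  # anything that is neither missing nor recognised is an error
  bad <- !missing & is.na(out)
  if (any(bad)) {
    stop("cannot be interpreted as boolean: ",
         paste(unique(x[bad]), collapse = ", "))
  }
  out
}

as_boolean_sketch(c("1", "f", "TrUe", NA))
try(as_boolean_sketch("foo"))
```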
This function takes a character string or a vector. The vector is evaluated one observation after the other, and each value is cast using its best matching date format (independently of the others). The best matching format is tested across the date formats provided by the lubridate library and listed in the format argument. The user can specify the wanted matching format (and can be helped by which_any_date() for each value, or by guess_date_format() for the values as a whole).
as_any_date( x = as.character(), format = c("dmy", "dym", "ymd", "ydm", "mdy", "myd", "my", "ym", "as_date") )
x |
object to be coerced. |
format |
A character identifying the format to apply to the object. That format can be 'ymd','ydm','dym','dmy','mdy','myd','my','ym'. |
Contrary to the lubridate library or as.Date(), the function evaluates the different possibilities for a date. For example, c('02-03-1982') can be either March the 2nd or February the 3rd. Since this ambiguity cannot be resolved, the function casts the value as NA and raises a warning, unless the user provides the format to apply.
An R object of class 'Date'.
lubridate::ymd(), lubridate::ydm(), lubridate::dmy(), lubridate::myd(), lubridate::mdy(), lubridate::dym(), lubridate::my(), lubridate::ym(), lubridate::as_date(), as.Date(), guess_date_format(), which_any_date()
{
  library(dplyr)
  library(tidyr)

  ##### Example 1 -------------------------------------------------------------
  # Ambiguous dates -----------------------------------------------------------
  as_any_date('19 02 12')
  as_any_date('19 02 12', format = "ymd")
  as_any_date('19 02 12', format = "dym")

  ##### Example 2 -------------------------------------------------------------
  # Non-ambiguous dates -------------------------------------------------------
  time <- tibble(time = c(
    "1983 07-19",
    "14-01-1925",
    "12/13/2015",
    "2009-09-13",
    "17-12-12",
    "coucou",
    "2025 jan the 30th",
    "1809-01-19"))

  time %>% mutate(new_time = as_any_date(time))
}
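The per-value ambiguity rule described for as_any_date() can be illustrated with base R only. This is a simplified sketch, not the package's method: `as_date_sketch` is a hypothetical name, only two candidate formats are tried, and only values shaped like NN-NN-NNNN are considered.

```r
# Hypothetical helper (not fabR's code): a value is cast only when all the
# formats that parse it agree on a single date; otherwise NA plus a warning.
as_date_sketch <- function(x, formats = c(dmy = "%d-%m-%Y", mdy = "%m-%d-%Y")) {
  out <- rep(as.Date(NA), length(x))
  for (i in seq_along(x)) {
    # only attempt values shaped like NN-NN-NNNN; anything else stays NA
    if (is.na(x[i]) || !grepl("^[0-9]{2}-[0-9]{2}-[0-9]{4}$", x[i])) next
    parsed <- vapply(formats,
                     function(f) as.numeric(as.Date(x[i], format = f)),
                     numeric(1))
    candidates <- unique(parsed[!is.na(parsed)])
    if (length(candidates) == 1) {
      out[i] <- as.Date(candidates, origin = "1970-01-01")
    } else if (length(candidates) > 1) {
      warning("ambiguous date, returning NA: ", x[i])
    }
  }
  out
}

# '19-07-1983' only works as day-month-year; '02-03-1982' works both ways
res <- suppressWarnings(as_date_sketch(c("19-07-1983", "02-03-1982", "03-03-1990")))
format(res)
```

A value such as '03-03-1990' is not treated as ambiguous here because both readings give the same date.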
Create or test for objects of type "integer". This function is a wrapper of the function as.integer() and evaluates whether the object to be coerced can be interpreted as an integer. Any object among:
NA, NA_integer_, NA_Date_, (...),
Boolean, such as 0, 0L, F, FALSE, false, FaLsE, (...),
Any string such as "1", "+1", "-1", "1.0000",
will be converted to NA or an integer. Any other value will return an error.
as_any_integer(x)
x |
Object to be coerced or tested. Can be a vector. |
An integer object of the same size.
{
  library(dplyr)

  as_any_integer("1")
  as_any_integer(c("1.000","2.0","1","+12","-12"))
  try(as_any_integer('foo'))
  tibble(values = c("1.000","2.0","1","+12","-12")) %>%
    mutate(bool_values = as_any_integer(values))
}
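The same strictness can be sketched for integers: a value is accepted only if it is missing, logical, or numerically equal to a whole number. `as_integer_sketch` is a hypothetical base R helper written for illustration, not the package's code.

```r
# Hypothetical helper (not fabR's code): strict coercion to integer
as_integer_sketch <- function(x) {
  if (is.logical(x)) return(as.integer(x))
  missing <- is.na(x)
  num <- suppressWarnings(as.numeric(as.character(x)))
  # reject values that are not numbers, or numbers with a fractional part
  bad <- !missing & (is.na(num) | num != round(num))
  if (any(bad)) {
    stop("cannot be interpreted as integer: ",
         paste(unique(x[bad]), collapse = ", "))
  }
  as.integer(num)
}

as_integer_sketch(c("1.000", "2.0", "1", "+12", "-12"))
try(as_integer_sketch("1.5"))
```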
Create or test for objects of type "symbol".
as_any_symbol(x)
x |
Object to be coerced or tested. Can be a vector, a character string, a symbol. |
Object of type "symbol".
{
  as_any_symbol(coucou)
  as_any_symbol("coucou")
}
Opens a previously generated HTML bookdown site from files in the specified folder. This is a shortcut function to access 'index.html' in the specified folder.
bookdown_open(bookdown_path)
bookdown_path |
A character string identifying the folder path containing the files to open the bookdown site. |
Nothing to be returned. The function opens a web page.
bookdown_template(), bookdown_open()
{
  bookdown_path = tempdir()
  bookdown_template(bookdown_path, overwrite = TRUE)
  bookdown_render(bookdown_path, overwrite = TRUE)
  bookdown_open(bookdown_path)
}
This helper function renders an existing bookdown folder (containing at least an 'index.Rmd' file).
bookdown_render(bookdown_path, overwrite = FALSE)
bookdown_path |
A character string identifying the folder path where the bookdown report files are. |
overwrite |
whether to overwrite existing files. FALSE by default. |
A folder containing html files (in docs, ...) generated from a bookdown report.
bookdown_template(), bookdown_open()
{
  bookdown_path = tempdir()
  bookdown_template(bookdown_path, overwrite = TRUE)
  bookdown_render(bookdown_path, overwrite = TRUE)
}
This helper function creates a template for a bookdown.
bookdown_template(bookdown_path, overwrite = FALSE)
bookdown_path |
A character string identifying the folder path where the bookdown will be generated. |
overwrite |
whether to overwrite existing files. FALSE by default. |
A folder containing all files (Rmd, yml, css) to generate the bookdown.
bookdown_render(), bookdown_open()
{
  bookdown_path = tempdir()
  bookdown_template(bookdown_path, overwrite = TRUE)
}
This function crawls and aggregates roxygen documentation into a tibble format. To work properly, elements must be separated with the named fields (@title, @description, @...); each @ field will be used as a column name. The column name will also have 80 characters to show the margin limit of each chunk of documentation.
collect_roxygen(folder_r = "R")
folder_r |
A character string identifying the folder to index. If not specified, 'R/' is the default. |
A tibble where each line represents a function described in a package, and each column is a documentation field. The most common fields (title, description, details, param, see also, return and examples) are placed first.
{
  library(tidyr)
  try({tibble(collect_roxygen(tempfile()))}, silent = FALSE)
}
Direct call to the online documentation for the package, which includes a description of the latest version of the package, vignettes, user guides, and a reference list of functions and help pages.
fabR_website()
Nothing to be returned. The function opens a web page.
{ fabR_website() }
Creates a tibble listing the files in a specified folder (recursively), with file path, name and other useful metadata. This index can be used to quickly find files in the environment. The index also contains the script needed to read the files as R objects into the environment. Names for R objects are generated automatically from file names (R objects are not created at this step, but the command line is generated and stored in the column to_eval, ready to be evaluated to generate the R objects).
file_index_create(folder = getwd(), pattern = "^", negate = FALSE)
folder |
A character string identifying the folder to index. If not specified, the current folder is the default. |
pattern |
A character string defining a pattern to sub-select within folder. Can be useful for excluding certain folders from indexing (matching by regex is supported). |
negate |
logical. If TRUE, return non-matching elements. |
The user must make sure their files are in the folder to be indexed.
A tibble with folder_path, file_path, file_name, extension, file_type columns and a last column to_eval which is R code in a character vector to read the file into the environment.
## Not run:
file_index_create(tempdir())
## End(Not run)
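To make the idea of a file index concrete, here is a minimal base R sketch that lists a folder recursively and derives a few of the metadata columns named above. `index_sketch` is a hypothetical helper; it returns a plain data frame and leaves out the file_type and to_eval columns that the package's index contains.

```r
# Hypothetical helper (not fabR's code): recursive file listing with metadata
index_sketch <- function(folder) {
  paths <- list.files(folder, recursive = TRUE, full.names = TRUE)
  data.frame(
    folder_path = dirname(paths),
    file_path   = paths,
    file_name   = basename(paths),
    extension   = tools::file_ext(paths),
    stringsAsFactors = FALSE
  )
}

# build a small throwaway folder to index
demo_dir <- file.path(tempdir(), "index_demo")
dir.create(file.path(demo_dir, "sub"), recursive = TRUE, showWarnings = FALSE)
writeLines("a,b\n1,2", file.path(demo_dir, "data.csv"))
writeLines("x <- 1", file.path(demo_dir, "sub", "script.R"))

idx <- index_sketch(demo_dir)
idx[, c("file_name", "extension")]
```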
Reads all files from a file index tibble as R objects to generate in the environment, or as R scripts to be sourced. Any other file types will be opened in the browser (html files) or in the environment. If no index tibble is provided, the function creates one from the working directory. Matching by regex is supported.
file_index_read( index, file_path = "^", file_name = "^", extension = "^", file_type = "^", assign = FALSE, .envir = parent.frame() )
index |
The index (tibble) of a folder with file locations and metadata, either previously generated by file_index_create() or created from folder. |
file_path |
A character string specifying a file path to search by. Can be the full string or substring (matching by regex is supported). |
file_name |
A character string specifying a file name to search by. Can be the full string or substring (matching by regex is supported). |
extension |
A character string specifying a file extension to search by. Can be the full string or substring (matching by regex is supported). |
file_type |
A character string specifying a file type to search by. Can be the full string or substring (matching by regex is supported). |
assign |
If TRUE, the name is automatically assigned from the name of the object read. |
.envir |
The environment to use. parent.frame() by default |
For each file selected:
xlsx files will be read using the function read_excel_allsheets(),
csv files will be read using the function read_csv_any_formats(),
spss and sav files will be read using the function haven::read_spss(),
dta files will be read using the function haven::read_dta(),
sas7bdat and sas files will be read using the function haven::read_sas(),
R scripts, Rmd and md files will be read using the function readLines().
All the files read are placed in a list in which each element is named after its file.
R objects generated in the environment or R scripts. R object names are created automatically from their file names. Otherwise, returns messages indicating which objects were created or which files were opened, and whether any problems occurred.
read_excel_allsheets(), read_csv_any_formats(), haven::read_spss(), haven::read_dta(), haven::read_sas(), readLines()
## Not run:
index <- file_index_create(tempdir())
file_index_read(index, file_name = my_file_name)
## End(Not run)
Searches in file index R object (tibble) based on pattern and other query options and provides a table where all the files in a specified folder and corresponding to the query are listed (recursively). If no index tibble is provided, the function creates one from the working directory.
file_index_search( index, file_path = "^", file_name = "^", extension = "^", file_type = "^", show_tree = FALSE )
index |
The index (tibble) of a folder with file locations and metadata, either previously generated by file_index_create() or created from folder. |
file_path |
A character string specifying a file path to search by. Can be the full string or substring (matching by regex is supported). |
file_name |
A character string specifying a file name to search by. Can be the full string or substring (matching by regex is supported). |
extension |
A character string specifying a file extension to search by. Can be the full string or substring (matching by regex is supported). |
file_type |
A character string specifying a file type to search by. Can be the full string or substring (matching by regex is supported). |
show_tree |
If TRUE, return the file tree of the query. |
The function displays the tree of your files. You can enable this functionality with 'show_tree = TRUE'
A tibble with indexed information for files matching the query.
## Not run:
index <- file_index_create(tempdir())
file_index_search(index, file_name = my_file_name)
## End(Not run)
This helper function extracts the names of the columns in a tibble having NA values for all observations.
get_all_na_cols(tbl)
tbl |
R object (data frame or tibble) to be evaluated. |
A character vector indicating either that the tibble does not have empty columns, or the names of the empty columns.
{
  ##### Example 1 -------------------------------------------------------------
  # All columns have observation
  get_all_na_cols(iris)

  ##### Example 2 -------------------------------------------------------------
  # One column doesn't have any observations
  library(dplyr)
  get_all_na_cols(mutate(iris, new_col = NA))
}
This helper function extracts the row number(s) having NA values for all columns.
get_all_na_rows(tbl, id_col = NULL)
tbl |
R object (data frame or tibble) to be evaluated. |
id_col |
A character string specifying the column to ignore in identification of repeated observations. If NULL (by default), all of the columns will be taken in account for repeated observation identification. The row number will be used to identify those observations. |
A character vector indicating either that the tibble does not have empty observations, or the row numbers of the empty observations.
{
  ##### Example 1 -------------------------------------------------------------
  # All rows have observation
  get_all_na_rows(iris)

  ##### Example 2 -------------------------------------------------------------
  # One row doesn't have any observations
  library(dplyr)
  get_all_na_rows(bind_rows(iris, tibble(Species = c(NA,NA))))
  get_all_na_rows(
    tbl = bind_rows(iris, tibble(Species = c('id_151', 'id_152'))),
    id_col = 'Species')
}
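The checks behind get_all_na_cols() and get_all_na_rows() can be written in a few lines of base R. The helpers below (`all_na_cols`, `all_na_rows`) are hypothetical illustrations that return names or positions directly rather than the messages the package returns.

```r
# Hypothetical helpers (not fabR's code)
all_na_cols <- function(df) {
  # a column is empty when every one of its values is NA
  names(df)[vapply(df, function(col) all(is.na(col)), logical(1))]
}

all_na_rows <- function(df, id_col = NULL) {
  # ignore the identifier column, then keep rows with no non-NA value
  values <- df[, setdiff(names(df), id_col), drop = FALSE]
  unname(which(rowSums(!is.na(values)) == 0))
}

df <- data.frame(id = c("a", "b", "c"),
                 x  = c(1, NA, 3),
                 y  = c(NA, NA, NA))
all_na_cols(df)
all_na_rows(df, id_col = "id")
```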
This helper function extracts the names of the columns in a tibble having identical values for all observations.
get_duplicated_cols(tbl)
tbl |
R object (data frame or tibble) to be evaluated. |
A tibble indicating which columns have identical values in the tibble.
{
  library(dplyr)
  tbl <-
    mtcars %>%
    mutate(
      cyl_2 = cyl,
      cyl_3 = cyl,
      mpg_2 = mpg)

  get_duplicated_cols(tbl)
}
This helper function extracts the row number (or first column value) in a tibble having identical values for all columns. This function can be used either on the whole columns or excluding the first column (id) (which can be useful to identify repeated observation across different ids)
get_duplicated_rows(tbl, id_col = NULL)
tbl |
R object (data frame or tibble) to be evaluated. |
id_col |
A character string specifying the column to ignore in identification of repeated observations. If NULL (by default), all of the columns will be taken in account for repeated observation identification. The row number will be used to identify those observations. |
A tibble indicating which rows have identical values in the tibble.
{
  # the row numbers are returned to identify which observations have repeated
  # values
  library(dplyr)
  get_duplicated_rows(tbl = bind_rows( tbl = mtcars, mtcars[1,]))

  get_duplicated_rows(
    tbl = bind_rows(mtcars,mtcars[1,]) %>%
      add_index() %>%
      mutate(index = paste0('obs_',index)),
    id_col = 'index')
}
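The role of id_col can be seen in a small base R sketch: rows are compared on every column except the identifier, so observations repeated under different ids are still detected. `dup_rows` is a hypothetical helper that returns row positions, not the tibble produced by the package.

```r
# Hypothetical helper (not fabR's code): positions of rows that are repeated
dup_rows <- function(df, id_col = NULL) {
  values <- df[, setdiff(names(df), id_col), drop = FALSE]
  # flag every member of a duplicated set, not only the later copies
  is_dup <- duplicated(values) | duplicated(values, fromLast = TRUE)
  which(is_dup)
}

df <- rbind(mtcars, mtcars[1, ])
df$id <- paste0("obs_", seq_len(nrow(df)))

dup_rows(df)                # the unique ids hide the repetition
dup_rows(df, id_col = "id") # ignoring the id reveals rows 1 and 33
```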
Function that recursively goes through a list object and stores in a tibble the path of each element in the list. The paths can then be edited and accessed using parceval(), for example.
get_path_list(list_obj, .map_list = NULL)
list_obj |
R list object to be evaluated |
.map_list |
Not intended for the user. This parameter exists only to support the recursion. Any modification of this object returns NULL. |
A tibble containing all the paths of each element of the list and the class of each leaf (can be a list, or R objects).
{
  library(dplyr)
  get_path_list(
    list(
      tibble = iris,
      list = list(t1 = mtcars, t2 = tibble(iris)),
      char = "foo"))
}
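The recursion described for get_path_list() can be sketched as follows. `list_paths` is a hypothetical base R helper that returns the paths as a character vector (the package returns a tibble that also records the class of each leaf); data frames are treated as leaves here, which is an assumption.

```r
# Hypothetical helper (not fabR's code): build the access path of every leaf
list_paths <- function(obj, prefix = "list_obj") {
  # stop descending at non-lists, data frames and empty lists
  if (!is.list(obj) || is.data.frame(obj) || length(obj) == 0) return(prefix)
  nms <- names(obj)
  if (is.null(nms)) nms <- rep("", length(obj))
  unlist(lapply(seq_along(obj), function(i) {
    # use the name when there is one, the position otherwise
    step <- if (nzchar(nms[i])) paste0("[['", nms[i], "']]") else paste0("[[", i, "]]")
    list_paths(obj[[i]], paste0(prefix, step))
  }))
}

list_obj <- list(tibble = iris, list = list(t1 = mtcars, t2 = "x"), char = "foo")
paths <- list_paths(list_obj)
paths

# each path is valid R code, so it can be parsed and evaluated
eval(parse(text = paths[4]))
```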
This helper function extracts the names of the columns in a tibble having the same single value for all observations.
get_unique_value_cols(tbl)
tbl |
R object (data frame or tibble) to be evaluated. |
A character vector indicating either that the tibble does not have single-value columns, or the names of those columns.
{
  ##### Example 1 -------------------------------------------------------------
  # All columns have distinct observation
  get_unique_value_cols(iris)

  ##### Example 2 -------------------------------------------------------------
  # One column doesn't have distinct observations
  get_unique_value_cols(tbl = iris[1:50,])
}
This function takes a tibble and a specific column. The column is evaluated one observation after the other, and the function finally gives the best matching date format for the column as a whole. The best matching format is tested across the date formats provided by the lubridate library. Along with the format, the percentage of matching values is given in the output tibble. The information on the best matching format can be used to mutate a column using as_any_date(). The default format is yyyy-mm-dd.
guess_date_format(tbl, col = NULL)
tbl |
R object (data frame or tibble) to be evaluated. |
col |
A character string specifying a column of interest |
Contrary to the lubridate library or as.Date(), the function evaluates the column as a whole, and does not cast the column if there is ambiguity between values. For example, ('19-07-1983', '02-03-1982') implies that 02 refers to the day and 03 refers to the month, since that order is the only one that also works for the first element.
A tibble with information concerning the best matching date format, given an object to be evaluated.
lubridate::ymd(), lubridate::ydm(), lubridate::dmy(), lubridate::myd(), lubridate::mdy(), lubridate::dym(), lubridate::my(), lubridate::ym(), lubridate::as_date(), as.Date(), which_any_date(), as_any_date()
{
  library(tidyr)

  ##### Example 1 -------------------------------------------------------------
  # Non-ambiguous dates ----------------------------------------------------
  time <- tibble(time = c(
    "1983-07-19",
    "2003-01-14",
    "2010-09-29",
    "2023-12-12",
    "2009-09-03",
    "1509-11-30",
    "1809-01-01"))
  guess_date_format(time)

  ##### Example 2 -------------------------------------------------------------
  # Ambiguous dates ----------------------------------------------------
  time <- tibble(time = c(
    "1983-19-07",
    "1983-10-13",
    "2009-09-03",
    "1509-11-30"))
  guess_date_format(time)

  ##### Example 3 -------------------------------------------------------------
  # Non date format dates --------------------------------------------------
  time <- tibble(time = c(
    "198-07-19",
    "200-01-14",
    "201-09-29",
    "202-12-12",
    "2000-09-03",
    "150-11-3d0",
    "180-01-01"))
  guess_date_format(time)
}
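The column-level reasoning (the format that parses the largest share of values wins) can be sketched with base R. `guess_format_sketch` is a hypothetical helper limited to two candidate formats; it is not how the package computes its result, only an illustration of the percentage-of-matching idea.

```r
# Hypothetical helper (not fabR's code): share of values parsed per format
guess_format_sketch <- function(x, formats = c(dmy = "%d-%m-%Y", mdy = "%m-%d-%Y")) {
  pct <- vapply(formats,
                function(f) 100 * mean(!is.na(as.Date(x, format = f))),
                numeric(1))
  res <- data.frame(format = names(pct), pct_match = unname(pct),
                    stringsAsFactors = FALSE)
  # best matching format first
  res[order(-res$pct_match), ]
}

# '19' cannot be a month, so only day-month-year fits the whole column
res <- guess_format_sketch(c("19-07-1983", "02-03-1982"))
res
```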
Generates a name for an element in a list. This function is targeted at function creators whose functions handle lists. Those lists may need names in order to go through each element. This function can work with stats::setNames() and allows the user to provide shorter, more user-friendly names in their lists.
make_name_list(args_list, list_elem)
args_list |
A list of character strings of the same length as list_elem. |
list_elem |
A list of character strings of the same length as args_list. |
A character string simplified to be used as names in a list.
{
  library(tidyr)
  library(stats)

  #### Example 1 --------------------------------------------------------------
  # make_name_list generates names that are informative through a line of code
  # or function. tibble(iris), iris %>% tibble and
  # list(iris = tibble(mytibble) %>% select(Species)) will have 'iris' as name.

  list(tibble(iris), tibble(mtcars)) %>%
    setNames(make_name_list(list(tibble(iris), tibble(mtcars)), args_list =
      c("IRIS %>% complicated_code","complicated_function(MTCARS)")))

  #### Example 2 --------------------------------------------------------------
  # make_name_list can be used when a function uses arguments provided by the
  # user to generate a list. The name is simplified and given to the list
  # itself

  library(dplyr)
  my_function <- function(df){

    .fargs <- as.list(match.call(expand.dots = TRUE))
    list_df <-
      list(df) %>%
      setNames(.,make_name_list(as.character(.fargs['df']),list(df)))
    return(list_df)}

  my_function(tibble(iris))
  my_function(iris %>% tibble %>% select(Species))
}
Shortcut that shows the user a prompt with a message to be read and validated before the process continues. This function is targeted at function creators whose functions require user interaction.
message_on_prompt(...)
... |
Character string to put in a message. |
Nothing to be returned. The function sends a message as a prompt in the console.
{ message_on_prompt("Do you want to continue? Press `enter` or `esc`") }
Shortcut to parse() and eval(): evaluates an R expression stored in a character string, turning it into actual R code. This function is targeted at: interaction with external files (where the expression is stored in text format); tidy elements where the code expression is generated using dplyr::mutate() combined with paste0(); for, while, map, etc. loops where a character string expression can be indexed or iteratively generated and evaluated; and objects to be created (using assign, <- or <<-) where the name of the R object is stored in a string. Some issues may occur when parceval is used in a different environment, such as inside a function. In that case, prefer eval(parse(text = ...)) instead.
parceval(...)
... |
Character string to be parsed and evaluated. |
Any output generated by the evaluation of the character string.
{
  ##### Example 1 -------------------------------------------------------------
  # Simple assignation will assign 'b' in parceval environment (which is
  # associated to a function and different from .GlobalEnv, by definition).
  # Double assignation will put 'b' in .GlobalEnv.
  # (similar to assign(x = "b",value = 1,envir = .GlobalEnv))

  a <- 1
  parceval("print(a)")

  ##### Example 2 -------------------------------------------------------------
  # use rowwise to directly use parceval in a tibble, or use a for loop.
  library(dplyr)
  library(tidyr)

  tibble(cars) %>%
    mutate(
      to_eval = paste0(speed,"/",dist)) %>%
    rowwise() %>%
    mutate(
      eval = parceval(to_eval))

  ##### Example 3 -------------------------------------------------------------
  # parceval can be parcevaled itself!

  code_R <-
    'as_tibble(cars) %>%
    mutate(
      to_eval = paste0(speed,"/",dist)) %>%
    rowwise() %>%
    mutate(
      eval = parceval(to_eval))'

  cat(code_R)
  parceval(code_R)
}
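The environment caveat mentioned for parceval() comes down to where the parsed expression is evaluated. The sketch below uses only base R; `run_string` is a hypothetical wrapper, not the package's function, and it evaluates the string in the caller's environment so that local variables are visible.

```r
# Hypothetical wrapper (not fabR's code): parse a string and evaluate it
# in the environment of whoever called the wrapper.
run_string <- function(code, envir = parent.frame()) {
  eval(parse(text = code), envir = envir)
}

# inside a function, the string can see the function's local variables
f <- function() {
  local_value <- 10
  run_string("local_value * 2")
}
f()

# a plain assignment lands in the calling environment
run_string("created <- 5")
created
```

Passing the environment explicitly is one way to avoid the surprises that arise when the evaluation happens inside the wrapper's own frame.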
The csv file is read twice to detect the number of lines to use in attributing the column type ('guess_max' parameter of read_csv). This avoids common errors when reading csv files.
read_csv_any_formats(filename)
filename |
A character string of the path of the csv file. |
A tibble corresponding to the csv read.
readr::read_csv(), readr::read_csv2()
{ try(read_csv_any_formats(filename = tempfile()),silent = TRUE) }
Reads an Excel file recursively using readxl::read_excel(). The Excel file is read and the values are placed in a list of tibbles, with each sheet in a separate element in the list. If the Excel file has only one sheet, the output is a single tibble.
read_excel_allsheets(filename, sheets = "", keep_as_list = FALSE)
filename |
A character string of the path of the Excel file. |
sheets |
A vector containing only the sheets to be read. |
keep_as_list |
A Boolean indicating whether the output should be kept as a list rather than a single tibble when only one sheet is provided. FALSE by default. |
A list of tibbles corresponding to the sheets read, or a single tibble if the number of sheets is one.
{ try(read_excel_allsheets(filename = tempfile()), silent = TRUE) }
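Reading every sheet into a named list can be sketched with readxl as follows. This is a minimal illustration, not the implementation of read_excel_allsheets(); it assumes the example workbook "datasets.xlsx" shipped with readxl, and the variable names are invented.

```r
library(readxl)

# Example workbook distributed with readxl (several sheets).
path <- readxl_example("datasets.xlsx")

# List the sheet names, then read each sheet into its own tibble.
sheet_names <- excel_sheets(path)
all_sheets <- lapply(sheet_names, function(sheet) {
  read_excel(path, sheet = sheet)
})
names(all_sheets) <- sheet_names

# With a single sheet, the list can be collapsed to that one tibble.
if (length(all_sheets) == 1) {
  all_sheets <- all_sheets[[1]]
}
```

Naming the list elements after the sheets keeps the link between each tibble and its origin in the workbook.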
Shortcut that spares the user messages and warnings and prevents an
error from stopping execution. The usage is very similar to
suppressWarnings(). This function is intended for function authors who
want to improve the user experience.
silently_run(...)
... |
R code |
The output of the R code, unless that output is a message, a warning or an error, in which case nothing is returned.
invisible(), suppressWarnings(), suppressMessages()
{

as.integer("text")
silently_run(as.integer("text"))

}
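A comparable effect can be obtained with base R alone by combining try(), suppressWarnings() and suppressMessages(). The helper below, `run_quietly()`, is a hypothetical name and an assumption about one possible approach; it is not the code of silently_run() and its return values may differ from the package's.

```r
# Hypothetical helper: evaluate an expression without messages or warnings,
# and without letting an error stop execution.
run_quietly <- function(expr) {
  # 'expr' is evaluated lazily, inside try() and the two suppress*() calls.
  result <- suppressWarnings(suppressMessages(try(expr, silent = TRUE)))

  # On error, try() returns an object of class "try-error": return nothing.
  if (inherits(result, "try-error")) {
    return(invisible(NULL))
  }
  result
}

coerced <- run_quietly(as.integer("text"))  # no warning is shown
failed  <- run_quietly(stop("boom"))        # no error interrupts the script
```

In this sketch a warning does not discard the value (`coerced` is NA), while an error yields NULL.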
This function takes a character string or a vector. The vector is evaluated
one element after the other, and the function gives the best matching date
format for each of them (independently). The best matching format is tested
across the formats listed in the 'format' argument, which are provided by the
lubridate library. The information on the best matching format can be used to
mutate a column using as_any_date().
which_any_date( x, format = c("ymd", "ydm", "dmy", "myd", "mdy", "dym", "my", "ym", "as_date") )
x |
object to be coerced. Can be a character string or a vector. |
format |
A character identifying the format to apply to the object to test. That format can be 'ymd', 'ydm', 'dmy', 'myd', 'mdy', 'dym', 'my', 'ym' or 'as_date', in that specific order ('ymd' will be chosen as the default format, then 'ydm', etc.). |
Contrary to the lubridate library or as.Date(), the function evaluates
the different possibilities for a date. For example, c('02-03-1982') can be
either March the 2nd or February the 3rd. The function will provide
"mdy, dmy" as possible formats. If no format is found, the function returns
NA.
A character vector of the possible date formats for each element tested. The length of the vector is the length of the input object.
lubridate::ymd(), lubridate::ydm(), lubridate::dmy(),
lubridate::myd(), lubridate::mdy(), lubridate::dym(),
lubridate::my(), lubridate::ym(),
lubridate::as_date(), as.Date(),
guess_date_format(), as_any_date()
{

time <- c(
  "1983-07-19", "31 jan 2017", "1988/12/17", "31-02-05", "02-02-02",
  "2017 october the 2nd", "02-07-2012", "19-07-83", "19-19-19")

which_any_date(time)

}
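The idea of testing one value against several lubridate parsers and reporting every parser that succeeds can be sketched as follows. `matching_formats()` is a hypothetical helper restricted to three formats for brevity; it is an assumption about the general approach, not the implementation of which_any_date(), and the order of the formats it reports may differ from the package's.

```r
library(lubridate)

# Hypothetical helper: return the names of the parsers that can read 'x'
# (a single character string), or NA when none of them can.
matching_formats <- function(x) {
  parsers <- list(ymd = ymd, dmy = dmy, mdy = mdy)

  # quiet = TRUE silences the "failed to parse" warnings of lubridate.
  parsed_ok <- vapply(
    parsers,
    function(parse_fun) !is.na(parse_fun(x, quiet = TRUE)),
    logical(1))

  if (!any(parsed_ok)) {
    return(NA_character_)
  }
  paste(names(parsers)[parsed_ok], collapse = ", ")
}

# Each element is evaluated independently of the others.
dates <- c("1983-07-19", "02-03-1982", "not a date")
formats_found <- unname(vapply(dates, matching_formats, character(1)))
```

An ambiguous value such as "02-03-1982" is reported with both the 'dmy' and 'mdy' formats, and a value that no parser can read yields NA.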
writexl::write_xlsx() recursively
The R objects are read and the values are placed in separate sheets. This function is inspired by the function proposed in https://statmethods.wordpress.com/2014/06/19/quickly-export-multiple-r-objects-to-an-excel-workbook/
write_excel_allsheets(list, filename)
list |
R objects, comma separated. |
filename |
A character string of the path of the Excel file. |
Nothing is returned. The file is created at the path given in 'filename'.
{ unlink( write_excel_allsheets( list = list(iris = iris, mtcars = mtcars), filename = tempfile())) }
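Writing several R objects to separate sheets can be sketched with writexl directly: write_xlsx() accepts a named list of data frames and uses the names as sheet names. The block below is an illustration only, not the code of write_excel_allsheets(); the temporary path and variable names are assumptions made for the example.

```r
library(writexl)

# Temporary destination for the workbook (illustrative path).
path <- tempfile(fileext = ".xlsx")

# A named list of data frames: each element becomes one sheet.
objects_to_write <- list(iris = iris, mtcars = mtcars)
write_xlsx(objects_to_write, path)

# Record that the file exists before cleaning it up.
file_created <- file.exists(path)
unlink(path)
```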