Processes A 'chapter_overview' Data Frame

Usage

refine_chapter_overview(
  chapter_overview = NULL,
  data = NULL,
  chunk_templates = NULL,
  label_separator = " - ",
  name_separator = NULL,
  single_y_bivariates_if_indep_cats_above = 3,
  single_y_bivariates_if_deps_above = 20,
  always_show_bi_for_indep = NULL,
  hide_bi_entry_if_sig_above = 1,
  hide_chunk_if_n_below = 10,
  hide_variable_if_all_na = TRUE,
  keep_dep_indep_if_no_overlap = FALSE,
  organize_by = c("chapter", ".variable_label_prefix_dep", ".variable_name_indep",
    ".template_name"),
  arrange_section_by = c(chapter = FALSE, .variable_name_dep = FALSE,
    .variable_name_indep = FALSE, .template_name = FALSE),
  na_first_in_section = TRUE,
  max_width_obj = 128,
  max_width_chunk = 128,
  max_width_file = 64,
  max_width_folder_name = 12,
  sep_obj = "_",
  sep_chunk = "-",
  sep_file = "-",
  ...,
  progress = TRUE,
  variable_group_dep = ".variable_group_dep",
  variable_group_prefix = NULL,
  n_range_glue_template_1 = "{n}",
  n_range_glue_template_2 = "[{n[1]}-{n[2]}]",
  log_file = NULL
)

Arguments

chapter_overview

What goes into each chapter and sub-chapter

obj:<data.frame>|obj:<tbl_df> // Required

Data frame (or tibble, possibly grouped). One row per chapter. Should contain the columns 'chapter' and 'dep', and optionally 'indep' (independent variables) and other informative columns as needed.
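As a rough illustration of the shape described above, the following base R sketch builds a small chapter overview by hand. The chapter titles and the variable selections in dep and indep are invented for this example; it is assumed here that dep and indep hold strings referring to columns in data.

```r
# Hypothetical chapter overview: one row per chapter.
# Chapter titles and variable names are illustrative assumptions only.
ch_overview <- data.frame(
  chapter = c("2 Ambivalence", "3 Big mysteries"),
  dep     = c("a_1:a_6", "b_1:b_3"),
  indep   = c("x1_sex", "x1_sex"),
  stringsAsFactors = FALSE
)

# The two columns the description lists as expected
stopifnot(all(c("chapter", "dep") %in% names(ch_overview)))
print(ch_overview)
```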

data

Survey data

obj:<data.frame>|obj:<tbl_df>|obj:<srvyr> // Required

A data frame (or a srvyr-object) containing the columns referred to in the 'dep', 'indep', etc. columns of the chapter overview.

chunk_templates

Chunk templates

obj:<data.frame>|obj:<tbl_df>|NULL // default: NULL (optional)

Must contain the columns name (user-specified unique name for the template), template (the chunk template as a {glue}-specification), variable_type_dep and optionally variable_type_indep. The latter two are list-columns of prototype vectors specifying which data the template will be applied to. Can optionally contain columns whose names match the default options for the function. These will then override the default function-wide options for the specific template.
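A minimal sketch of such a data frame, using base R only. The template names, the template text and the {.variable_name_dep} placeholder are assumptions made for illustration, not templates shipped with the package.

```r
# Hypothetical chunk templates; names and template text are made up.
chunk_templates <- data.frame(
  name = c("cat_plot", "chr_table"),
  template = c(
    "Plot for {.variable_name_dep}",   # placeholder name is an assumption
    "Table for {.variable_name_dep}"
  ),
  stringsAsFactors = FALSE
)

# List-column of prototype vectors: one zero-length prototype per template,
# indicating the kind of dependent variable each template applies to.
chunk_templates$variable_type_dep <- list(factor(), character())

# Inspect the class of each prototype
print(sapply(chunk_templates$variable_type_dep, class))
```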

label_separator

Variable label separator

scalar<character> // default: " - " (optional)

String on which to split variable labels into main question and sub-item.
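The idea can be sketched in base R as follows. The label is invented, and the package's own handling (for instance of labels containing the separator more than once) may differ from this simple split.

```r
# Hypothetical variable label with a main question and a sub-item
label <- "Satisfaction with services - Public transport"
label_separator <- " - "

# fixed = TRUE treats the separator as a literal string, not a regex
parts <- strsplit(label, split = label_separator, fixed = TRUE)[[1]]
main_question <- parts[1]
sub_item <- parts[2]

print(main_question)
print(sub_item)
```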

name_separator

Variable name separator

scalar<character> // default: NULL (optional)

String on which to split column names in data into main question and sub-item.

single_y_bivariates_if_indep_cats_above

Single y bivariates if indep-cats above ...

scalar<integer> // default: 3 (optional)

Figures and tables for bivariates can become very long if the independent variable has many categories. This argument specifies the number of indep categories above which only single y bivariates should be shown.

single_y_bivariates_if_deps_above

Single y bivariates if dep-vars above ...

scalar<integer> // default: 20 (optional)

Figures and tables for bivariates can become very long if there are many dependent variables in a battery/question matrix. This argument specifies the number of dep variables above which only single y bivariates should be shown. Set to 0 to always show single y bivariates.

always_show_bi_for_indep

Always show bivariate for indep-variable

vector<character> // default: NULL (optional)

Specific combinations with a by-variable where bivariates should always be shown.

hide_bi_entry_if_sig_above

p-value threshold for hiding bivariate entry

scalar<double> // default: 1 (optional)

A bivariate entry is hidden if its p-value is above this threshold. The default of 1 shows all entries.

hide_chunk_if_n_below

Hide result if N below

scalar<integer> // default: 10 (optional)

A result is hidden if N for a given dataset is below this value. NOTE: Exceptions are made for chr_table and chr_plot, as these are typically exempted in the first place. This might change in the future with a separate argument.

hide_variable_if_all_na

Hide variable from outputs if containing all NA

scalar<boolean> // default: TRUE (optional)

Whether to remove variables if all values are NA.

keep_dep_indep_if_no_overlap

Keep dep-indep if no overlap

scalar<boolean> // default: FALSE (optional)

Whether to keep dep-indep rows if there is no overlap.

organize_by

Grouping columns

vector<character> // default: c("chapter", ".variable_label_prefix_dep", ".variable_name_indep", ".template_name") (optional)

Column names used for identifying chapters and sections.

arrange_section_by

Sorting columns

vector<character> or named vector<logical> // default: c(chapter = FALSE, .variable_name_dep = FALSE, .variable_name_indep = FALSE, .template_name = FALSE) (optional)

Column names used for sorting sections within each organize_by group. If a character vector, all columns are assumed to be arranged in ascending order. If a named logical vector, FALSE indicates ascending and TRUE descending. Defaults to sorting in ascending order (alphabetical) for commonly needed variable name/label info, and in descending order for chunk_templates as one typically wants univariates before bivariates.
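To make the FALSE/TRUE convention concrete, here is a base R sketch of how a named logical vector could be turned into a sort order. The small data frame is invented, and this is not the package's internal implementation.

```r
# Hypothetical sections to sort
sections <- data.frame(
  chapter = c("B", "A", "A"),
  .template_name = c("x", "y", "x"),
  stringsAsFactors = FALSE
)

# FALSE = ascending, TRUE = descending
arrange_section_by <- c(chapter = FALSE, .template_name = TRUE)

# Build one numeric sort key per column; negate the rank for descending
keys <- lapply(names(arrange_section_by), function(col) {
  r <- rank(sections[[col]], ties.method = "min")
  if (arrange_section_by[[col]]) -r else r
})

sorted <- sections[do.call(order, keys), ]
print(sorted)
```

A plain character vector such as c("chapter", ".template_name") would, per the description above, correspond to all entries being FALSE.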

na_first_in_section

Whether to place NAs first when sorting

scalar<logical> // default: TRUE (optional)

Default ascending and descending sorting with dplyr::arrange() is to place NAs at the end. This would have placed univariates at the end, etc. Thus, saros places NAs first in the section. Set this to FALSE to override.
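The difference in NA placement can be illustrated with base R's order(), which by default also places NAs last. This is only an analogy to the behaviour described above, not the package's code.

```r
x <- c("b", NA, "a")

# Default: NAs are placed at the end
nas_last <- x[order(x)]

# na.last = FALSE places NAs first instead
nas_first <- x[order(x, na.last = FALSE)]

print(nas_last)
print(nas_first)
```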

max_width_obj, max_width_chunk, max_width_file

Maximum object width

scalar<integer> // default: 128, 128 and 64 (optional)

Maximum width for names of objects (in the R/Python environment), chunks (#| label: ) and optional files. Note that variable labels are always replaced with variable names, to avoid very long file names. Note for file names: OneDrive has a maximum path length of about 400 characters, which can quickly be exceeded with a long base path, long file names (if labels are used as part of the structure), and the hashing done by Quarto's cache: true feature. Consider setting max_width_file lower than you would otherwise have preferred.
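A back-of-the-envelope check of the path length concern, in base R. The base path and file stem are invented, and plain substr() truncation is an assumption; the package may shorten names differently.

```r
max_width_file <- 64

# Hypothetical, overly long file stem (124 characters)
file_stem <- paste(rep("very_long_variable_label", 5), collapse = "-")

# Naive truncation to the maximum width
short_stem <- substr(file_stem, 1, max_width_file)

# Hypothetical base path inside a synced folder
base_path <- "C:/Users/me/OneDrive/reports/2024/chapter"
full_path <- file.path(base_path, paste0(short_stem, ".qmd"))

# Total path length to compare against the storage provider's limit
print(nchar(full_path))
```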

max_width_folder_name

Maximum clean folder name length

scalar<integer> // default: 12 (optional)

Whereas max_width_file truncates the file name, this argument truncates the folder name. It does not affect the report or chapter names on the website, only the folders.

sep_obj, sep_chunk, sep_file

Separator string

scalar<character> // default: "_", "-" and "-" (optional)

Separator to use between grouping variables. Defaults to underscore for object names and hyphen for chunk labels and file names.

...

Dynamic dots

<dynamic-dots>

Arguments forwarded to the corresponding functions that create the elements.

progress

Whether to display progress message

scalar<logical> // default: TRUE

Mostly useful when hide_bi_entry_if_sig_above < 1.

variable_group_dep

Name for the variable_group_dep column

scalar<string> // default: ".variable_group_dep"

This column is used to group variables that are part of the same bivariate analysis.

variable_group_prefix

Set a prefix to more easily find it in your labels

scalar<string> // default: NULL

By default, the .variable_group column is just integers. If you wish to use this as part of your object/label/filename numbering scheme, a number by itself will not be very informative. Hence you could set a prefix such as "Group" to distinguish this column from other columns in the chapter_structure.

n_range_glue_template_1, n_range_glue_template_2

scalar<string> // default: "{n}" and "[{n[1]}-{n[2]}]" (optional)

Glue templates for the n_range columns to be created.
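To show what the two default templates evaluate to, the sketch below renders them with the glue package (skipped if glue is not installed). It assumes that n is a numeric vector of length one for the first template and of length two (minimum and maximum) for the second; when the package uses which template is not shown here.

```r
if (requireNamespace("glue", quietly = TRUE)) {
  # Single N: first template
  n <- 250
  n_single <- as.character(glue::glue("{n}"))

  # Range of N: second template, indexing the two elements of n
  n <- c(12, 340)
  n_range <- as.character(glue::glue("[{n[1]}-{n[2]}]"))

  print(n_single)
  print(n_range)
}
```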

log_file

Path to log file

scalar<string> // default: NULL (optional)

Path to log file. Set to NULL to disable logging.

Value

Grouped tibble.

Examples

ref_df <- refine_chapter_overview(chapter_overview = ex_survey_ch_overview)
#> `chunk_templates` is NULL. Using global defaults.
#> Refining chapter_overview into a chapter_structure ...
# \donttest{
ref_df2 <- refine_chapter_overview(
  chapter_overview = ex_survey_ch_overview,
  data = ex_survey, hide_bi_entry_if_sig_above = .05
)
#> `chunk_templates` is NULL. Using global defaults.
#> Refining chapter_overview into a chapter_structure ...
#> Removing bivariate occurences if `hide_bi_entry_if_sig_above`: `0.05`, except .
#> Refining chapter_overview into a chapter_structure ...

#> Not using the following variables in `data`: `x1_sex`, `x2_human`, `a_7`,
#> `a_8`, `c_1`, `c_2`, `f_uni`, and `resp_status`.
# }