Creating gt tables

library(getACS)
library(dplyr)
library(gt)

One feature of the {getACS} package is support for building tables with the {gt} package. To demonstrate, we need data for a few different tables from the American Community Survey:

acs_data <- get_acs_tables(
  geography = "county",
  state = "MD",
  table = c("B01003", "B15003", "B25003"),
  quiet = TRUE
)

To start, we can use filter_acs() to filter one or more tables from the ACS data frame:

pop_tbl_data <- acs_data |>
  filter_acs(
    table = "B01003"
  ) |>
  slice_max(estimate, n = 5)

Then, we can use select_acs() to select the name, column title, estimate, and percent estimate columns. In this example, setting .column_title_col and .perc_prefix to NULL drops the column title and percent estimate columns from the data frame:

pop_tbl_data <- pop_tbl_data |>
  select_acs(
    .value_col = "estimate",
    .name_col = "NAME",
    .column_title_col = NULL,
    .perc_prefix = NULL
  )
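To make the effect of the NULL arguments concrete, here is a minimal sketch of the same idea using dplyr::select(). The helper select_cols() and the toy data frame are hypothetical stand-ins, not part of {getACS}; the two estimates are copied from the population table in this article, and the variable code is only there to give the helper a column to drop.

```r
library(dplyr)

# Toy data in the usual ACS layout (illustrative only).
toy_pop <- tibble(
  NAME = c("Montgomery County, Maryland", "Baltimore city, Maryland"),
  variable = "B01003_001",
  column_title = "Total",
  estimate = c(1056910, 584548)
)

# Hypothetical helper: any column argument set to NULL is left out.
select_cols <- function(data,
                        name_col = "NAME",
                        column_title_col = "column_title",
                        value_col = "estimate") {
  # c() silently discards NULL entries, so NULL arguments drop out here.
  keep <- c(name_col, column_title_col, value_col)
  select(data, any_of(keep))
}

toy_selected <- select_cols(toy_pop, column_title_col = NULL)
names(toy_selected)
```

Columns that are not named in any argument, such as variable here, are dropped as well.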

The main table-building function is gt_acs(), a wrapper for gt::gt(), gt::cols_label(), gt::cols_merge_uncert(), and other {gt} functions. Because ACS data has a predictable structure, gt_acs() can merge estimate and margin of error columns, format estimate and percent estimate columns, and add a source note that credits the survey and table:

pop_tbl_data |>
  gt_acs(
    table = "B01003",
    value_label = "Population",
    name_label = "County",
    perc = FALSE
  ) |>
  fmt_acs_county(
    state = "Maryland",
    pattern = "(County|), {state}"
  )

County            Population
Montgomery        1,056,910
Prince George's   957,189
Baltimore         850,737
Anne Arundel      588,109
Baltimore city    584,548
Source: 2018-2022 ACS 5-year Estimates, Table B01003.
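
For readers who want to see what that kind of wrapping involves, the sketch below builds a comparable table with plain {gt} calls. It is an approximation under stated assumptions, not the internals of gt_acs(): the two rows are copied from the Baltimore city B15003 output shown later in this article, and the column names estimate and moe are assumed to follow the usual ACS data frame layout.

```r
library(gt)

# Illustrative rows only; values copied from the B15003 table in this article.
edu_rows <- data.frame(
  column_title = c("Bachelor's degree", "Master's degree"),
  estimate = c(72434, 47683),
  moe = c(2280, 1882)
)

edu_sketch <- edu_rows |>
  gt() |>
  # Format both numeric columns with thousands separators and no decimals.
  fmt_number(columns = c(estimate, moe), decimals = 0) |>
  # Merge the margin of error into the estimate column as "value ± uncertainty".
  cols_merge_uncert(col_val = estimate, col_uncert = moe) |>
  cols_label(column_title = "Education", estimate = "Estimate") |>
  tab_source_note("Source: 2018-2022 ACS 5-year Estimates, Table B15003.")
```

Printing edu_sketch renders the table; gt_acs() saves you from repeating these steps for every table.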

Additional helpers support common formatting tasks for American Community Survey data. For example, fmt_acs_county(), used above, strips the state name and the comma before it from the name column; with the pattern shown, it also drops the word "County".
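
The same cleanup can be sketched on a plain character vector with base R. The function strip_county_state() is a hypothetical stand-in written for this illustration; it is not how fmt_acs_county() is implemented, and it assumes the state name contains no regular expression metacharacters.

```r
county_names <- c(
  "Montgomery County, Maryland",
  "Prince George's County, Maryland",
  "Baltimore city, Maryland"
)

# Hypothetical helper: remove an optional " County" plus ", <state>" at the end.
strip_county_state <- function(x, state) {
  pattern <- paste0("( County)?, ", state, "$")
  sub(pattern, "", x)
}

short_names <- strip_county_state(county_names, state = "Maryland")
short_names
```

Note that "Baltimore city" keeps its "city" suffix, which distinguishes it from Baltimore County in the table above.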

Many helper functions are built around tidyverse functions. For example, additional arguments passed to filter_acs() are forwarded to dplyr::filter(), so subsetting data by indent, line_number, or other attributes is straightforward:

edu_tbl_data <- acs_data |>
  filter_acs(
    table = "B15003",
    indent > 0,
    line_number > 16,
    NAME == "Baltimore city, Maryland"
  )
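
The forwarding pattern can be sketched with a small wrapper around dplyr::filter(). Both filter_table() and the toy data are hypothetical and only illustrate how extra conditions travel through `...`; the indent values in particular are made up for the example.

```r
library(dplyr)

# Toy rows in the usual ACS layout (illustrative only).
toy_acs <- tibble(
  table_id = c("B15003", "B15003", "B15003", "B25003"),
  line_number = c(1, 17, 18, 1),
  indent = c(0, 1, 1, 0),
  column_title = c(
    "Total",
    "Regular high school diploma",
    "GED or alternative credential",
    "Total"
  )
)

# Hypothetical wrapper: subset by table ID, then forward any extra conditions.
filter_table <- function(data, tables, ...) {
  data |>
    filter(table_id %in% tables) |>
    filter(...)
}

toy_result <- filter_table(
  toy_acs,
  tables = "B15003",
  indent > 0,
  line_number > 16
)
```

Because the conditions are evaluated by dplyr::filter(), they can refer to any column in the data frame without quoting.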

Similarly, gt_acs() returns a gt_tbl object, so it can be combined with other {gt} functions to add headers or customize the table in other ways:

edu_tbl_data |>
  select_acs(.name_col = NULL) |>
  gt_acs(
    table = edu_tbl_data$table_id,
    column_title_label = "Education",
    value_label = "Estimate",
    perc_value_label = "% of total"
  ) |>
  tab_header(
    edu_tbl_data$table_title[[1]],
    edu_tbl_data$NAME[[1]]
  )

Educational Attainment for the Population 25 Years and Over
Baltimore city, Maryland

Education                                  Estimate         % of total
Regular high school diploma                95,744 ± 2,517   23% ± 1%
GED or alternative credential              19,797 ± 1,141   5% ± 0%
Some college, less than 1 year             24,907 ± 1,545   6% ± 0%
Some college, 1 or more years, no degree   52,364 ± 2,404   13% ± 1%
Associate's degree                         21,160 ± 1,233   5% ± 0%
Bachelor's degree                          72,434 ± 2,280   18% ± 1%
Master's degree                            47,683 ± 1,882   12% ± 0%
Professional school degree                 13,069 ± 764     3% ± 0%
Doctorate degree                           9,988 ± 829      2% ± 0%
Source: 2018-2022 ACS 5-year Estimates, Table B15003.

This flexibility makes it easy to quickly produce useful tables:

tenure_tbl_data <- acs_data |>
  filter_acs(
    table = "B25003",
    NAME == "Baltimore city, Maryland"
  )

tenure_tbl_data |>
  select_acs(.name_col = NULL) |>
  gt_acs(
    rowname_col = "column_title",
    value_label = "Units",
    # Only unique table values are used to append to the source note
    table = tenure_tbl_data$table_id
  ) |>
  tab_header(
    title = unique(tenure_tbl_data$table_title),
    subtitle = "Baltimore City, Maryland"
  )

Tenure
Baltimore City, Maryland

                  Units             % share
Total             247,232 ± 1,271   100% ± 1%
Owner occupied    118,072 ± 1,945   48% ± 1%
Renter occupied   129,160 ± 2,300   52% ± 1%
Source: 2018-2022 ACS 5-year Estimates, Table B25003.

Helpers such as pivot_acs_wider() are also useful on their own, without gt_acs_compare():

geo_comparison_data <- get_acs_geographies(
  geography = c(
    "county",
    "metropolitan statistical area/micropolitan statistical area",
    "state"
  ),
  county = "Baltimore city",
  state = "MD",
  msa = "Baltimore-Columbia-Towson, MD Metro Area",
  table = "B25105",
  quiet = TRUE
)

geo_comparison_data |>
  select_acs(
    .name_col = "NAME",
    .perc = FALSE
  ) |>
  pivot_acs_wider() |>
  gt_acs(
    value_label = NULL,
    column_title_label = "",
    currency_value = TRUE
  ) |>
  cols_label_with(
    fn = function(x) {
      stringr::str_remove(x, "^estimate_")
    }
  )

Baltimore city, Maryland   Baltimore-Columbia-Towson, MD Metro Area   Maryland
$1,275 ± $13               $1,611 ± $8                                $1,708 ± $6
Source: 2018-2022 ACS 5-year Estimates.
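
The reshaping step can be sketched with tidyr::pivot_wider(). This assumes pivot_acs_wider() performs a comparable pivot, which the estimate_ prefix removed in the example above suggests but the article does not state. The data frame is a toy built from the values shown in the table.

```r
library(tidyr)

# Values copied from the B25105 output above (illustrative data frame).
housing_costs <- data.frame(
  NAME = c(
    "Baltimore city, Maryland",
    "Baltimore-Columbia-Towson, MD Metro Area",
    "Maryland"
  ),
  column_title = "Median monthly housing costs",
  estimate = c(1275, 1611, 1708),
  moe = c(13, 8, 6)
)

# One row per column_title, one estimate and one moe column per geography.
housing_wide <- pivot_wider(
  housing_costs,
  id_cols = column_title,
  names_from = NAME,
  values_from = c(estimate, moe)
)

names(housing_wide)
```

With two value columns, pivot_wider() prefixes each new column with the value name, for example estimate_Maryland and moe_Maryland, which is why the example above strips "^estimate_" from the labels.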

The gt_acs_compare() function calls pivot_acs_wider() and handles renaming the column titles for a similar result:

geo_comparison_data |>
  gt_acs_compare(
    id_cols = "column_title",
    column_title_label = "",
    currency_value = TRUE
  )

                               Baltimore city, Maryland   Baltimore-Columbia-Towson, MD Metro Area   Maryland
                               Est.                       Est.                                       Est.
Median monthly housing costs   $1,275 ± $13               $1,611 ± $8                                $1,708 ± $6
Source: 2018-2022 ACS 5-year Estimates.

Note that, if you encounter warning or error messages, you may need to set a value for id_cols to resolve the issue.
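
One way to see why id_cols matters is to pivot a toy data frame with tidyr::pivot_wider() directly. This is a sketch that assumes the comparison helpers pivot in a comparable way; it does not reproduce any specific {getACS} warning. The estimates are copied from the table above, and GEOID stands in for any column that differs for every geography.

```r
library(tidyr)

housing_costs <- data.frame(
  GEOID = c("24510", "12580", "24"),
  NAME = c(
    "Baltimore city, Maryland",
    "Baltimore-Columbia-Towson, MD Metro Area",
    "Maryland"
  ),
  column_title = "Median monthly housing costs",
  estimate = c(1275, 1611, 1708)
)

# Default id_cols: every column not used for names or values, including GEOID,
# so each geography stays on its own row and the other cells are NA.
default_wide <- pivot_wider(
  housing_costs,
  names_from = NAME,
  values_from = estimate
)

# Explicit id_cols: rows are identified by column_title alone.
explicit_wide <- pivot_wider(
  housing_costs,
  id_cols = column_title,
  names_from = NAME,
  values_from = estimate
)

nrow(default_wide)
nrow(explicit_wide)
```

The default pivot keeps three sparse rows, while the explicit id_cols collapses them into the single comparison row that the table needs.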