One feature of the {getACS} package is support for building tables with the {gt} package. To demonstrate, we need data for a few different tables from the American Community Survey:
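# Attach both packages: the examples call {gt} functions without a prefix
library(getACS)
library(gt)

acs_data <- get_acs_tables(
  geography = "county",
  state = "MD",
  table = c("B01003", "B15003", "B25003"),
  quiet = TRUE
)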
To start, we can use filter_acs() to filter one or more tables from the ACS data frame:
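A minimal sketch of this step, assuming filter_acs() accepts the same table argument it takes in the later examples and that the total population table (B01003) is the one wanted here:

```r
# Assumes library(getACS) is attached and `acs_data` was created by
# get_acs_tables() as shown earlier.
# Keep only the rows for table B01003 (total population).
pop_tbl_data <- acs_data |>
  filter_acs(table = "B01003")
```

Any further sorting or row selection is left out because it cannot be inferred from the surrounding text.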
Then, we can use select_acs() to select the estimate, percent estimate, name, and column title columns. In this example, setting .column_title_col and .perc_prefix to NULL drops the column title and percent estimate columns from the data frame:
pop_tbl_data <- pop_tbl_data |>
  select_acs(
    .value_col = "estimate",
    .name_col = "NAME",
    .column_title_col = NULL,
    .perc_prefix = NULL
  )
The main table building function is gt_acs(), which is a wrapper for gt::gt(), gt::cols_label(), gt::cols_merge_uncert(), and other {gt} functions. Based on the predictable structure of ACS data, this function can merge estimate and margin of error columns, format estimate and percent estimate columns, and set a source note with a survey and table attribution.
pop_tbl_data |>
  gt_acs(
    table = "B01003",
    value_label = "Population",
    name_label = "County",
    perc = FALSE
  ) |>
  fmt_acs_county(
    state = "Maryland",
    pattern = "(County|), {state}"
  )
| County | Population |
|---|---|
| Montgomery | 1,056,910 |
| Prince George's | 957,189 |
| Baltimore | 850,737 |
| Anne Arundel | 588,109 |
| Baltimore city | 584,548 |

Source: 2018-2022 ACS 5-year Estimates, Table B01003.
Additional helpers support common formatting tasks when working with American Community Survey data. For example, fmt_acs_county() strips the state name and the comma before it from the name column of the ACS data frame. With the pattern used above, it also drops the word "County".
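Once {state} is filled in, the pattern used above reads as an ordinary regular expression. As a rough illustration only, here is the same pattern applied with base R's sub() and trimws(). This is an assumption about the general idea, not a description of how fmt_acs_county() is implemented, and the variable names are made up for the example:

```r
# Illustration only: base R regex, not the package's implementation.
county_names <- c(
  "Montgomery County, Maryland",
  "Baltimore city, Maryland"
)

# Fill the {state} placeholder by hand. The group matches either the
# word "County" or nothing, followed by ", Maryland".
pattern <- "(County|), Maryland"

# Remove the match, then trim the space left behind before "County".
short_names <- trimws(sub(pattern, "", county_names, perl = TRUE))

print(short_names)
```

The empty alternative in the group is what lets "Baltimore city" keep its "city" suffix while counties lose "County".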
Many helper functions are built around tidyverse functions. Additional parameters passed to filter_acs() are passed on to dplyr::filter(), so subsetting data by indent, line_number, or other attributes is straightforward:
edu_tbl_data <- acs_data |>
  filter_acs(
    table = "B15003",
    indent > 0,
    line_number > 16,
    NAME == "Baltimore city, Maryland"
  )
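To see what this pass-through amounts to, here is a sketch using dplyr::filter() directly on a small made-up data frame. It assumes the table argument corresponds to a table_id column. The toy_acs rows are illustrative placeholders, not real ACS values:

```r
library(dplyr)

# Made-up rows that only mimic the column names used above.
toy_acs <- tibble(
  table_id = c("B15003", "B15003", "B15003", "B25003"),
  line_number = c(1, 17, 22, 2),
  indent = c(0, 1, 1, 1),
  NAME = "Baltimore city, Maryland"
)

# Each condition is combined with AND, as in any dplyr::filter() call.
toy_subset <- toy_acs |>
  filter(
    table_id == "B15003",
    indent > 0,
    line_number > 16,
    NAME == "Baltimore city, Maryland"
  )

print(toy_subset)
```

Only the two B15003 rows with a line number above 16 remain.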
Similarly, gt_acs() returns a gt_tbl object, so it can be combined with other {gt} functions to add headers or customize tables in other ways:
edu_tbl_data |>
  select_acs(.name_col = NULL) |>
  gt_acs(
    table = edu_tbl_data$table_id,
    column_title_label = "Education",
    value_label = "Estimate",
    perc_value_label = "% of total"
  ) |>
  tab_header(
    edu_tbl_data$table_title[[1]],
    edu_tbl_data$NAME[[1]]
  )
Educational Attainment for the Population 25 Years and Over

Baltimore city, Maryland

| Education | Estimate | % of total |
|---|---|---|
| Regular high school diploma | 95,744 ± 2,517 | 23% ± 1% |
| GED or alternative credential | 19,797 ± 1,141 | 5% ± 0% |
| Some college, less than 1 year | 24,907 ± 1,545 | 6% ± 0% |
| Some college, 1 or more years, no degree | 52,364 ± 2,404 | 13% ± 1% |
| Associate's degree | 21,160 ± 1,233 | 5% ± 0% |
| Bachelor's degree | 72,434 ± 2,280 | 18% ± 1% |
| Master's degree | 47,683 ± 1,882 | 12% ± 0% |
| Professional school degree | 13,069 ± 764 | 3% ± 0% |
| Doctorate degree | 9,988 ± 829 | 2% ± 0% |

Source: 2018-2022 ACS 5-year Estimates, Table B15003.
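For comparison, the merged "estimate ± margin of error" cells can be approximated with {gt} functions alone. The sketch below reuses two rows from the table above and assumes a data frame with column_title, estimate, and moe columns (names chosen for illustration). It shows the kind of calls a wrapper like gt_acs() saves you from writing, not its actual internals:

```r
library(gt)

# Two rows copied from the table above; column names are assumptions.
edu_rows <- data.frame(
  column_title = c("Bachelor's degree", "Master's degree"),
  estimate = c(72434, 47683),
  moe = c(2280, 1882)
)

edu_gt <- edu_rows |>
  gt() |>
  # Format both numeric columns with thousands separators
  fmt_number(columns = c(estimate, moe), decimals = 0) |>
  # Merge the estimate and its margin of error into one column
  cols_merge_uncert(col_val = estimate, col_uncert = moe) |>
  cols_label(column_title = "Education", estimate = "Estimate") |>
  tab_source_note("Source: 2018-2022 ACS 5-year Estimates, Table B15003.")

edu_gt
```

Because the result is an ordinary gt_tbl, further {gt} calls such as tab_header() can be chained onto it in the same way as onto the output of gt_acs().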
This flexibility makes it easy to quickly produce useful tables:
tenure_tbl_data <- acs_data |>
  filter_acs(
    table = "B25003",
    NAME == "Baltimore city, Maryland"
  )

tenure_tbl_data |>
  select_acs(.name_col = NULL) |>
  gt_acs(
    rowname_col = "column_title",
    value_label = "Units",
    # Only unique table values are used to append to the source note
    table = tenure_tbl_data$table_id
  ) |>
  tab_header(
    title = unique(tenure_tbl_data$table_title),
    subtitle = "Baltimore City, Maryland"
  )
Tenure

Baltimore City, Maryland

| | Units | % share |
|---|---|---|
| Total | 247,232 ± 1,271 | 100% ± 1% |
| Owner occupied | 118,072 ± 1,945 | 48% ± 1% |
| Renter occupied | 129,160 ± 2,300 | 52% ± 1% |

Source: 2018-2022 ACS 5-year Estimates, Table B25003.
Helpers such as pivot_acs_wider() are also useful on their own, without the gt_acs_compare() variant:
geo_comparison_data <- get_acs_geographies(
  geography = c(
    "county",
    "metropolitan statistical area/micropolitan statistical area",
    "state"
  ),
  county = "Baltimore city",
  state = "MD",
  msa = "Baltimore-Columbia-Towson, MD Metro Area",
  table = "B25105",
  quiet = TRUE
)

geo_comparison_data |>
  select_acs(
    .name_col = "NAME",
    .perc = FALSE
  ) |>
  pivot_acs_wider() |>
  gt_acs(
    value_label = NULL,
    column_title_label = "",
    currency_value = TRUE
  ) |>
  cols_label_with(
    fn = function(x) {
      stringr::str_remove(x, "^estimate_")
    }
  )
| Baltimore city, Maryland | Baltimore-Columbia-Towson, MD Metro Area | Maryland |
|---|---|---|
| $1,275 ± $13 | $1,611 ± $8 | $1,708 ± $6 |

Source: 2018-2022 ACS 5-year Estimates.
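The "estimate_" prefix removed by cols_label_with() is what a wide pivot typically produces when more than one value column is spread. A minimal sketch with tidyr::pivot_wider() shows where such names come from. It assumes pivot_acs_wider() behaves similarly, which is not confirmed here; the values are copied from the table above and the column names are illustrative:

```r
library(tidyr)

# Long data: one row per geography (values from the table above).
housing_long <- data.frame(
  column_title = "Median monthly housing costs",
  NAME = c(
    "Baltimore city, Maryland",
    "Baltimore-Columbia-Towson, MD Metro Area",
    "Maryland"
  ),
  estimate = c(1275, 1611, 1708),
  moe = c(13, 8, 6)
)

# Spreading two value columns yields names such as
# "estimate_Maryland" and "moe_Maryland".
housing_wide <- pivot_wider(
  housing_long,
  names_from = NAME,
  values_from = c(estimate, moe)
)

# Stripping the prefix leaves the geography name as the column label.
estimate_cols <- grep("^estimate_", names(housing_wide), value = TRUE)
labels <- sub("^estimate_", "", estimate_cols)

print(labels)
```

The recovered labels match the geography names shown as column headers in the table above.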
The gt_acs_compare() function calls pivot_acs_wider() and handles renaming the column titles for a similar effect:
geo_comparison_data |>
  gt_acs_compare(
    id_cols = "column_title",
    column_title_label = "",
    currency_value = TRUE
  )
| | Baltimore city, Maryland | Baltimore-Columbia-Towson, MD Metro Area | Maryland |
|---|---|---|---|
| | Est. | Est. | Est. |
| Median monthly housing costs | $1,275 ± $13 | $1,611 ± $8 | $1,708 ± $6 |

Source: 2018-2022 ACS 5-year Estimates.
Note that if you encounter warning or error messages, you may need to set a value for id_cols to resolve the issue.