---
title: "Introduction to getACS"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to getACS}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  message = FALSE,
  comment = "#>"
)
```

```{r setup}
library(getACS)
library(dplyr)
library(ggplot2)
theme_set(theme_minimal())
```

This package builds on the `{tidycensus}` package to make it easier to create reproducible reports by allowing users to:

- Download multiple tables with `get_acs_tables()`
- Download multiple geographies and tables with `get_acs_geographies()`
- Download multiple years and tables with `get_acs_ts()`

All of these functions use `label_acs_metadata()` to join the variables to detailed pre-computed [table and column metadata available on GitHub](https://github.com/censusreporter/census-table-metadata) from the open-source [Census Reporter project](https://censusreporter.org/). This metadata includes a parent column ID that supports converting estimate and margin of error values into a percent share of the corresponding total. The function also uses the `race_iteration` reference to add the name of the racial or ethnic group as a column where appropriate.
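
The role of the parent column ID can be illustrated with a small sketch. It uses a toy data frame rather than a real API call, and the column names `column_id`, `parent_column_id`, `moe`, and `perc_moe` are assumptions made for illustration; the package's own implementation may differ.

```r
library(dplyr)

# Toy data standing in for a labelled ACS table (values are made up).
# Column names here are illustrative assumptions.
acs_toy <- data.frame(
  column_id = c("B25003001", "B25003002", "B25003003"),
  parent_column_id = c(NA, "B25003001", "B25003001"),
  column_title = c("Total", "Owner occupied", "Renter occupied"),
  estimate = c(1000, 600, 400),
  moe = c(50, 40, 30)
)

# Build a lookup of each potential parent's estimate and margin of error
parents <- acs_toy |>
  select(
    parent_column_id = column_id,
    parent_estimate = estimate,
    parent_moe = moe
  )

# Join each row to its parent and derive the share of the parent total
acs_share <- acs_toy |>
  left_join(parents, by = "parent_column_id") |>
  mutate(
    perc_estimate = estimate / parent_estimate,
    # Census Bureau approximation for the margin of error of a proportion;
    # only valid when the value under the square root is non-negative
    perc_moe = sqrt(moe^2 - perc_estimate^2 * parent_moe^2) / parent_estimate
  )

acs_share
```

Rows without a parent (the table total here) get `NA` for the share, which is why the examples in this vignette filter out totals before plotting.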

## Using `get_acs_tables()`

Get the race iteration tables for a tenure status table using a county-level geography:

```{r}
tenure_tables <- acs_table_race_iteration("B25003")[c(2, 3, 10)]

tenure_data <- get_acs_tables(
  geography = "county",
  state = "MD",
  county = "Baltimore city",
  table = tenure_tables
)
```

Filter out the totals and create a simple bar chart to compare the data across tables:

```{r}
tenure_data |>
  filter_acs(indent > 0) |>
  select_acs(race_iteration_group) |>
  ggplot(
    aes(
      x = column_title,
      y = perc_estimate,
      fill = race_iteration_group
    )
  ) +
  geom_col(alpha = 0.8, position = "dodge") +
  scale_y_acs_percent("% of households") +
  scale_fill_viridis_d("Race/ethnic group") +
  labs_acs_survey(table = tenure_tables, x = "Tenure")
```

## Using `get_acs_geographies()`

Get both county- and state-level geographies for a population table:

```{r}
multigeo_acs_data <- get_acs_geographies(
  geography = c("county", "state"),
  state = "MD",
  county = "Baltimore city",
  table = "B01003",
  quiet = TRUE
)
```

Drop the column title (the table contains only a single variable) and use the `rowname_col` parameter to group by the name of the area. `fmt_acs_county()` is a helper function that strips the text ", Maryland" from the end of the name "Baltimore city, Maryland".

```{r}
multigeo_acs_data |>
  select_acs(.column_title_col = NULL) |>
  gt_acs(
    rowname_col = "NAME",
    value_label = "Population",
    table = "B01003"
  ) |>
  fmt_acs_county(state = "Maryland")
```

The `fmt_acs_county()` function works with both `gt_tbl` and data frame inputs (although support for data frames may be moved into a separate function in the future).
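
To make the effect of that formatting step concrete, here is a minimal base R sketch of stripping a trailing state suffix from a character vector of area names. `strip_state_suffix()` is a hypothetical helper written for this illustration; it is not part of getACS and does not claim to mirror how `fmt_acs_county()` is implemented.

```r
# Hypothetical helper (not a getACS function): remove a trailing
# ", <state>" suffix from area names and leave other names untouched.
strip_state_suffix <- function(x, state) {
  suffix <- paste0(", ", state)
  has_suffix <- endsWith(x, suffix)
  x[has_suffix] <- substr(
    x[has_suffix],
    1,
    nchar(x[has_suffix]) - nchar(suffix)
  )
  x
}

area_names <- c("Baltimore city, Maryland", "Maryland")
stripped <- strip_state_suffix(area_names, state = "Maryland")
stripped
```

Matching the literal suffix with `endsWith()` rather than a regular expression avoids having to escape special characters in state names, and the state-level row ("Maryland") passes through unchanged.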

`gt_acs_compare()` transforms the input data frame and places columns for different geographies or named areas side by side:

```{r}
multigeo_acs_data |>
  select_acs() |>
  fmt_acs_county(state = "Maryland") |>
  gt_acs_compare(
    value_label = "",
    column_title_label = "",
    table = "B01003"
  )
```

## Using `get_acs_ts()`

`get_acs_ts()` relies on a helper, `acs_survey_ts()`, that identifies the non-overlapping, comparable sample years for a specific ACS sample:

```{r}
years <- acs_survey_ts("acs5", 2021)

years
```

`get_acs_ts()` calls `acs_survey_ts()` internally to return data for multiple years, warning you if a variable is unavailable for a specific year or geography:

```{r}
acs_ts_data <- get_acs_ts(
  geography = "county",
  state = "MD",
  survey = "acs5",
  year = 2022,
  table = "B01003",
  quiet = TRUE
)

glimpse(acs_ts_data)
```

Helper functions for `{ggplot2}` include `scale_x_acs_ts()` to set appropriate breaks:

```{r}
acs_ts_data |>
  filter_acs(
    GEOID %in% c("24510", "24005", "24003", "24027", "24025", "24035")
  ) |>
  fmt_acs_county(state = "Maryland") |>
  ggplot(aes(x = year, y = estimate, color = NAME)) +
  geom_point(size = 2) +
  geom_line(linewidth = 1) +
  scale_y_acs_estimate("Population", limits = c(0, 1000000)) +
  scale_x_acs_ts(survey = "acs5", year = 2022) +
  scale_color_brewer("County", type = "qual", palette = 3) +
  theme(legend.position = "bottom")
```
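
As background on what "non-overlapping" means for the time series helpers above, the sketch below steps back from an end year in increments equal to the sample length, so that no two 5-year samples share a survey year. `non_overlapping_years()` is a hypothetical function written for this illustration, and the default `first_year = 2009` reflects the first 5-year ACS release (2005 to 2009); the actual logic in `acs_survey_ts()` may differ.

```r
# Hypothetical helper (not a getACS function): list end years of
# non-overlapping samples, from the earliest available to `year`.
non_overlapping_years <- function(year, sample_length = 5, first_year = 2009) {
  stopifnot(year >= first_year)
  # Step backwards by the sample length, then return in ascending order
  rev(seq(from = year, to = first_year, by = -sample_length))
}

ts_years <- non_overlapping_years(2021)
ts_years
```

For an end year of 2021 this yields 2011, 2016, and 2021: the 2007 to 2011, 2012 to 2016, and 2017 to 2021 samples cover distinct survey years and can be compared directly.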