Title: | Easy Access to IPUMS Data |
---|---|
Description: | A package with helper functions extending the ipumsr package for accessing NHGIS and other IPUMS data sources. |
Authors: | Eli Pousson [aut, cre, cph] |
Maintainer: | Eli Pousson <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.0 |
Built: | 2024-11-21 16:27:25 UTC |
Source: | https://github.com/elipousson/ipumseasyr |
ipumsr::define_extract_nhgis
define_nhgis_ts_extract() is a wrapper for ipumsr::define_extract_nhgis() with defaults that support the creation of tidy data using read_nhgis_data() or pivot_nhgis_data().
define_nhgis_ts_extract( year = NULL, tables = NULL, geography = c("county", "state"), extent = "us", output = c("tidy", "wide", "file"), shape_year = NULL, basis = 2008, geometry = FALSE, ..., time_series_tables = NULL, description = NULL, shapefiles = NULL, data_format = "csv_no_header", validate = TRUE, api_key = Sys.getenv("IPUMS_API_KEY") )
output |
Used to set |
geometry |
If |
... |
Arguments passed on to
|
time_series_tables |
List of time series table specifications for any time series tables to include in the extract request. Use |
description |
Description of the extract. |
shapefiles |
Names of any shapefiles to include in the extract request. |
data_format |
The desired format of the extract data file.
Note that by default, Required when an extract definition includes any |
api_key |
API key associated with your user account. Defaults to the
value of the |
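A minimal sketch of defining an extract with the signature shown above. The table name "A00" and the description text are illustrative assumptions, not values taken from this documentation. The call is guarded so the block does nothing when ipumseasyr is not installed or no API key is configured.

```r
# Only attempt the call when the package and an API key are available.
can_run <- requireNamespace("ipumseasyr", quietly = TRUE) &&
  nzchar(Sys.getenv("IPUMS_API_KEY"))

extract <- NULL

if (can_run) {
  # "A00" is an assumed table name; confirm it against the output of
  # list_nhgis_ts_tables() before relying on it.
  extract <- ipumseasyr::define_nhgis_ts_extract(
    tables = "A00",
    geography = "county",
    description = "Example county time series extract"
  )
  print(extract)
}
```

Defining an extract does not submit it; the definition still has to be passed to ipumsr::submit_extract().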
ipumsr::wait_for_extract and ipumsr::download_extract
download_ipumsr_extract() is a wrapper for ipumsr::wait_for_extract() and ipumsr::download_extract() to wait until an extract is ready for download before attempting to download it.
download_ipumsr_extract( extract = NULL, download_dir = getwd(), overwrite = FALSE, progress = TRUE, ..., api_key = Sys.getenv("IPUMS_API_KEY") )
extract |
One of:
For a list of codes used to refer to each collection, see
|
download_dir |
Path to the directory where the files should be written. Defaults to current working directory. |
overwrite |
If |
progress |
If |
... |
Arguments passed on to
|
api_key |
API key associated with your user account. Defaults to the
value of the |
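The define, submit, and download steps might be chained as in the sketch below. It assumes the IPUMS_API_KEY environment variable is set and that "A00" is a valid time series table name. Nothing runs unless the key and both packages are present.

```r
# Check the prerequisites before contacting the IPUMS API.
has_key <- nzchar(Sys.getenv("IPUMS_API_KEY"))
has_pkgs <- requireNamespace("ipumseasyr", quietly = TRUE) &&
  requireNamespace("ipumsr", quietly = TRUE)

if (has_key && has_pkgs) {
  # "A00" is an assumed table name used only for illustration.
  extract <- ipumseasyr::define_nhgis_ts_extract(
    tables = "A00",
    geography = "state"
  )

  # Submit the definition, then wait for and download the finished extract.
  submitted <- ipumsr::submit_extract(extract)
  ipumseasyr::download_ipumsr_extract(
    extract = submitted,
    download_dir = tempdir(),
    overwrite = TRUE
  )
}
```

Writing to tempdir() keeps the example from cluttering the working directory; a project would more likely pass a persistent download_dir.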
Download extract with download_ipumsr_extract() and return a list of file paths for the data and shape files.
get_ipumsr_extract_paths( extract = NULL, data_file = NULL, shape_file = NULL, submit_extract = TRUE, download_extract = TRUE, download_dir = getwd(), overwrite = FALSE, progress = TRUE, refresh = FALSE, api_key = Sys.getenv("IPUMS_API_KEY") )
extract |
An |
submit_extract |
If |
download_dir |
Path to the directory where the files should be written. Defaults to current working directory. |
overwrite |
If |
progress |
If |
api_key |
API key associated with your user account. Defaults to the
value of the |
A named list with "data" and "shape" elements containing extract file paths.
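The returned list can be handed on to read_nhgis_files(). The sketch below uses a stand-in list with hypothetical file names to show the documented "data" and "shape" elements. It reads the files only if they actually exist.

```r
# Stand-in for the documented return value: a named list with "data" and
# "shape" elements. The file names are hypothetical.
paths <- list(
  data = file.path(tempdir(), "nhgis0001_csv.zip"),
  shape = file.path(tempdir(), "nhgis0001_shape.zip")
)

# Check which of the expected files are present before reading them.
found <- vapply(paths, file.exists, logical(1))
print(found)

if (all(found) && requireNamespace("ipumseasyr", quietly = TRUE)) {
  nhgis <- ipumseasyr::read_nhgis_files(
    data_file = paths$data,
    shape_file = paths$shape
  )
}
```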
Use define_nhgis_ts_extract(), ipumsr::submit_extract(), ipumsr::download_extract(), and read_nhgis_files() to define, submit, download, and read an NHGIS time series extract. This function is recommended only for interactive use, and not for requests covering a large number of tables or geographies.
get_nhgis_ts_data( year = NULL, tables = NULL, geography = c("county", "state"), extent = "us", output = c("tidy", "wide", "file"), basis = 2008, shape_year = NULL, geometry = FALSE, extract = NULL, data_file = NULL, shape_file = NULL, state = NULL, ..., time_series_tables = NULL, description = NULL, shapefiles = NULL, data_format = "csv_no_header", validate = TRUE, submit_extract = TRUE, download_extract = TRUE, read_files = TRUE, download_dir = getwd(), overwrite = FALSE, progress = TRUE, verbose = progress, api_key = Sys.getenv("IPUMS_API_KEY") )
output |
Used to set |
geometry |
If |
extract |
An |
data_file |
Path to a .zip archive containing an NHGIS extract or a single file from an NHGIS extract. |
shape_file |
Path to a single .shp file or a .zip archive containing at least one .shp file. See Details section. |
time_series_tables |
List of time series table specifications for any time series tables to include in the extract request. Use |
description |
Description of the extract. |
shapefiles |
Names of any shapefiles to include in the extract request. |
data_format |
The desired format of the extract data file.
Note that by default, Required when an extract definition includes any |
download_dir |
Path to the directory where the files should be written. Defaults to current working directory. |
overwrite |
If |
progress |
If |
verbose |
Logical controlling whether to display output when loading
data. If Will be overridden by |
api_key |
API key associated with your user account. Defaults to the
value of the |
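For interactive work, the whole pipeline collapses into one call. The sketch below is a hedged example: the table name "A00" is an assumption, and the request is skipped unless an API key and the package are available.

```r
# Arguments for a small, single-table request; "A00" is an assumed name.
ts_args <- list(
  tables = "A00",
  geography = "state",
  output = "tidy",
  download_dir = tempdir()
)

can_run <- nzchar(Sys.getenv("IPUMS_API_KEY")) &&
  requireNamespace("ipumseasyr", quietly = TRUE)

if (can_run) {
  # Defines, submits, downloads, and reads the extract in one step.
  state_data <- do.call(ipumseasyr::get_nhgis_ts_data, ts_args)
  str(state_data)
}
```

Keeping the arguments in a list makes it easy to reuse the same request with define_nhgis_ts_extract() when a scripted, non-interactive workflow is preferable.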
join_nhgis_percent_change() joins a percent change column relative to a reference year. Optionally join a rank from the reference year using dplyr::ntile().
join_nhgis_percent_change( data, reference_year = NULL, value_col = "value", reference_prefix = "reference_", variable_col = "variable", year_col = "YEAR", rank_col = "rank", rank = NULL, rank_n = NULL, rank_by = NULL, ..., perc_prefix = "perc_change_", digits = 2 )
reference_year |
Reference year to use when calculating a percent change column. |
rank, rank_n |
Passed to |
rank_by |
Used as |
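To illustrate the idea of a percent change relative to a reference year, here is a base R sketch on toy data. It is not the package's implementation: the GISJOIN codes and variable name are made up, the output column names are assumed to combine the default prefixes with the value column name, and scaling the result by 100 is also an assumption.

```r
# Toy long-format data using the default column names from the signature.
data <- data.frame(
  GISJOIN = rep(c("G010", "G020"), each = 2),
  YEAR = rep(c(2000, 2010), times = 2),
  variable = "A00AA",
  value = c(100, 150, 200, 180)
)

reference_year <- 2000

# Pull the reference-year values and rename the value column.
reference <- data[data$YEAR == reference_year, c("GISJOIN", "variable", "value")]
names(reference)[names(reference) == "value"] <- "reference_value"

# Join the reference values back on and compute the rounded percent change.
joined <- merge(data, reference, by = c("GISJOIN", "variable"))
joined$perc_change_value <- round(
  (joined$value - joined$reference_value) / joined$reference_value * 100,
  digits = 2
)
joined <- joined[order(joined$GISJOIN, joined$YEAR), ]
print(joined)
```

Rows for the reference year itself get a change of 0, which makes them easy to filter out before plotting.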
labs_nhgis()
adds a standard credit caption for NHGIS data to make consistent attribution easier.
labs_nhgis( ..., caption = NULL, credit = "IPUMS NHGIS, University of Minnesota, www.nhgis.org.", prefix = "Source: ", collapse = " ", width = 80 )
... |
Arguments passed on to
|
credit |
Credit line for IPUMS. |
collapse |
String to collapse caption and credit. Defaults to |
width |
Maximum width of caption line passed to |
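The way the caption, prefix, and credit defaults could combine is sketched below in base R. This is an illustration under assumptions rather than the package's code: the caption text is invented, and strwrap() stands in for whichever wrapping function the width argument is passed to.

```r
caption <- "Total population by county, 2000 to 2010."
credit <- "IPUMS NHGIS, University of Minnesota, www.nhgis.org."
prefix <- "Source: "

# Join the caption and the prefixed credit line with the collapse string.
full_caption <- paste(c(caption, paste0(prefix, credit)), collapse = " ")

# Wrap to the maximum line width; strwrap() is a stand-in for the
# package's own wrapping step.
wrapped <- paste(strwrap(full_caption, width = 80), collapse = "\n")
cat(wrapped, "\n")
```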
ipumsr::get_metadata_nhgis
Use ipumsr::get_metadata_nhgis() with type = "time_series_tables" to return a data frame of time series tables. Optionally filter by geographical integration type "nominal" or "standardized" ("2010" or "standardized to 2010" also work).
list_nhgis_ts_tables( ..., cache = TRUE, cache_file = "nhgis_time_series_tables.rds", refresh = FALSE, integration = NULL )
... |
Additional parameters passed to |
refresh |
If |
integration |
Optional filter for geographical integration. |
A vector of NHGIS time series table names named with table descriptions.
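The cache, cache_file, and refresh arguments suggest a read-through cache. The sketch below shows that general pattern in base R. The helper read_cached(), the stand-in fetch function, its placeholder table rows, and the cache location under tempdir() are all hypothetical; the package may store and refresh its cache differently.

```r
# Read-through cache: reuse the cached file unless a refresh is requested.
read_cached <- function(fetch, cache_file, refresh = FALSE) {
  if (file.exists(cache_file) && !refresh) {
    return(readRDS(cache_file))
  }
  result <- fetch()
  saveRDS(result, cache_file)
  result
}

# Hypothetical stand-in for the metadata request; the rows are placeholders.
fetch_tables <- function() {
  data.frame(
    name = c("T01", "T02"),
    description = c("Placeholder table one", "Placeholder table two")
  )
}

cache_file <- file.path(tempdir(), "nhgis_time_series_tables.rds")

# First call fetches and writes the cache; second call reads it back
# without invoking the fetch function at all.
tables <- read_cached(fetch_tables, cache_file, refresh = TRUE)
cached <- read_cached(function() stop("fetch should not run"), cache_file)
```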
nhgis_ts_tables
A character vector with 389 time series table names.
ipumsr::read_ipums_sf
Read IPUMS geometry using ipumsr::read_ipums_sf().
read_ipums_geometry( shape_file = NULL, path = NULL, file_select = NULL, vars = "GISJOIN", encoding = NULL, bind_multiple = TRUE, add_layer_var = NULL, verbose = FALSE )
shape_file |
Path to a single .shp file or a .zip archive containing at least one .shp file. See Details section. |
file_select |
If |
vars |
Names of variables to include in the output. Accepts a
character vector of names or a tidyselect selection.
If |
encoding |
Encoding to use when reading the shape file. If |
bind_multiple |
If |
add_layer_var |
If The column name will always be prefixed with |
verbose |
If |
Read NHGIS data and geometry to return a named list or a combined sf object.
read_nhgis_files( path = NULL, data_file = NULL, data_file_select = NULL, shape_file = NULL, shape_file_select = NULL, verbose = FALSE, geometry = FALSE, ... )
path |
Optional if |
data_file |
Path to a .zip archive containing an NHGIS extract or a single file from an NHGIS extract. |
data_file_select, shape_file_select |
Passed to |
shape_file |
Path to a single .shp file or a .zip archive containing at least one .shp file. See Details section. |
verbose |
Logical controlling whether to display output when loading
data. If Will be overridden by |
A named list with "data" and "shape" elements or a combined sf data frame.
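Conceptually, the combined result pairs each boundary with its tabular values. The sketch below uses plain data frames rather than sf objects to show such a join. It assumes GISJOIN is the shared key (the default vars value for read_ipums_geometry()), and the codes and values are made up.

```r
# Toy stand-ins for the "data" and "shape" elements (no real geometry).
nhgis_data <- data.frame(
  GISJOIN = c("G010", "G020"),
  value = c(150, 180)
)
nhgis_shape <- data.frame(
  GISJOIN = c("G010", "G020", "G030"),
  layer = "county"
)

# Keep every boundary, attaching data where a matching GISJOIN exists.
combined <- merge(nhgis_shape, nhgis_data, by = "GISJOIN", all.x = TRUE)
print(combined)
```

Boundaries without matching data keep an NA value, which makes gaps in coverage visible rather than silently dropping them.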
Reference data with U.S. state names, USPS abbreviations, Census divisions, and Census regions. Includes the 50 U.S. states and the District of Columbia.
usa_states
A data frame with 51 rows and 4 variables:
STATE | State name |
STUSPS | State USPS abbreviation |
division | U.S. Census Division name |
region | U.S. Census Region name |