If my_address_file_geocoded.csv is a file in the current working directory with coordinate columns named
lat and lon, then the DeGAUSS command:
docker run --rm -v $PWD:/tmp ghcr.io/degauss-org/census_block_group:0.6.0 my_address_file_geocoded.csv
will produce my_address_file_geocoded_census_block_group_0.6.0_2010.csv with added columns:
census_block_group_id_2010: identifier for 2010 block group
census_tract_id_2010: identifier for 2010 tract
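Before running the container, it can help to confirm that the input file actually has the expected coordinate columns. The sketch below is a minimal header check, assuming a plain comma-separated header row and coordinate columns named lat and lon; `has_column` and `check_coordinate_columns` are hypothetical helpers, not part of DeGAUSS.

```sh
# Hypothetical helper (not part of DeGAUSS): succeed only if the CSV header
# contains a column with exactly the given name.
has_column() {
  # $1 = CSV file, $2 = column name; only the header row is inspected
  head -n 1 "$1" | tr -d '\r"' | tr ',' '\n' | grep -qxF "$2"
}

# Hypothetical helper: report any missing coordinate column on stderr
# and return non-zero if at least one is missing.
check_coordinate_columns() {
  status=0
  for col in lat lon; do
    if ! has_column "$1" "$col"; then
      echo "missing column: $col" >&2
      status=1
    fi
  done
  return "$status"
}

# Example (commented out so the sketch has no side effects):
# check_coordinate_columns my_address_file_geocoded.csv && echo "ready for DeGAUSS"
```

The check splits the header on commas, so it would misread a column name that itself contains a comma.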
The default census year is 2010, but it can be changed by supplying an optional argument to the DeGAUSS command. For example,
docker run --rm -v $PWD:/tmp ghcr.io/degauss-org/census_block_group:0.6.0 my_address_file_geocoded.csv 1990
will produce my_address_file_geocoded_census_block_group_0.6.0_1990.csv, with columns called census_block_group_id_1990 and census_tract_id_1990.
Available years for census block group and census tract identifiers include 1990, 2000, 2010, and 2020. Additionally, tract identifiers are available for 1970 and 1980.
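To request several vintages in one pass, the year argument can be varied in a loop. The sketch below only prints each command and the output file name it should produce; the naming pattern is inferred from the example file names in this README, and `degauss_output_name` is a hypothetical helper, not part of DeGAUSS.

```sh
# Hypothetical helper: predict the output file name for an input file,
# container version, and census year (the year defaults to 2010).
# The pattern is inferred from the example file names in this README.
degauss_output_name() {
  input="$1"
  version="$2"
  year="${3:-2010}"
  printf '%s_census_block_group_%s_%s.csv\n' "${input%.csv}" "$version" "$year"
}

# Dry run: print one command per block group vintage, followed by the
# expected output file. Nothing is executed by this loop.
for year in 1990 2000 2010 2020; do
  echo "docker run --rm -v \$PWD:/tmp ghcr.io/degauss-org/census_block_group:0.6.0 my_address_file_geocoded.csv $year"
  degauss_output_name my_address_file_geocoded.csv 0.6.0 "$year"
done
```

Each vintage writes to a separate file, so the runs should not overwrite one another.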
For spatiotemporal data in which each location is associated with a specified date range, consider using the
st_census_tract container, which adds census tract identifiers for the appropriate vintage (1970-2020) based on
start_date and end_date for each input location.
Block group shape files were downloaded from nhgis.org and reprojected to EPSG 5072.
All shape files were made valid using sf::st_make_valid.
2020 block groups were not yet available via NHGIS, and were downloaded directly from the U.S. Census.
The first 11 characters of a census block group GEOID identify the state, county, and census tract that the block group lies within. US Census GEOIDs are constructed to reflect the geographic hierarchy of the designated area, so a prefix of the GEOID can be used to select data for any area type higher in the hierarchy.
| Area Type | GEOID | Number of Digits | Example Area | Example GEOID |
|---|---|---|---|---|
| County | State + County | 2+3=5 | Hamilton County | 39061 |
| Census Tract | State + County + Tract | 2+3+6=11 | Tract 32 in Hamilton County | 39061003200 |
| Block Group | State + County + Tract + Block Group | 2+3+6+1=12 | Block Group 1 in Tract 32 | 390610032001 |
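The prefix structure in the table can be illustrated with a short sketch that slices the example block group GEOID into its parent identifiers. `geoid_prefix` is a hypothetical helper introduced only for this illustration, and the digit counts assume the standard 12-digit layout.

```sh
# Hypothetical helper: print the first N characters of a GEOID.
geoid_prefix() {
  # $1 = GEOID (treated as a string), $2 = number of leading characters
  printf '%s\n' "$1" | cut -c "1-$2"
}

block_group=390610032001                   # Block Group 1 in Tract 32 (from the table)
state=$(geoid_prefix "$block_group" 2)     # 39
county=$(geoid_prefix "$block_group" 5)    # 39061
tract=$(geoid_prefix "$block_group" 11)    # 39061003200

echo "state=$state county=$county tract=$tract"
```

GEOIDs are best kept as strings rather than numbers, since some state FIPS codes begin with a zero that numeric parsing would drop.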
Block group identifiers are defined as the concatenation of the state, county, tract, and block group FIPS identifiers (commonly called GISJOIN or GEOID in census data). All census tract identifiers are 11 digits and all census block group identifiers are 12 digits, with the exception of some 1990, 1980, and 1970 tracts that are 9 digits, resulting in 10-digit block group identifiers.
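Because the block group digit is always the final character, dropping it recovers the tract identifier for both the 12-digit and the 10-digit forms. The sketch below assumes identifiers are handled as strings; `tract_from_block_group` is a hypothetical helper, and the 10-digit value in the comments is made up for illustration.

```sh
# Hypothetical helper: derive the tract identifier from a block group
# identifier by removing its final (block group) digit.
tract_from_block_group() {
  id="$1"
  case "${#id}" in
    12|10)
      # 12 digits: 11-digit tract; 10 digits: 9-digit tract
      printf '%s\n' "${id%?}"
      ;;
    *)
      echo "unexpected identifier length: ${#id}" >&2
      return 1
      ;;
  esac
}

tract_from_block_group 390610032001   # prints 39061003200
tract_from_block_group 3906100321     # made-up 10-digit example; prints 390610032
```

Rejecting other lengths makes truncated identifiers, such as those that lost a leading zero in a spreadsheet, fail loudly instead of producing a wrong tract.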
Block group shapefiles for 1990, 2000, and 2010, as well as tract shapefiles for 1970 and 1980, were obtained from NHGIS and transformed using the
00_make_block_group_shp.R file in this repository.
Block group shapefiles for 2020 were obtained directly from the U.S. Census Bureau via
To avoid geometry evaluation errors (e.g., self-intersecting rings), we used
sf::st_make_valid on all tract and block group polygons.
The transformed block group shapefiles are stored at
For detailed documentation on DeGAUSS, including general usage and installation, please see the DeGAUSS homepage.
Steven Manson, Jonathan Schroeder, David Van Riper, Tracy Kugler, and Steven Ruggles. IPUMS National Historical Geographic Information System: Version 15.0 [dataset]. Minneapolis, MN: IPUMS. 2020. http://doi.org/10.18128/D050.V15.0