If my_address_file_geocoded.csv is a file in the current working directory with columns named lat, lon, start_date, and end_date, then the DeGAUSS command:

docker run --rm -v $PWD:/tmp my_address_file_geocoded.csv

will produce my_address_file_geocoded_st_census_tract_0.2.1.csv with added columns census_tract_vintage and census_tract_id.

Block group identifiers are defined as the concatenation of the state, county, tract, and block group FIPS identifiers (commonly called GISJOIN or GEOID in census data); a census tract identifier is the same concatenation without the trailing block group digit. All census tract identifiers are 11 digits and all census block group identifiers are 12 digits, with the exception of some 1970, 1980, and 1990 tracts that are 9 digits, resulting in 10-digit block group identifiers.
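The identifier layout described above can be sketched in Python. This is an illustration only, not part of the container; the function names are hypothetical, and the 4-digit tract code in the 9-digit form is an assumption inferred from the digit counts given here.

```python
def split_tract_id(tract_id: str) -> dict:
    """Split a census tract identifier into its FIPS components."""
    if not tract_id.isdigit():
        raise ValueError(f"identifier must be all digits: {tract_id!r}")
    if len(tract_id) not in (9, 11):
        raise ValueError(f"expected 9 or 11 digits, got {len(tract_id)}")
    return {
        "state": tract_id[:2],    # 2-digit state FIPS
        "county": tract_id[2:5],  # 3-digit county FIPS
        "tract": tract_id[5:],    # 6-digit tract code (assumed 4 digits in the 9-digit form)
    }


def block_group_to_tract(block_group_id: str) -> str:
    """Drop the trailing block group digit to recover the tract identifier."""
    if not block_group_id.isdigit() or len(block_group_id) not in (10, 12):
        raise ValueError(f"expected 10 or 12 digits: {block_group_id!r}")
    return block_group_id[:-1]
```

Identifiers are handled as strings rather than integers so that leading zeros in state and tract codes are preserved.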

Geomarker Methods

Input data must have columns called lat and lon containing the latitude and longitude, respectively, as well as start_date and end_date specifying a date range over which tract-level geomarkers will be assessed. The date range will be used to assign a census tract vintage, ranging from 1970 to 2020 by decade. If you do not have temporal data and wish to use the 2010 tract or block group boundaries, you can use the census_block_group DeGAUSS container.
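Before running the container, it can be useful to confirm that the required columns are present. The following is a minimal pre-flight sketch, not something the container provides; missing_columns and REQUIRED_COLUMNS are hypothetical names, and the check only inspects the header row.

```python
import csv
import io

# Column names required by the input description above
REQUIRED_COLUMNS = ("lat", "lon", "start_date", "end_date")


def missing_columns(csv_text: str) -> list:
    """Return the required column names absent from the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # empty input yields an empty header
    return [name for name in REQUIRED_COLUMNS if name not in header]


sample = (
    "id,lat,lon,start_date,end_date\n"
    "1234,39.15852,-84.41757,2019-12-27,2020-01-03\n"
)
print(missing_columns(sample))        # []
print(missing_columns("id,lat,lon\n"))  # ['start_date', 'end_date']
```

To check a file on disk, read its contents first, for example with open("my_address_file_geocoded.csv").read().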

After the vintage is assigned, the latitude and longitude will be overlaid on the tract boundaries of that vintage to assign a census tract identifier from the appropriate decade.
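Conceptually, the overlay is a point-in-polygon lookup. The sketch below illustrates the idea with a plain ray-casting test. It is not the container's implementation: the rectangular boundaries and the tract_west and tract_east identifiers are hypothetical, coordinates are treated as planar, and polygons with holes or multiple parts are ignored.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside the ring of (lon, lat) vertices?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def assign_tract(lon, lat, tracts):
    """Return the id of the first tract whose boundary contains the point, else None."""
    for tract_id, ring in tracts.items():
        if point_in_polygon(lon, lat, ring):
            return tract_id
    return None


# Hypothetical rectangular "tracts"; real boundaries are irregular polygons.
example_tracts = {
    "tract_west": [(-84.5, 39.1), (-84.4, 39.1), (-84.4, 39.2), (-84.5, 39.2)],
    "tract_east": [(-84.4, 39.1), (-84.3, 39.1), (-84.3, 39.2), (-84.4, 39.2)],
}

print(assign_tract(-84.41757, 39.15852, example_tracts))  # tract_west
print(assign_tract(-85.0, 39.15852, example_tracts))      # None
```

A point that falls in no tract yields None here; how the container reports such points is not covered by this sketch.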

If the date range spans two census decades, the result will contain one row per decade. For example,

id lat lon start_date end_date
1234 39.15852 -84.41757 2019-12-27 2020-01-03

would become

id lat lon start_date end_date census_tract_vintage census_tract_id
1234 39.15852 -84.41757 2019-12-27 2019-12-31 2010 39061005400
1234 39.15852 -84.41757 2020-01-01 2020-01-03 2020 39061027600

where a 2010 tract identifier is assigned to the first row, and a 2020 tract identifier is assigned to the second row.
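The row splitting shown above can be sketched as follows. This assumes the vintage is the first year of the decade containing each date, which is inferred from the example rather than taken from the container's source; split_by_vintage is a hypothetical name, and the sketch does not enforce the 1970 to 2020 vintage range.

```python
from datetime import date


def split_by_vintage(start: date, end: date):
    """Split [start, end] at decade boundaries; return (start, end, vintage) tuples."""
    if end < start:
        raise ValueError("end date precedes start date")
    pieces = []
    current = start
    while current <= end:
        vintage = current.year // 10 * 10       # e.g. 2019 -> 2010
        decade_end = date(vintage + 9, 12, 31)  # last day covered by this vintage
        pieces.append((current, min(end, decade_end), vintage))
        current = date(vintage + 10, 1, 1)      # first day of the next decade
    return pieces


for piece in split_by_vintage(date(2019, 12, 27), date(2020, 1, 3)):
    print(piece)
```

For the example row, this yields one piece ending on 2019-12-31 with vintage 2010 and one starting on 2020-01-01 with vintage 2020, matching the two output rows.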

Geomarker Data

DeGAUSS Details

For detailed documentation on DeGAUSS, including general usage and installation, please see the DeGAUSS homepage.