A Docker container for geocoding addresses and assigning their census tract and deprivation index
cchmc_batch_geocoder is now out of date. Please consider using our updated geocoder instead. 🛑
This DeGAUSS container condenses the sequence of (1) geocoding street addresses with a custom geocoder based on 2015 TIGER/Line address range files, (2) joining the geocodes to a 2010 census tract shapefile from NHGIS using the epsg:5072 projection, and (3) adding census tract level data from the community deprivation index into a single image.
To run, navigate to the directory containing a CSV file with a column called address and call:
docker run --rm=TRUE -v $PWD:/tmp degauss/cchmc_batch_geocoder my_address_file.csv
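Because the container expects a column called address, it can help to check the CSV header before starting a run. The following is a minimal sketch of such a check, not part of the container; the function name `has_address_column` is hypothetical.

```python
import csv
import io


def has_address_column(csv_file):
    """Return True if the first row of the CSV contains a column named 'address'."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # empty list if the file has no rows
    return "address" in header


# Usage with an in-memory file; a real file opened with open(..., newline="") works the same way.
sample = io.StringIO("id,address\n1,3333 Burnet Ave Cincinnati OH 45229\n")
print(has_address_column(sample))
```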
The container tries to simplify interpretation of the geocoding results with some new columns:
TRUE for Cincinnati foster & institutional addresses, “foreign”, “verify”, “unknown”, and missing addresses
TRUE if a Post Office (PO) box
TRUE if geocoding result had a precision method of “street” or “range” and a score of > 0.5
FALSE, this means that the address was geocoded but probably not well enough to accurately place it in a census tract. The lon columns and the corresponding census tract variables (like dep_index, etc…) for these are set to missing since we cannot accurately place them at a coordinate and in a census tract.
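The precision rule described above (a method of “street” or “range” and a score above 0.5) can be sketched as follows. This only illustrates the stated rule and is not the container's code: the input keys `precision` and `score` and both function names are assumptions, while `lon` and `dep_index` are taken from the text.

```python
# Fields set to missing for imprecise geocodes; names taken from the text, extend as needed.
FIELDS_TO_BLANK = ("lon", "dep_index")


def is_precise_geocode(precision, score):
    """Rule from the text: method is 'street' or 'range' and score is greater than 0.5."""
    return precision in ("street", "range") and score > 0.5


def blank_if_imprecise(row):
    """Return a copy of row with coordinate and tract fields set to None when imprecise."""
    out = dict(row)
    # 'precision' and 'score' are assumed key names for the geocoder's method and score.
    if not is_precise_geocode(row["precision"], row["score"]):
        for field in FIELDS_TO_BLANK:
            if field in out:
                out[field] = None
    return out
```

Note that a score of exactly 0.5 does not count as precise under this reading of the rule.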
The addresses that are not successfully geocoded are still in the output file, but are all moved to the top. This allows for quick examination of these addresses for errors. After edits are made, rerun the container. The successful geocodes are cached locally in a folder called geocoding_cache, so they are read from disk on later runs instead of being geocoded again. This makes the process of manually editing problematic addresses and rerunning the edited file through the container very quick.
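To illustrate why rerunning is quick, here is a minimal sketch of an on-disk cache keyed by the address string. It shows the general idea only; the container's actual cache format is not described here, and `cached_geocode`, `geocode_fn`, and the one-JSON-file-per-address layout are all assumptions.

```python
import hashlib
import json
import os
import tempfile


def cached_geocode(address, geocode_fn, cache_dir):
    """Return the cached result for address if present, else call geocode_fn and cache it."""
    os.makedirs(cache_dir, exist_ok=True)
    key = hashlib.sha256(address.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, key + ".json")
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    result = geocode_fn(address)
    # Only successful results (anything other than None) are written to disk.
    if result is not None:
        with open(path, "w", encoding="utf-8") as f:
            json.dump(result, f)
    return result


# Usage with a stand-in geocoder that records how often it is called.
calls = []

def fake_geocoder(address):
    calls.append(address)
    return {"ok": True}

with tempfile.TemporaryDirectory() as cache_dir:
    cached_geocode("3333 Burnet Ave Cincinnati OH 45229", fake_geocoder, cache_dir)
    cached_geocode("3333 Burnet Ave Cincinnati OH 45229", fake_geocoder, cache_dir)
    print(len(calls))  # the second call is served from disk
```

Because an edited address hashes to a different key, only the corrected rows would be geocoded again in this sketch.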
If your address components are in different columns, you will need to paste them together into a single string. Below are some tips that will help optimize geocoding accuracy and precision:
32709) and not “plus four” (i.e.
3333 Burnet Ave Cincinnati 45229 OH)
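If the address components are stored in separate columns, they can be pasted together before running the container. The sketch below is one way to do this in Python; the column names `street`, `city`, `state`, and `zip`, the order in which they are joined, and the function name `build_address` are assumptions for illustration. It keeps only the part of the ZIP code before any hyphen, in line with the tip about not using “plus four” ZIP codes.

```python
def build_address(row, columns=("street", "city", "state", "zip")):
    """Join the non-empty address components of row into one space-separated string."""
    parts = []
    for column in columns:
        value = (row.get(column) or "").strip()
        if column == "zip":
            # Drop a "plus four" suffix such as "-1234" if present.
            value = value.split("-")[0]
        if value:
            parts.append(value)
    return " ".join(parts)


# Usage: the plus-four digits here are made up for the example.
row = {"street": "3333 Burnet Ave", "city": "Cincinnati", "state": "OH", "zip": "45229-1234"}
print(build_address(row))  # 3333 Burnet Ave Cincinnati OH 45229
```

The resulting strings can then be written to a column called address in the CSV file passed to the container.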