Introduction

This is an example of the workflow a PAC3 study site might use to add geomarkers to their data with DeGAUSS.


In steps 2 through 6:

Step 0: Install Docker

See the Installing Docker webpage.

Note about Docker Settings:
After installing Docker, but before running any containers, go to Docker Settings > Advanced and set memory to more than 4000 MB (or 4 GiB).

If you are using a Windows computer, also set CPUs to 1.

Click Apply and wait for Docker to restart.

Step 1: Preparing Your Input File

The input file must be a CSV file with one column called address containing all address components. Other columns may be present and will be returned in the output file, but should be kept to a minimum to reduce file size.

An example input CSV file (called my_address_file.csv) might look like:

id           address
13100070229  1922 CATALINA AV CINCINNATI, OH 45237
54000600136  5358 LILIBET CT DELHI TOWNSHIP, OH 45238
11200020024  630 GREENWOOD AV CINCINNATI, OH 45229

Refer to the DeGAUSS geocoding webpage for more information about the input file and address string formatting.
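If your site stores address components in separate columns, they need to be combined into the single address column first. The following is a minimal Python sketch of one way to do that, not part of DeGAUSS itself. The component column names (street, city, state, zip) and the function names are hypothetical; adjust them to match your own export.

```python
import csv

# Hypothetical names of the separate address columns in a site's raw export.
# These are assumptions for illustration; change them to match your file.
ADDRESS_PARTS = ["street", "city", "state", "zip"]


def combine_address(row, parts=ADDRESS_PARTS):
    """Join the non-empty address components of one row with single spaces."""
    values = ((row.get(part) or "").strip() for part in parts)
    return " ".join(value for value in values if value)


def write_degauss_input(in_path, out_path, id_column="id"):
    """Write a CSV holding only the id column and a combined 'address' column."""
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=[id_column, "address"])
        writer.writeheader()
        for row in reader:
            writer.writerow(
                {id_column: row[id_column], "address": combine_address(row)}
            )
```

Calling write_degauss_input("raw_export.csv", "my_address_file.csv") would then produce a two-column file like the example above (raw_export.csv is a placeholder name). Keeping only the id and address columns follows the advice to keep extra columns to a minimum.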

Step 2: Navigating the Shell

Open a shell (e.g., Terminal on macOS or CMD on Windows). We will use this shell for the rest of the steps in this example.

Navigate to the directory where the CSV file to be geocoded is located. See here for help navigating a filesystem using the command line.

For those unfamiliar with the command line, a simple approach is to save the file to be geocoded to the Desktop, then navigate to your Desktop folder with the command cd Desktop.

Step 3: Geocoding

After navigating to your working directory, use the ghcr.io/degauss-org/geocoder container to geocode your addresses.

macOS example call:

docker run --rm -v "$PWD":/tmp ghcr.io/degauss-org/geocoder:3.0.2 my_address_file.csv

Windows (CMD) example call:

docker run --rm -v "%cd%":/tmp ghcr.io/degauss-org/geocoder:3.0.2 my_address_file.csv

Replace my_address_file.csv with the name of the CSV file to be geocoded and run the call in the shell.


Note for Windows Users:
In this and all following docker calls in this example, replace "$PWD" with "%cd%". Refer to the DeGAUSS Troubleshooting page for more information.

See here for more information on the anatomy of a degauss command.

The output file is written to the same directory and, in our example, will be called my_address_file_geocoded_v3.0.2.csv.

Example output:

id address start_date end_date matched_street matched_zip matched_city matched_state lat lon score precision geocode_result
54000600136 5358 LILIBET CT DELHI TOWNSHIP OH 45238 2015-05-05 2015-05-06 Lilibet Ct 45238 Delhi Hills OH 39.11552 -84.61902 0.754 range geocoded
13100070229 1922 CATALINA AV CINCINNATI OH 45237 2010-06-07 2010-06-08 Catalina Ave 45237 Cincinnati OH 39.17112 -84.46176 0.922 range geocoded
11200020024 630 GREENWOOD AV CINCINNATI OH 45229 2019-07-08 2019-07-09 Greenwood Ave 45229 Cincinnati OH 39.15321 -84.49236 0.922 range geocoded

For more information on interpreting geocoder output, see here.
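Before moving on, it can help to check how many addresses were geocoded and at what precision. The sketch below is one possible way to tally this in Python. It relies only on the geocode_result and precision columns shown in the example output; the function names are hypothetical.

```python
import csv
from collections import Counter


def count_geocode_results(rows):
    """Count rows for each (geocode_result, precision) pair."""
    return Counter((row["geocode_result"], row["precision"]) for row in rows)


def summarize_geocoded_file(path):
    """Read a geocoded CSV and return the tally of result/precision pairs."""
    with open(path, newline="") as f:
        return count_geocode_results(csv.DictReader(f))


def print_summary(path):
    """Print one line per (geocode_result, precision) pair, most common first."""
    for (result, precision), n in summarize_geocoded_file(path).most_common():
        print(f"{result:<20} {precision:<15} {n}")
```

For the example in this step, print_summary("my_address_file_geocoded_v3.0.2.csv") would list each combination of geocode_result and precision with its row count, which makes ungeocoded or imprecise addresses easy to spot before running the later steps.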

Step 4: Deprivation Index

macOS example call:

docker run --rm -v "$PWD":/tmp ghcr.io/degauss-org/dep_index:0.1 my_address_file_geocoded_v3.0.2.csv

Windows (CMD) example call:

docker run --rm -v "%cd%":/tmp ghcr.io/degauss-org/dep_index:0.1 my_address_file_geocoded_v3.0.2.csv

Replace my_address_file_geocoded_v3.0.2.csv with the name of the geocoded CSV file created in Step 3 and run.

The output file is written to the same directory and, in our example, will be called my_address_file_geocoded_v3.0.2_dep_index_v0.1.csv.

Example output:

id address matched_street matched_zip matched_city matched_state lat lon score precision geocode_result fips_tract_id fraction_assisted_income fraction_high_school_edu median_income fraction_no_health_ins fraction_poverty fraction_vacant_housing dep_index
54000600136 5358 LILIBET CT DELHI TOWNSHIP OH 45238 Lilibet Ct 45238 Delhi Hills OH 39.11552 -84.61902 0.754 range geocoded 39061021303 0.0380034 0.9396114 83385 0.0236515 0.0250104 0.0128779 0.2087159
13100070229 1922 CATALINA AV CINCINNATI OH 45237 Catalina Ave 45237 Cincinnati OH 39.17112 -84.46176 0.922 range geocoded 39061006300 0.1149033 0.8787645 38395 0.0391429 0.1641705 0.1284085 0.3569748
11200020024 630 GREENWOOD AV CINCINNATI OH 45229 Greenwood Ave 45229 Cincinnati OH 39.15321 -84.49236 0.922 range geocoded 39061006800 0.3517316 0.8051400 19783 0.0579212 0.3901274 0.2309613 0.5527528

More information on the deprivation index

More information on the dep_index container

Step 5: Drive Time and Distance to Care Center

macOS example call:

docker run --rm -v "$PWD":/tmp ghcr.io/degauss-org/drivetime:1.3.0 my_address_file_geocoded_v3.0.2_dep_index_v0.1.csv cchmc

Windows (CMD) example call:

docker run --rm -v "%cd%":/tmp ghcr.io/degauss-org/drivetime:1.3.0 my_address_file_geocoded_v3.0.2_dep_index_v0.1.csv cchmc

Replace my_address_file_geocoded_v3.0.2_dep_index_v0.1.csv with the name of the CSV file created in Step 4, and replace cchmc with the abbreviation for your care center from this list:

center_name abbreviation
Children’s Hospital of Philadelphia chop
Riley Hospital for Children, Indiana University riley
Seattle Children’s Hospital seattle
Children’s Mercy Hospital mercy
Emory University emory
Johns Hopkins University jhu
Cleveland Clinic cc
Levine Children’s levine
St. Louis Children’s Hospital stl
Oregon Health and Science University ohsu
University of Michigan Health System umich
Children’s Hospital of Alabama al
Cincinnati Children’s Hospital Medical Center - Main Campus cchmc
Cincinnati Children’s Hospital Medical Center - Liberty Campus liberty
Nationwide Children’s Hospital nat
University of California, Los Angeles ucla
Boston Children’s Hospital bch
Medical College of Wisconsin mcw
St. Jude’s Children’s Hospital stj
Martha Eliot Health Center mehc
Northwestern / Ann & Lurie Children’s Northwestern nwu
Lurie Children’s Outpatient Center in Northbrook lcclp
Lurie Children’s Outpatient Center in Lincoln Park lcclp
Lurie Children’s Outpatient Center in Uptown lccu
Dr. Lio’s and Dr. Aggarwal’s clinics lac
Recruited from Eczema Expo 2018 expo
University of California San Francisco Benioff Children’s Hospital ucsf
Nicklaus Children’s Hospital nicklaus
Medical University of South Carolina Children’s Hospital musc
Children’s National Medical Center cnmc
Children’s Hospital of Pittsburgh of UPMC upmc
Methodist LeBonheur Children’s Hospital methodist
Texas Children’s Hospital texas
Arkansas Children’s Hospital arkansas
Primary Children’s Medical Center primary
Children’s Healthcare of Atlanta atlanta
Children’s Medical Center of Dallas dallas
Lucile Packard Children’s Hospital Stanford packard
Toronto Hospital for Sick Children toronto
Cook Children’s Medical Center cook
Children’s Hospital & Medical Center - Omaha omaha
Children’s Hospital Colorado colorado
Arnold Palmer Hospital for Children palmer
Children’s Hospital & Clinics of Minnesota minn
University of Virginia Hospital uva
Joe Dimaggio Children’s Hospital dimaggio
Cohen Children’s Medical Center of New York at Northwell Health cohen
Dell Children’s Medical Center of Central Texas dell
A.I. duPont Hospital for Children dupont
Rainbow Babies and Children’s Hospital rainbow
UNC Hospitals Children’s Specialty Clinic unc
Barbara Bush Children’s Hospital at Maine Medical maine
Children’s Hospital of New Orleans chnola
Rady Children’s Hospital rady
Children’s Hospital Los Angeles chla
Monroe Carell Jr. Children’s Hospital at Vanderbilt vandy

The output file is written to the same directory and, in our example, will be called my_address_file_geocoded_v3.0.2_dep_index_v0.1_drivetime_1.3.0_cchmc.csv.

Example output:

id address matched_street matched_zip matched_city matched_state lat lon score precision geocode_result fips_tract_id fraction_assisted_income fraction_high_school_edu median_income fraction_no_health_ins fraction_poverty fraction_vacant_housing dep_index drive_time distance
54000600136 5358 LILIBET CT DELHI TOWNSHIP OH 45238 Lilibet Ct 45238 Delhi Hills OH 39.11552 -84.61902 0.754 range geocoded 39061021303 0.0380034 0.9396114 83385 0.0236515 0.0250104 0.0128779 0.2087159 30 10219.326
13100070229 1922 CATALINA AV CINCINNATI OH 45237 Catalina Ave 45237 Cincinnati OH 39.17112 -84.46176 0.922 range geocoded 39061006300 0.1149033 0.8787645 38395 0.0391429 0.1641705 0.1284085 0.3569748 18 5004.925
11200020024 630 GREENWOOD AV CINCINNATI OH 45229 Greenwood Ave 45229 Cincinnati OH 39.15321 -84.49236 0.922 range geocoded 39061006800 0.3517316 0.8051400 19783 0.0579212 0.3901274 0.2309613 0.5527528 6 1755.939

More information on drivetime
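Steps 3 through 5 each take the previous step's output file as input, so the three calls can be chained. The following Python sketch shows one possible way to do this; it is an illustration under stated assumptions, not an official DeGAUSS tool. The image tags and output file suffixes are copied from the example calls and file names above and may differ for other container versions. All function names are hypothetical. Because the command is passed to Docker as a list of arguments, the same code should work on macOS and Windows without switching between "$PWD" and "%cd%".

```python
import subprocess
from pathlib import Path

# (image, extra arguments, output suffix) for each step. The suffixes are taken
# from the example file names in Steps 3 to 5; treat them as assumptions if you
# use other container versions or another care center abbreviation.
STEPS = [
    ("ghcr.io/degauss-org/geocoder:3.0.2", [], "_geocoded_v3.0.2"),
    ("ghcr.io/degauss-org/dep_index:0.1", [], "_dep_index_v0.1"),
    ("ghcr.io/degauss-org/drivetime:1.3.0", ["cchmc"], "_drivetime_1.3.0_cchmc"),
]


def docker_command(image, filename, extra_args, workdir):
    """Build the argument list for one call, mounting workdir at /tmp."""
    return ["docker", "run", "--rm", "-v", f"{workdir}:/tmp",
            image, filename, *extra_args]


def plan_pipeline(input_name, workdir):
    """Return (command, expected output name) per step, chaining file names."""
    plan = []
    current = input_name
    for image, extra_args, suffix in STEPS:
        output = f"{Path(current).stem}{suffix}.csv"
        plan.append((docker_command(image, current, extra_args, workdir), output))
        current = output
    return plan


def run_pipeline(input_name, workdir=None):
    """Run each step in order; stop if a call fails or its output is missing."""
    workdir = Path(workdir) if workdir else Path.cwd()
    for command, output in plan_pipeline(input_name, workdir):
        subprocess.run(command, check=True)
        if not (workdir / output).exists():
            raise FileNotFoundError(f"expected {output} in {workdir}")
```

Running run_pipeline("my_address_file.csv") from the directory that holds the input file would issue the same three docker calls shown above, in order. Use plan_pipeline first if you only want to inspect the commands and expected file names without running anything.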

Step 6: Removing PHI

Before sharing your data, remove the following columns: