GIS and Environmental Health

Exercise 4:  Spatial Statistics Lab

Goal:  In this exercise, you will explore a number of different spatial statistics.  You will get some hands-on experience with a program called Crimestat (http://www.icpsr.umich.edu/NACJD/crimestat.html), a free program originally designed, as its name suggests, for the analysis of crime locations.  The program is convenient because it reads and writes shapefiles for ArcGIS.  We will consider another hypothetical problem:  racial/ethnic differences in infant mortality in Salt Lake County.

Context:  According to the National Center for Health Statistics, Utah has a slightly lower infant mortality rate than the US average: 5.0 deaths per 1000 live births in Utah versus 6.9 per 1000 for the US in the year 2001.  While this might be a relief to would-be parents living in Utah, there's more to this story, and it is serious cause for concern.  Nationwide, there is a substantial difference in infant mortality by race/ethnicity.  The risk of a black infant dying is more than twice that of a white infant.

You will calculate statistics that characterize the location of infant deaths in Salt Lake County in relationship to their underlying racial/ethnic population distributions.

 

STEP 1

Download the following datasets to your computer (remember to use the C:\GeogHealth\<your name> directory as before):

Link to data here: exercise4.zip

Here's a description of the shapefiles:

    Salt Lake County censustracts, 2000
    whiteinfantdeaths - infant death locations in Salt Lake County for white race/ethnicity
    blackinfantdeaths - infant death locations in Salt Lake County for black race/ethnicity

Note: the infant death data are fake, though a county health department would have similar real data.

You will need to uncompress the zip file using the Winzip program.

 

STEP 2

Start up ArcCatalog.  

From the ArcCatalog menu, select File-Connect Folder.  Select the directory where you stored the shapefiles you downloaded.

 

STEP 3

Let's first explore the racial distributions of white and black populations in Salt Lake County.

Start up ArcMap with a new blank map.  Drag the Salt Lake County censustracts shapefile from ArcCatalog into the Table of Contents of ArcMap.  You should see a map of census tracts in Salt Lake County.  

Where do you think most blacks live?  How about whites?  Create a choropleth map for the black population (do you remember how from exercises 1 and 2?).   The attribute field for the black population is called  "Black".  After you've created the choropleth map for blacks, try creating one for whites.  How are they distributed across Salt Lake County?

Let's now compute central tendency (spatial mean) and dispersion statistics (spatial standard deviation and standard deviational ellipse) for the two populations.  We will use the centroid of each census tract as a "point" estimate of where its population is located.  We will do this in Crimestat, but first we need to generate x,y coordinates for the census tract centroids.  Right-click on the Salt Lake County censustracts layer and open the attribute table.  From the attribute table, press "options" and select "add field".  Set the name to "longitude", type=float, precision=10, and scale=5  (remember this means a single precision floating point number -- a number with decimal places that's 10 digits long with 5 digits to the right of the decimal point), and press OK to add an empty field called "longitude" to the attribute table.  Repeat the add field commands to add another field called "latitude".

You should now have two new columns at the end of the table called latitude and longitude.  Right-click the "longitude" heading on the table, select "calculate values", and press "yes" to ignore the warning.  This opens the field calculator, which allows you to compute the values of the longitude field.  Put a checkmark where it says "advanced".  Then copy and paste the following calculation into the "pre-logic VBA script code" window:

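' Variable that will hold the centroid coordinate (the name dblArea is arbitrary)
Dim dblArea as double
' IArea is the ArcObjects interface that exposes a polygon's centroid
Dim pArea as IArea
Set pArea = [shape]
' Store the x coordinate (longitude) of the tract centroid
dblArea = pArea.centroid.x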

Then in the window where it says "longitude =", type this:

    dblArea

Then press OK to compute the longitudes.  The code you copied and pasted is Visual Basic for Applications (VBA) programming code.  ArcGIS has an embedded programming language that allows for all sorts of programming trickery.  Here we used it to compute the x location of the centroid of each census tract polygon.  It's a bit beyond the level of the class to expect you to learn VBA in addition to using a GIS, so don't hesitate to ask me for help when you need it.
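If you're curious what a centroid calculation involves, here is a minimal Python sketch.  It is not part of the lab, and it is not necessarily how ArcGIS computes centroids internally.  It applies the standard area-weighted centroid formula for a simple polygon; the function name polygon_centroid and the sample coordinates are made up for illustration.

```python
def polygon_centroid(vertices):
    """Area-weighted centroid of a simple (non-self-intersecting) polygon.

    vertices is a list of (x, y) tuples in order around the boundary;
    the last vertex is implicitly joined back to the first.
    """
    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0      # contribution of this edge to 2 * signed area
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("polygon has zero area")
    # The centroid formula divides by 6 * area, which equals 3 * twice_area
    return cx / (3.0 * twice_area), cy / (3.0 * twice_area)


# Hypothetical "tract": a 2 x 2 square with its lower-left corner at the origin
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(polygon_centroid(square))
```

The same function returns both coordinates at once, which is why the field calculator code only differs by ".x" versus ".y".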

Repeat the field calculation step for the latitude field, except that when you enter the code, use a "y" instead of an "x", like this (and again type dblArea in the window where it says "latitude ="):

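' Same variable as before; this time it holds the y coordinate
Dim dblArea as double
Dim pArea as IArea
Set pArea = [shape]
' Store the y coordinate (latitude) of the tract centroid
dblArea = pArea.centroid.y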

After making the field calculations, you should see that the longitude and latitude columns have been filled in with decimal degrees.  Let's now export the table to a dbf file that can be read by Crimestat.  While the attribute table is still open, click on options, and select export.  Save the table under a meaningful file name in your GeogHealth folder.  When asked whether you want to add the table to ArcMap, select No.

Now, start Crimestat.  Click on the file crimestat.exe in your data files so that Crimestat will run.  I am not sure whether our lab computers will allow this.  If not, save the dbf table to a storage disk so that you can continue at home.  Click past the title screen to get to the main interface.  Notice how the functionality in Crimestat is arranged into tabs.  First you set up your data file, then you can do different analyses: spatial description and/or spatial modeling.

Where it says primary file, press "select files".  Set type to "dbf", and give it the filename for the table you just saved, then press OK.  Then where it asks for  X and Y, set the file to your dbf file, and set column for x to be longitude, and latitude for y.  Then where it asks for Z give it the "White" population field.  That way the spatial mean we're about to calculate will be weighted by the white population. Coordinate system is "spherical" since we are giving angular decimal degrees coordinates.
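To see what "weighted by the white population" means, here is a minimal Python sketch of a weighted mean center and weighted standard deviations.  It treats the coordinates as plain x,y numbers, which is a simplifying assumption; Crimestat's handling of spherical coordinates is not reproduced here.  The function names and the sample values are hypothetical.

```python
import math


def weighted_mean_center(xs, ys, weights):
    """Mean x and y, with each point counted in proportion to its weight."""
    total = sum(weights)
    mean_x = sum(w * x for x, w in zip(xs, weights)) / total
    mean_y = sum(w * y for y, w in zip(ys, weights)) / total
    return mean_x, mean_y


def weighted_std(xs, ys, weights):
    """Weighted standard deviation of x and of y around the weighted mean center."""
    mean_x, mean_y = weighted_mean_center(xs, ys, weights)
    total = sum(weights)
    var_x = sum(w * (x - mean_x) ** 2 for x, w in zip(xs, weights)) / total
    var_y = sum(w * (y - mean_y) ** 2 for y, w in zip(ys, weights)) / total
    return math.sqrt(var_x), math.sqrt(var_y)


# Two hypothetical tract centroids; the second has three times the population
xs = [0.0, 10.0]
ys = [0.0, 0.0]
population = [1, 3]
print(weighted_mean_center(xs, ys, population))
print(weighted_std(xs, ys, population))
```

Notice that the mean center is pulled toward the tract with the larger population.  With no Z field, every point effectively gets a weight of 1.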

Then click on the "spatial description" tab at the top.  Under "spatial distribution", put checkmarks next to:  mean center, standard deviational ellipse, Moran's I and Geary's C.
For mean center and standard deviational ellipse, press "save result to".  Save the output to "Arcview Shp" and give the files names like mcsd_white (for the mean center) and sde_white (for the ellipse).  This will tell Crimestat to make shapefiles of the mean center and ellipse for us to view in ArcMap afterwards.

Then click compute.

You should get a text listing of the results:
    Mcsd = mean center, standard deviation
    sde = standard deviational ellipse
    Moran's I and Geary's C

Can you figure out the x and y location of the mean center?  How about the standard deviation in the x and y directions?  What's the Moran's I?  What does it mean?  How about Geary's C?
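As a reference for interpreting these two statistics, here is a minimal Python sketch of the textbook formulas for Moran's I and Geary's C.  It assumes inverse-distance weights between every pair of points, which is an assumption of this sketch and may not match the weighting Crimestat applies.  The function names and sample values are hypothetical.

```python
import math


def inverse_distance_weights(xs, ys):
    """Weight matrix with w[i][j] = 1 / distance; 0 on the diagonal and for coincident points."""
    n = len(xs)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d = math.hypot(xs[i] - xs[j], ys[i] - ys[j])
                if d > 0:
                    w[i][j] = 1.0 / d
    return w


def morans_i(values, w):
    """Moran's I: values above its expected value, -1/(n-1), suggest similar values lie near each other."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)          # sum of all weights
    num = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)


def gearys_c(values, w):
    """Geary's C: values below 1 suggest similar values lie near each other."""
    n = len(values)
    mean = sum(values) / n
    s0 = sum(sum(row) for row in w)
    num = sum(w[i][j] * (values[i] - values[j]) ** 2
              for i in range(n) for j in range(n))
    den = sum((v - mean) ** 2 for v in values)
    return ((n - 1) / (2.0 * s0)) * (num / den)


# Four hypothetical tract centroids along a line, with two made-up population patterns
xs, ys = [0.0, 1.0, 2.0, 3.0], [0.0, 0.0, 0.0, 0.0]
w = inverse_distance_weights(xs, ys)
print(morans_i([1, 1, 5, 5], w), gearys_c([1, 1, 5, 5], w))   # similar values side by side
print(morans_i([1, 5, 1, 5], w), gearys_c([1, 5, 1, 5], w))   # values alternate
```

The two statistics move in opposite directions: when neighbors have similar values, Moran's I goes up and Geary's C goes down.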

Repeat the Crimestat steps for the Black population, this time giving it the "Black" field for Z.  When you save the mean center and standard deviational ellipse, be sure to give them different names like mcsd_black and sde_black.

Now let's see what these look like in ArcMap.  In ArcCatalog, look for the shapefiles you just created in Crimestat.  Crimestat adds a prefix to the names you gave, so they should look something like:
MCmcsdwhite.shp (mean center for the white population)
MCmcsdblack.shp (mean center for the black population)
SDDmcsdwhite.shp (standard deviation for the white population)
SDDmcsdblack.shp (standard deviation for the black population)
SDEwhite.shp (standard deviational ellipse for the white population)
SDEblack.shp (standard deviational ellipse for the black population)

Drag them into ArcMap's Table of Contents.

Describe the differences between the statistics for the two populations.  Are the centers located where you thought they would be based on the choropleth maps you saw earlier?

 

STEP 4

Now take a look at the whiteinfantdeaths and blackinfantdeaths shapefiles in ArcMap.  Do the patterns of mortality seem to match their underlying ethnic populations?

Let's compute the same central tendency and dispersion statistics in Crimestat for these cases.

Where it says primary file, press "select files".  Set type to "shp", and give it the filename whiteinfantdeaths, then press OK.  Then where it asks for  X and Y, set the file to whiteinfantdeaths, and set column to X and Y, respectively.  This time, do not give it a Z field since we want Crimestat to use just the locations of the cases, and not weight them by any value.  Coordinate system is "Projected (Euclidean)" since the shapefile is spatially referenced to UTM.

Then compute the mean center, standard deviation, and standard deviational ellipse.  Be sure to press "save result to" and save the results to a meaningful shapefile name.  Note that you cannot run Moran's I and Geary's C because we are looking purely at location this time, so there is no attribute value for which to measure spatial autocorrelation.

Repeat for blackinfantdeaths.

After you've computed the statistics load the shapefiles into ArcMap.  Compare the statistics to those previously computed for the underlying population.
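To get a feel for what the standard deviational ellipse summarizes, here is a minimal Python sketch for unweighted points.  It derives the ellipse axes and orientation from the covariance of the x and y coordinates, which is one common formulation; Crimestat's own formulas apply their own scaling conventions, so its numbers may differ from this sketch.  The function name and sample points are hypothetical.

```python
import math


def deviational_ellipse(xs, ys):
    """Return (center, major_sd, minor_sd, angle_degrees) for a set of points.

    angle_degrees is the direction of the major axis, measured
    counterclockwise from the x axis.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Variances and covariance of the coordinates (dividing by n)
    sxx = sum((x - mean_x) ** 2 for x in xs) / n
    syy = sum((y - mean_y) ** 2 for y in ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    # Orientation of the direction with the largest spread
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    # Eigenvalues of the 2 x 2 covariance matrix are the variances along the two axes
    mid = (sxx + syy) / 2.0
    half_diff = math.hypot((sxx - syy) / 2.0, sxy)
    major_sd = math.sqrt(mid + half_diff)
    minor_sd = math.sqrt(max(mid - half_diff, 0.0))
    return (mean_x, mean_y), major_sd, minor_sd, math.degrees(angle)


# Hypothetical case locations strung out along a diagonal (projected coordinates)
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 2.0, 3.0]
print(deviational_ellipse(xs, ys))
```

A long, narrow ellipse means the cases are spread mostly along one direction, while a nearly circular ellipse means the spread is similar in all directions.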

 

STEP 5

Let's now look at clustering of the mortality cases.  Let's do a Nearest Neighbor Analysis.

Where it says primary file, press "select files".  Set type to "shp", and give it the filename whiteinfantdeaths, then press OK.  Then where it asks for  X and Y, set the file to whiteinfantdeaths, and set column to X and Y, respectively.  This time, do not give it a Z field since we want Crimestat to use just the locations of the cases, and not weight them by any value.  Coordinate system is "Projected (Euclidean)" since the shapefile is spatially referenced to UTM.

If you have anything checked on the "spatial distribution" tab, uncheck it.  Then click on the "distance analysis" tab.  Put a checkmark next to nearest neighbor, and select only 1 nearest neighbor (if you entered a larger number here, you'd be doing a K-th Order Nearest Neighbor analysis, which we talked about in the lecture).  Let's not worry about border correction.

Press "compute" to run the statistics.

What's the nearest neighbor result?  Does it indicate clustering or not?
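As a reminder of how the nearest neighbor index is built, here is a minimal Python sketch.  It divides the observed mean nearest-neighbor distance by the distance expected for a random pattern of the same density, 0.5 * sqrt(area / n).  The study-area size is passed in by hand here, which is an assumption of this sketch; the function names and sample points are hypothetical, and no border correction is applied.

```python
import math


def mean_nearest_neighbor_distance(points):
    """Average distance from each point to its closest other point (brute force)."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        nearest = min(math.hypot(xi - xj, yi - yj)
                      for j, (xj, yj) in enumerate(points) if j != i)
        total += nearest
    return total / len(points)


def nearest_neighbor_index(points, area):
    """Observed mean nearest-neighbor distance divided by the random expectation.

    An index below 1 suggests clustering, near 1 a random pattern,
    and above 1 a dispersed pattern.
    """
    observed = mean_nearest_neighbor_distance(points)
    expected = 0.5 * math.sqrt(area / len(points))
    return observed / expected


# Hypothetical cases: two tight pairs far apart inside a 10 x 10 study area
clustered = [(0.0, 0.0), (0.1, 0.0), (10.0, 10.0), (10.1, 10.0)]
print(nearest_neighbor_index(clustered, area=100.0))

# Hypothetical cases: evenly spaced on the corners of a square inside a 2 x 2 study area
dispersed = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(nearest_neighbor_index(dispersed, area=4.0))
```

Because the expected distance depends on the area, the index changes if you change the study-area size, so keep that in mind when comparing results.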

Repeat for blackinfantdeaths.

Email me your findings.

Congratulations, you are all done!  Enjoy the week!


When you have a chance you might want to go to the Crimestat website listed above, and check out the pdf manual.  It's a great intro primer on spatial stats, with examples (albeit, crime examples).