The Tool Belt Approach

Firstly, it’s been a while since I’ve blogged.

I’ve been busy.

My wife and I bought a two-family home in South Medford, Massachusetts a couple of months ago. It’s a nice little place, in a walkable neighborhood with access to transit, and it’s only 4.5 miles from downtown Beantown (no one here calls it Beantown). The home, as they say, is a “fixer-upper” and both units need lots of work. My wife and I have been spending every waking moment doing yard work, rebuilding our first floor unit’s kitchen and bathroom, painting walls, doing demolition in our basement, working with our plumber and electrician as they rewire and replumb the entire home, negotiating the city hall permitting process, et cetera, et cetera, et cetera…

During the past couple of months not only have I learned a ton of new homeowner stuff, but I have also acquired a ton of new tools (consignment tool shops are the best place ever). Thankfully, I grew up in a very handy family, so I’m not totally in the dark when it comes to home improvement, and these tools come in handy. I’ve learned that not every project needs every tool. Before I start a project I scope out what I need to get done, load up my tool belt and get to work. I don’t haul the entire toolbox (or toolboxes) to the project each time.

My tool belt is a wonderful thing. It is lightweight, I only load up what I need for the specific project, and it forces me to think about my project and make the right planning decisions.

I see so many parallels between my tool belt project approach and what I try to do as a geo-professional.

In the spatial world we often get tied to the idea of the toolbox(es) when working on analysis projects. Toolboxes, whether geo-toolboxes or regular toolboxes, are often full of tools one doesn’t need for a specific project, and sometimes they can be full of tools we use improperly (how many of us have actually used Kriging in the right context, or tried to use a flathead screwdriver as a chisel?). Without proper planning (planning out a project before you even start), one may use the tools in the toolbox incorrectly, perhaps coming to a less-than-correct conclusion.

We, as geo-professionals, will be much better at what we do if we first learn how to solve the problems and answer the questions related to the projects we work on, instead of trying to know how to use every tool in our toolbox. Yes, there will always be the plumbers, contractors, and electricians who have every tool that there could ever be related to their job, just as there will be those all-knowing GIS gurus. However, the vast majority of geo-professionals are those who do other things and not “all GIS, all the time.” I really believe that by using the tool belt approach we can develop a better class of geo-professionals. Understand your problem, do the research to solve it, and then load your tool belt with the proper tools to solve it. And good, detailed geospatial analysis, like good, detailed home improvement, never goes as fast as it does on HGTV.

Now, where did I put my hammer?

Spatial Random Sample, Sample

Often, when performing spatial analysis, one may need to execute some type of sampling across space. For example, one may need to sample locations across a geographically continuous surface (think soils, anything weather related, etc.). A spatial random sample can be used to select locations without bias. With a simple Python script one can develop a spatial random sample with relative ease. In this post I will cover a few definitions, provide a code sample, and discuss some additional points.

First, a few definitions:

Random Number: A number chosen as if by chance from some specified distribution such that selection of a large set of these numbers reproduces the underlying distribution.

Statistical Randomness: A numeric sequence is said to be statistically random when it contains no recognizable patterns or regularities; sequences such as the results of an ideal dice roll, or the digits of π exhibit statistical randomness.

Simple Random Sample: A sample in which every element in the population has an equal chance of being selected.
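To make that last definition concrete, here is a minimal sketch of a simple random sample using Python's standard random module. The population of 100 numbered sites and the seed value are made up purely for illustration.

```python
import random

# Hypothetical population: 100 numbered sampling sites.
population = list(range(1, 101))

# Seeding makes the pseudo-random draw repeatable; leave it out for a fresh draw.
rng = random.Random(42)

# Draw 10 sites without replacement; every site is equally likely to be chosen.
sample = rng.sample(population, 10)
print(sample)
```

Because the draw is made without replacement, no site can show up twice in the sample.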

Second, what is a spatial random sample?

Spatial Random Sample: Locations obtained by choosing x-coordinates and y-coordinates at random (p. 58). Any points that do not intersect the landform will be dropped from the list of random points.  

Third, give me some Python code to do this!

import random
from time import strftime

# Open the output file; 'w' mode creates it or overwrites an existing copy
f = open("C:\\Data\\output\\spatial_random_sample.csv", 'w')

# How many points will be generated
numpoints = random.randint(0, 1000)

# Create the bounding box
# Set longitude values - X values
minx = -180
maxx = 180

# Set latitude values - Y values
miny = -23.5
maxy = 23.5

print("Start Time: " + strftime("%a, %d %b %Y %H:%M:%S"))

# Write the column headers
f.write("ID,X,Y\n")

# Write one row per point: an ID plus a random X and Y inside the bounding box
for point_id in range(numpoints):
    x = random.uniform(minx, maxx)
    y = random.uniform(miny, maxy)
    f.write("%d,%s,%s\n" % (point_id, x, y))
f.close()

print("Script Complete, Hooray! %d random points generated" % numpoints)
print("End Time: " + strftime("%a, %d %b %Y %H:%M:%S"))

This quick, dirty and very simple script does a few things. First, it creates a csv file in a local directory, and by using the ‘w’ mode the file will be created if it doesn’t exist and will be overwritten every time the code is run (so be careful).

Next, the code selects a random number of points to be generated. In this case it will be a random integer between zero and 1,000. The user then sets the bounding box that will contain the points. If using ArcPy and ArcGIS the user could easily set the bounding box to that of a particular layer. In this example, the box simply spans -180 to 180 degrees of longitude and the approximate latitudes of the Tropic of Capricorn and the Tropic of Cancer (-23.5 to 23.5 degrees).

The next block of code will generate the random number of points in the specified ranges and print them to a CSV file. The output is fairly straightforward: three columns, an ID field and X and Y. The user can open the file in OpenOffice as they could any other CSV file.
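If you would rather reuse these steps than edit a script each time, they can be wrapped in a function. This is only a sketch, assuming Python 3 and the standard csv module. The function name write_random_points and its parameters are my own inventions for this example, and the optional seed is there so a sample can be reproduced.

```python
import csv
import io
import random


def write_random_points(out_file, numpoints, minx, maxx, miny, maxy, seed=None):
    """Write numpoints random X/Y pairs inside the bounding box as CSV rows."""
    rng = random.Random(seed)  # a fixed seed gives a repeatable sample
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(["ID", "X", "Y"])
    for point_id in range(numpoints):
        writer.writerow([point_id, rng.uniform(minx, maxx), rng.uniform(miny, maxy)])


# Usage: write five points to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
write_random_points(buffer, 5, -180, 180, -23.5, 23.5, seed=1)
print(buffer.getvalue())
```

To write to disk instead, pass in a file opened with open(path, "w", newline=""). Writing to any file-like object also makes the function easy to try out without touching the file system.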

Well, that’s great. The user can easily visualize this data in Quantum GIS using the Add Delimited Text Layer tool from the Layer menu. Since the output was formatted with X and Y fields, the tool will populate its X and Y field settings on its own.

Once the user clicks OK the points will be added to the map.  From there the user can export the data to any number of formats and perform their analysis.

As you can see, it is pretty easy to generate random points with the script. In fact, ArcMap and Quantum GIS have tools that will do this, but both run much slower than just creating a simple spatial random sample as demonstrated here, as they have many more options than this simple script. Also, the Arc version will only work if the user has ArcEditor or the Spatial Analyst extension. The folks at SpatialEcology also have a tool that will do this within ArcMap, and I am sure there are other tools out there too.

But before we wrap this up, here are a couple notes:

  • This is a simple example, and not intended to be an “end-all, be-all example”.
  • Python generates pseudo-random values.
  • The points that are generated have an equal chance of being created, meaning that whatever is being sampled with those coordinates has an equal chance of being selected as well.
  • The script presented here does not check against any boundaries, only a bounding box.
  • The above code can easily be extended to work within ArcPy and ArcGIS.  I can post the code later on if there is interest.
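On the boundary point above: the definition of a spatial random sample says that points falling outside the landform get dropped. One way to sketch that step in plain Python is a ray-casting point-in-polygon test. The triangular study_area, the seed, and the function name point_in_polygon are all hypothetical, and the test treats coordinates as planar. For real boundaries you would more likely lean on your GIS software or a geometry library.

```python
import random


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) falls inside the polygon's vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray running right from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical study area: a triangle given as (x, y) vertices.
study_area = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]

# Bounding box of the study area.
minx = min(vx for vx, vy in study_area)
maxx = max(vx for vx, vy in study_area)
miny = min(vy for vx, vy in study_area)
maxy = max(vy for vx, vy in study_area)

# Generate candidates in the bounding box, then drop those outside the polygon.
rng = random.Random(7)  # seeded so the pseudo-random sample is repeatable
candidates = [(rng.uniform(minx, maxx), rng.uniform(miny, maxy)) for _ in range(200)]
kept = [(x, y) for x, y in candidates if point_in_polygon(x, y, study_area)]
print(len(kept), "of", len(candidates), "points fall inside the study area")
```

Keep in mind that dropping points means you end up with fewer than you asked for, so if you need an exact sample size you would keep generating until enough points land inside.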

GISDoctor Spatial Analysis Post Series

There once was a well-known GIS blog post that compared geographic information systems to word processors. No matter what you think about the post, we will always need people who are skilled at “writing” and have something to “write” about.

As I have said before, and will say again, if you are using GIS technologies you should have a grasp on the fundamentals.  You wouldn’t write a paper or a report without a grasp on the basics of the topic or without a knowledge of writing in general.  So, to improve the world’s GIS grammar (or at least my own), I will be posting a number of spatial analysis related topics over the course of the next few months.  Here are a few of the topics I will cover:

  • Data classification schemes
  • Understanding spatial random samples
  • Topology, from a spatial point of view
  • The basics of projections
  • Avoiding false accuracy
  • Using root mean square
  • Geary’s c and Moran’s I
  • The First Law of Geography
  • Spatial autocorrelation
  • and many more…

I’ll use a variety of software, data, and problems to explain these topics, in order to expose the reader to the broad language of GIS.

Great story in the New York Times

In case you missed it, there was a great story in the New York Times on Tuesday about the use of GIS in historical analysis. The article, “Digital Maps are Giving Scholars the Historical Lay of the Land”, by Patricia Cohen, discusses the evolution of the spatial humanities and historical GIS/geography, which are growing disciplines in the humanities at colleges and universities around the country.

The article provides a nice overview of how historians, archaeologists, and other non-geographers have embraced spatial analysis and GIS in their research.  I think this is a great article on a trend in the humanities that has been growing for years.  I remember as an undergrad ten years ago developing GIS tools to visualize historical settings.  In grad school I routinely helped non-geographers develop spatial analysis methodologies and visualization techniques to process and analyze historical GIS data.  Much of that work ended up in scholarly publications.  The spatial component really gave the authors an edge over other papers at that time.

So, if you get a few minutes check the article out.  Anytime that GIS gets mentioned in the New York Times is great for our field!

A couple of notes from the article:

  • The author references www.gis.com.  I’m sure the marketing department at Esri liked the link.
  • The article links to David Rumsey’s site.  If you are a map junkie like myself you will love this site.  An amazing map collection.  This site has really influenced the development of online map libraries around the world.
  • I wonder if the growth of GIS and spatial analysis in the humanities (which has been happening for a number of years) has increased enrollment and/or developed programs in GIS and geography at schools where the spatial humanities are strong.  The AAG should get on this.
  • Does anyone remember the digital landscape history of Manhattan that was put together a couple of years ago?  The project gained some press and buzz when it came out.  I’m surprised this article didn’t mention it.  Oh well…