Washington DC and the murderer's work ethic

2012-09-20

Continuing what has turned out to be a fun hobby of looking at crime data for different cities (probably among the most harmless of crime-related hobbies, as long as you aren't taking important decisions based on naive interpretations of badly understood data), I went to data.dc.gov and downloaded Crime Incident data for the District of Columbia for the year of 2011.

Mapping it was the obvious move, but I already did that for Chicago (and Seattle, although there were issues with the data, so I haven't posted anything yet), so I looked at an even more basic dimension: the time series of different types of crime.

To begin with, here's the week-by-week normalized count of thefts (not including burglary, and thefts from cars) in Washington DC (click to enlarge):

I normalized this series by shifting it to its mean and scaling it by its standard deviation — not because the data is normally distributed (it actually shows a thick left tail), but because I wanted to compare it with another data series. After all, the form of the data, partial as it is, suggests seasonality, and as the data covers a year, it wants to be checked against, say, local temperatures.

Thankfully NOAA offers just this kind of data (through about half a dozen confusingly overlapping interfaces), so I was able to add to the plot the mean daily temperature for DC (normalized in the same way as the theft count):

The correlation looks pretty good! (0.7 adj. R squared, if you must know it). Not that this proves any sort of direct causal chain, that's what controlled experiments are for, but we can postulate, e.g., a naive story where higher temperatures mean more foot traffic (I've been in DC in winter, and the neoclassical architecture is not a good match for the latitude), and more foot traffic leads to richer pickings for thieves (an interesting economics aside: would this mean that the risk-adjusted return to crime is high enough that crime is, as it were, constrained by the victims supply?)

Now let's look at murder.

The homicide time series is quite irregular, thanks to a relatively low (for, say Latin American values of 'low') average homicide count, but it's clear enough that there isn't a seasonal pattern to homicides, and no correlation with temperature (a linear fitting model confirms this, not that it was necessary in this case). This makes sense if we imagine that homicide isn't primarily an outdoors activity, or anyway that your likelihood of being killed doesn't increase as you spend more time on the street (most likely, whoever wants to kill you is motivated by reasons other than, say, an argument over street littering). Murder happens come rain or snow (well, I haven't checked that; is there an specific murder weather?)

Another point of interest is the spike of (weather-normalized) theft near the end of the year. It coincides roughly with Thanksgiving, but if that's the causal link, I'd be interested in knowing exactly what's going on.

None

None