Chicago and the Tree of Crime

2012-08-27

After playing with a toy model of surveillance and surveillance evasion, I found the City of Chicago's Data Portal, a fantastic resource with public data including the salaries of city employees, budget data, the location of different service centers, public health data, and quite detailed crime data since 2001, including the relatively precise location of each reported crime. How could I resist playing with it?

To simplify further analysis, let's quantize the map into a 100x100 grid. Here's, then, the overall crime density of Chicago (click to enlarge):

This map shows data for all crime types. One first interesting question is whether different crime types are correlated. E.g., do homicides tend to happen close to drug-related crimes? To look a this, I calculated the correlation between the different types of crimes at the same point of the grid, and from that I built a "tree of crime." Technically called a dendogram, this kind of plot is akin to a phylogenetic tree, and in fact it's often used to show evolutionary relationships. In this case, the tree shows the closeness or not, in terms of geographical correlation, between types of crimes: the closer two types of crime are in the tree, the more likely they are to happen in the same geographical area (click to enlarge).

A few observations:

For law enforcement — as for everything else — data analysis is not a silver bullet, and pretending it is can lead to shooting yourself in the face with it (the mixed metaphor, I hope, is warranted by the topic). But it can serve as a quick and powerful way to pose questions and fight our own preconceptions, and, perhaps specially in highly emotional issues like crime, that can be a very powerful weapon.