Latent mini-clusters of movies

2013-07-24

Still playing with logical itemset mining, I downloaded one of the data sets from Group Lens that records movie ratings from MovieLens. The basic idea is the same as with clustering drug side effects: movies that are consistently ranked similarly by users are linked, and clusters in this graph suggest "micro-genres" of homogeneous (from a ratings POV) movies.

Here are a few of the clusters I got, practically with no fine-tuning of parameters:

As movie clusters go, these are not particularly controversial; I found it interesting how originals and sequels or remakes seemed to be co-clustered, at least superficially. And thinking about it, clustering Apt Pupil with both Lord of the Flies movies is reasonable...

Media recommendation is by now a relatively mature field, and no single, untuned algorithm is going to be competitive against what's already deployed. However, given the simplicity and computational manageability of basic clustering and recommendation algorithms, I expect they'll become even more ubiquitous over time (pretty much as how autocomplete in input boxes did).