brain of mat kelcey...


friend clustering by term usage

June 25, 2010 at 11:39 PM | categories: Uncategorized

recently signed up to the infochimps api and wanted to do something quick and dirty to get a feel for it.

so here's a little experiment

  1. get the people i follow on twitter
  2. look up the words that "represent" them according to the infochimps word bag api
  3. build a similarity matrix based on the common use of those terms
  4. plot the connectivity for the top 30 or so pairings
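
steps 3 and 4 can be sketched roughly like this. it's a minimal illustration rather than the actual code from this experiment: it assumes each person's word bag has already been fetched into a plain dict, and it uses jaccard overlap as the similarity measure, which is my assumption here. the names `jaccard`, `top_pairs` and the sample users are all made up for the example.

```python
from itertools import combinations


def jaccard(a, b):
    """overlap of two term collections: |a & b| / |a | b| (0.0 if both are empty)"""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def top_pairs(word_bags, n=30):
    """score every pair of users and return the n most similar as (score, u, v)"""
    scored = [
        (jaccard(word_bags[u], word_bags[v]), u, v)
        for u, v in combinations(sorted(word_bags), 2)
    ]
    # highest score first; ties broken by name so the output is stable
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))
    return scored[:n]


# hypothetical word bags, purely for illustration
word_bags = {
    "alice": ["hadoop", "hbase", "nosql"],
    "bob": ["hbase", "mongodb", "nosql"],
    "carol": ["travel", "community"],
}

pairs = top_pairs(word_bags, n=2)
for score, u, v in pairs:
    print(f"{u} -- {v}  {score:.2f}")
```

the resulting list of pairs is effectively an edge list, so it can be handed to whatever graph plotting tool is at hand to draw the connectivity.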

the resulting plot is basically partitioned into three groups...

  1. veztek (my boss john) and smcinnes (steve from the lonely planet community team) in the top right
  2. a big clump of nosqlness with mongodb - hbase - jpatanooga - kevinweil in the bottom left
  3. everyone else

an interesting enough result given the time taken; the code's on github