me on twitter

brain of mat kelcey


how many terms in a trend?

May 11, 2010 at 07:46 PM | categories: e15, trending, algorithms, puzzled | View Comments

i've been poking around with a simple trending algorithm over the last few weeks and have uncovered a problem that, like most interesting ones, i'm not sure how to solve. the question revolves around discovering multi terms trends. a sensible place to start when looking for trends is to consider single terms but what if though we ended up with three equally trending terms 'happy', 'new' and 'year'? it's pretty obvious that the actual trend is 'happy new year' but what is the best way to express this as a single trend in an algorithmic sense?one approach i've been playing...
Read and Post Comments

trending topics in tweets about cheese; part2

May 01, 2010 at 04:54 PM | categories: twitter, trending, e15, pig | View Comments

prototyping in ruby was a great way to prove the concept but my main motivation for this project was to play some more with pig.the main approach will bemaintain a relation with one record per tokenfold 1 hours worth of new data at a time into the modelcheck the entries for the latest hour for any trendsthe full version is on github. read on for a line by line walkthrough!the ruby impl used the simplest approach possible for calculating mean and standard deviation; just keep a record of all the values seen so far and recalculate for each new value.for...
Read and Post Comments

trending topics in tweets about cheese; part1

April 27, 2010 at 11:42 PM | categories: cheese, twitter, trending, e15 | View Comments

what does it mean for a topic to be 'trending'? consider the following time series (430e3 tweets containing cheese collected over a month period bucketed into hourly timeslots)without a formal definition we can just look at this and say that the series was trending where there was a spike just before time 600. as a start then let's just define a trend as a value that was greater than was 'expected'.one really nice simple algorithm for detecting a trend is to say a value, v, is trending if v > mean + 3 * standard deviation of the data seen...
Read and Post Comments

old projects...