trending topics
what does it mean for a topic to be ‘trending’? consider the following time series (430e3 tweets containing ‘cheese’, collected over a one month period and bucketed into hourly timeslots)
without a formal definition we can just look at this and say that the series was trending where there is a spike just before time 600. as a start, then, let’s define a trend as a value that is greater than ‘expected’.
how can we calculate trending?
one really nice, simple algorithm for detecting a trend is to say a value, v, is trending if v > mean + 3 * standard deviation, where the mean and standard deviation are those of the data seen so far. (thanks @peteskomoroch for the suggestion, works a treat)
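here is a minimal python sketch of that rule, not a definitive implementation. a few things in it are assumptions of mine rather than anything stated above: the function name trending_flags, the parameters k and min_history, comparing each value against the statistics of the values before it (so a spike does not inflate its own threshold), and using the population standard deviation. the running mean and variance are kept with welford's online update so the series is only read once.

```python
import math

def trending_flags(values, k=3.0, min_history=2):
    """Return one (mean, threshold, is_trending) tuple per value, in order.

    Assumption: each value is compared against the mean and population
    standard deviation of the values seen *before* it.
    """
    n = 0        # number of values seen so far
    mean = 0.0   # running mean of values seen so far
    m2 = 0.0     # running sum of squared deviations from the mean
    results = []
    for v in values:
        if n >= min_history:
            std = math.sqrt(m2 / n)
            threshold = mean + k * std
            results.append((mean, threshold, v > threshold))
        else:
            # not enough history yet to say what was 'expected'
            results.append((mean, None, False))
        # Welford's online update of the running mean and variance
        n += 1
        delta = v - mean
        mean += delta / n
        m2 += delta * (v - mean)
    return results

# made-up hourly counts, purely for illustration
counts = [10, 12, 11, 9, 10, 11, 50, 10]
flagged = [i for i, (_, _, hit) in enumerate(trending_flags(counts)) if hit]
print(flagged)  # → [6]
```

note that when the history is perfectly flat the standard deviation is zero, so any value above the mean gets flagged. a larger min_history is one way to soften that.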
let’s consider the same time series as before, but this time with some overlaid data:
green – the mean
red – minimum trend value ( = mean + 3 * std dev )
blue – instances where the value > minimum trend value

