Posts Tagged ‘r’

moving average of a time series in R

Tuesday, June 15th, 2010

in this case with a sliding window of 3 elements

> x = c(3,1,4,1,5,9,2,6,5,3,5,8)
> ra_x = filter(x, rep(1,3)/3)
> ra_x
Time Series:
Start = 1
End = 12
Frequency = 1
 [1]       NA 2.666667 2.000000 3.333333 5.000000 5.333333 5.666667 4.333333
 [9] 4.666667 4.333333 5.333333       NA
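
a small sketch of what that call is doing, assuming base R's stats::filter (the default window is centred, which is why both ends come out NA). the by_hand loop and the trailing variant are illustrative additions, not part of the original session.

```r
x <- c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8)
w <- 3

# centred window (the default, sides = 2): same call as above,
# spelled stats::filter in case another package masks filter()
centred <- stats::filter(x, rep(1, w) / w)

# the same thing by hand: mean of each element and its two neighbours
by_hand <- rep(NA_real_, length(x))
for (i in 2:(length(x) - 1)) {
  by_hand[i] <- mean(x[(i - 1):(i + 1)])
}

# trailing window (sides = 1): mean of each element and the two before it,
# so the NAs move to the first two positions
trailing <- stats::filter(x, rep(1, w) / w, sides = 1)
```

the trailing form is usually the one to use when the average at time t should not peek at t+1.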

e11.3 at what time does the world tweet?

Wednesday, October 28th, 2009

consider the graph below which shows the proportion of tweets per 10 min slot of the day (GMT0)

it compares 4.7e6 tweets with any location vs 320e3 tweets with identifiable lat lons

[graph: timeslices_freq.comparison]
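
a minimal sketch of how the proportion per 10 min slot could be computed, assuming the tweet times are available as POSIXct values in GMT. the timestamps here are made up for illustration and this is not the code used for the graph.

```r
# hypothetical input: four made-up tweet timestamps in GMT
ts <- as.POSIXct(c("2009-10-28 00:03:10", "2009-10-28 00:07:59",
                   "2009-10-28 00:12:00", "2009-10-28 17:45:30"),
                 tz = "GMT")

# slot index 0..143: minutes since midnight, integer-divided by 10
lt <- as.POSIXlt(ts, tz = "GMT")
slot <- (lt$hour * 60 + lt$min) %/% 10

# count per slot (keeping the empty slots), then convert to proportions
counts <- table(factor(slot, levels = 0:143))
prop <- counts / sum(counts)
```

running the same thing over both sets of tweets gives two vectors of 144 proportions that can be plotted against each other.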

some interesting observations with unanswered questions…

  1. the ebb and flow is not just a result of the time of day in high twitter traffic areas. the volume between 06:00 and 10:00 drops close to zero, which can't be real: there is never a worldwide time when internet traffic hits zero. does twitter turn down its gardenhose for capacity reasons?
  2. the number of tweets with lat lons is correlated with the number without, EXCEPT past 17:00 where the lat lon cases drop drastically. i have a couple of ideas banging around my head about why this is the case but nothing concrete. any ideas?

speaking of correlation, here’s a scatterplot of tweets with lat lons vs without. the time slots past 17:00, where the two stop being correlated, show up as a quite obvious cluster.

[graph: timeslices_freq.scatter]
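
one way to put a number on that cluster is to correlate the two series separately before and after 17:00. a rough sketch, assuming two vectors of 144 per-slot proportions; cor_by_cutoff and its argument names are hypothetical, not from the original code.

```r
# hypothetical helper: correlation between two per-slot proportion vectors,
# computed separately for the slots before and after a cutoff hour (GMT).
# p_all / p_geo: 144 proportions each (6 ten-minute slots per hour)
cor_by_cutoff <- function(p_all, p_geo, cutoff_hour = 17) {
  stopifnot(length(p_all) == 144, length(p_geo) == 144)
  before <- seq_len(cutoff_hour * 6)         # slots 00:00 up to the cutoff
  after  <- setdiff(seq_len(144), before)    # slots from the cutoff onwards
  c(before = cor(p_all[before], p_geo[before]),
    after  = cor(p_all[after],  p_geo[after]))
}
```

a high "before" value next to a much lower "after" value would match what the scatterplot shows.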

and here is the R code for these graphs

simple statistics with R

Saturday, October 3rd, 2009

i’m learning a new statistics language called R and it’s pretty cool.

make a vector …

> c(3,1,4,1,5,9,2,6,5,3,5,8)
 [1] 3 1 4 1 5 9 2 6 5 3 5 8

turn it into a frequency table …

> table(c(3,1,4,1,5,9,2,6,5,3,5,8))
1 2 3 4 5 6 8 9
2 1 2 1 3 1 1 1

sort by frequency …

> sort(table(c(3,1,4,1,5,9,2,6,5,3,5,8)))
2 4 6 8 9 1 3 5
1 1 1 1 1 2 2 3

and plot!

> barplot(sort(table(c(3,1,4,1,5,9,2,6,5,3,5,8))))
[graph: Rplot]

so simple!
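
the steps above can also be written as a short script, storing the vector and table once instead of retyping them. a small sketch using base R only; the variable names and axis labels are my own choices.

```r
x <- c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8)

# frequency table: names are the distinct values, entries are the counts
freq <- table(x)

# most frequent value first, instead of the default ascending order
most_first <- sort(freq, decreasing = TRUE)

# the most common value (table names are strings, so this is "5")
mode_value <- names(freq)[which.max(freq)]

# same bar plot as above, with labelled axes
barplot(most_first, xlab = "value", ylab = "count")
```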