so e9 sip is on hold for a bit while i kick off e10 tgraph. i was looking for another problem to try hadoop with and came across a classic graph one: pagerank. a well understood algorithm like pagerank is a great chance to try pig, the query language that sits on top of hadoop mapreduce.
so we need a graph to work on. my first thought was to use one of the wikipedia linkage dumps, but that feels a bit sterile. instead it's a good excuse to do a little crawl of twitter's following graph, i.e. who follows whom.
this will also be a chance to try to document a project via a blog. skorks' incessant blog rambling has convinced me to give it a go.