brain of mat kelcey...


what to do with a week off?

February 22, 2010 at 06:42 PM | categories: Uncategorized

this week i'm between jobs so i have (a little) more time than usual to hack.

i've got a list of pending things to do but can't decide what to do next. here's my list in (sort of) priority order...

  • fix up my numerical underflow / overflow problems in my recent semi supervised classification project.
  • work through the exercises from the first few chapters of introductory statistics with r and all of statistics. i'm particularly keen to write an intro stats blog post about statistical significance.
  • do this mongodb tute i found; shouldn't take too long.
  • do a weka screencast. i did some little talks at work lately about weka and they seemed to be interesting enough to others that it might be worth doing a screencast on it.
  • do some work on modelling of periodic functions. seems like trending topics is an interesting area at the moment and this would be a good chance to learn some more about R. fourier series look like a potential solution. there is also some interesting stuff to do in this area around majority evaluation from a stream of data.
  • finish my work on detecting resemblance with hadoop. something that's been hanging over my head for about 2 years is the first piece of work i did that led me onto hadoop. i've had a long running project on resemblance that ended up with me writing a map/reduce framework in erlang (until i (re)discovered hadoop).
  • revisit mahout, it's looking a bit more polished nowadays.
  • redo and finish my project on latent semantic analysis; need to include some comparison work with probabilistic latent semantic analysis and latent dirichlet allocation (which is close to winning the scariest-formulas-on-a-wikipedia-page award)
  • finish my twitter classifier; haven't worked on it since lists were introduced and i think they would be an interesting addition to the algorithm.
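
an aside on the first item: one common way to dodge underflow when combining lots of tiny probabilities is to stay in log space and use the log-sum-exp trick. below is a minimal python sketch of that idea. it is not necessarily what the classification project will end up using, and the function name and the example scores are made up purely for illustration.

```python
import math

def log_sum_exp(log_values):
    """return log(sum(exp(v) for v in log_values)) while staying in log space."""
    m = max(log_values)
    if m == float("-inf"):
        # every term is exp(-inf) == 0, so the sum is 0 and its log is -inf
        return m
    # subtracting the max means the largest term is exp(0) == 1, so nothing underflows to 0
    return m + math.log(sum(math.exp(v - m) for v in log_values))

# hypothetical per-class log scores (assumed values, not from the real project).
# exponentiating these directly gives 0.0 for every class.
log_scores = [-1000.0, -1001.0, -1002.0]

# normalise in log space, then exponentiate only the (well-scaled) differences
total = log_sum_exp(log_scores)
posteriors = [math.exp(s - total) for s in log_scores]
```

the naive version (exponentiate, sum, divide) would end up dividing zero by zero here; the log space version still gives posteriors that sum to one.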

decisions, decisions....