my nerd blog
e14: latent semantic analysis via the singular value decomposition (for dummies)
introduction
example 1: two clear features
example 2: two less clear features
example 3: two less clear features (revisited)
using real data
using real data (with document length normalisation)
conclusions
e13: semi supervised naive bayes
what is a semi supervised algorithm?
v1: a semi supervised version of naive bayes
does it do any better?
v2: rewriting for scale
v2: does it do any better?
e12: statistical synonyms
what are statistical synonyms?
e11: round the world tweets
tweets around the world
from bash scripts to hadoop
aggregating tweets by time of day
at what time does the world tweet?
e10: decomposing social graphs on twitter
introducing tgraph
crawling twitter
tgraph crawl order example
twitter crawl progress
communities in social graphs
community detection for my local twitter network
e9: do it yourself statistically improbable phrases
introduction
take 1: trigram frequency
take 2: term frequency
take 3: markov chains
part 4: but does it scale?
e7: should i burn it?
the problem
brute force
random walk
e6: the median of a trillion numbers
the question
the base algorithm
distributing
generating test data
ruby implementation
brutally short introduction to erlang
single process erlang implementation
multiple process erlang implementation
performance comparisons
running on amazon ec2
conclusion
e5: deduping with resemblance metrics
the jaccard coefficient
fastmap projection using jaccard distances
simhash and sketching
e4: rss feed stuff / should i read it?
the m-algorithm
considering word occurrences
the naive bayes method
multinomial bayes
markov chains
e3: audioscrobbler experiments
bands like the bands i like
the distance from celine dion to napalm death
multi dimensional tag distance
e2: chaoscope experiment
e1: particle swarm optimisation experiment
ye olde pics
2008 03 espen
2006 06 around the world (east to west)
2004 12 wedding
2004 09 skydiving
2004 01 the northern territory
2003 01 kyoto
2002 12 pool party
2002 07 old arcade games
2002 04 johnny walker visits lambert ave
2002 02 three uni students in adelaide
2001 12 new kitten
2001 12 around the world (west to east)
2001 09 top secret rocket experiments
2000 pre perth poker phase page
1999 tokyo
1996 usa
other
oh the places you'll go
me on twitter
me on github
me on youtube
me on facebook
send me an email! matthew.kelcey at gmail dot com