brain of mat kelcey
dimensionality reduction using random projections.
May 10, 2011 at 08:31 PM | categories: machine learning
previously i've discussed dimensionality reduction using SVD and PCA but another interesting technique is using a random projection. in a random projection we project A (a NxM matrix) to A' (a NxO matrix, O < M) by the transform AP=A' where P is a MxO matrix with random values. (well, not totally random: each column must have unit length, ie the squares of the entries in each column must add to 1.) though the results of this reduction are generally not as good as the results from SVD or PCA it has two huge benefits: it can be done without needing to hold P in memory (since it's...
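a minimal sketch of the AP=A' transform described above, in plain python. the excerpt doesn't say how the random values are drawn, so this assumes gaussian entries and takes "unit length" to mean a euclidean norm of 1. the function names (`random_projection_matrix`, `project`) and the toy matrix are made up for illustration and aren't from the original post.

```python
import math
import random

def random_projection_matrix(m, o, seed=0):
    """Build an m x o matrix P (list of rows) with unit-length columns.

    Assumption: entries are drawn from a standard gaussian and each
    column is then scaled so its euclidean norm is 1.
    """
    rng = random.Random(seed)
    cols = []
    for _ in range(o):
        col = [rng.gauss(0.0, 1.0) for _ in range(m)]
        norm = math.sqrt(sum(x * x for x in col))
        cols.append([x / norm for x in col])
    # transpose the list of columns into a list of rows
    return [[cols[j][i] for j in range(o)] for i in range(m)]

def project(a, p):
    """Compute A' = A P for A (n x m) and P (m x o)."""
    m, o = len(p), len(p[0])
    return [[sum(row[k] * p[k][j] for k in range(m)) for j in range(o)]
            for row in a]

# toy data: N=3 rows, M=4 columns, reduced to O=2 columns
a = [[1.0, 2.0, 3.0, 4.0],
     [0.0, 1.0, 0.0, 1.0],
     [2.0, 2.0, 2.0, 2.0]]
p = random_projection_matrix(m=4, o=2)
a_prime = project(a, p)
```

`a_prime` has the same number of rows as `a` but only O columns. this version builds all of P up front purely to keep the example short.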
my list of cool machine learning books
August 06, 2010 at 06:35 PM | categories: books, machine learning
for the last month or so i've had my head down and have been focusing more on theory (ie reading) than on practice (ie coding), so rather than write no blog post here's mats-list-of-cool-machine-learning-books in the order i think you should consider reading them... if you know nothing about machine learning and haven't done maths since high school then this is the book for you. it's a fantastically accessible introduction to the field. includes almost no theory and explains algorithms using actual python implementations. this book covers quite a bit more than programming c.i. while still being extremely practical (ie very few formulas). about a...
brutally short intro to weka
July 03, 2010 at 05:35 PM | categories: weka, brutally short intro, machine learning
weka is a java based machine learning workbench that i've found useful to play with to help understand some standard machine learning algorithms. in this quick demo i show how to build a classifier for three simple datasets, two of which address the basics of text classification. (video: brutally short intro to weka from Mat Kelcey on Vimeo)...
an intro to semi supervised document classification
January 31, 2010 at 02:02 PM | categories: semi supervised, naive bayes, machine learning
here's a great lecture from tom mitchell about document classification using a semi supervised version of naive bayes. semi supervised algorithms only require some of the training examples to be labelled and are able to make use of any unlabelled ones, which is very common when we have a huge corpus. i've started an experiment brewing to test this out by porting some previous naive bayes work i did to use this semi supervised scheme and will publish it when it's done. cool stuff!!...
do a degree via youtube
October 01, 2009 at 08:40 PM | categories: lectures, statistics, stanford, machine learning
i'm amazed by how much great content is on youtube, how could you NOT learn something!?
- 13 x 1hr Statistical Aspects of Data Mining (Stats 202)
- 20 x 1hr Machine Learning...
old projects...
- latent semantic analysis via the singular value decomposition (for dummies)
- semi supervised naive bayes
- statistical synonyms
- round the world tweets
- decomposing social graphs on twitter
- do it yourself statistically improbable phrases
- should i burn it?
- the median of a trillion numbers
- deduping with resemblance metrics
- simple supervised learning / should i read it?
- audioscrobbler experiments
- chaoscope experiment