for the last month or so i've had my head down and have been focusing more on theory (ie reading) than on practice (ie coding)
so rather than write no blog post here's mats-list-of-cool-machine-learning-books in the order i think you should consider reading them...
1) "programming collective intelligence" by toby segaran
|if you know nothing about machine learning and haven't done maths since high school then this is the book for you.
it's a fantastically accessible introduction to the field. includes almost no theory and explains algorithms using actual python implementations.
2) "data mining" by witten and frank
|this book covers quite a bit more than programming c.i. while still being extremely practical (ie very few formulas).
about a fifth of the book is dedicated to weka, a machine learning workbench which was written by the authors.
apart from the weka section this book has no code. i made a little screencast on weka
a while back if you're after a summary.
3) "introduction to data mining" by tan, steinbach and kumar
|covers almost the same material as the witten/frank text but delves a little bit deeper and with more rigour.
includes no code (none of the books do from now on) with algorithms described by formulas. has a number of appendices
on linear algebra, probability, statistics etc so that you can read up if you're a bit rusty or new to the
fields (the witten/frank text lacks these). some people might argue having both of these books is a waste since
they cover so much of the same ground but i've always found multiple explanations from different authors to be
a great way to help understand a topic. i read the witten/frank text first and am glad i did but
if i could only keep one i'd keep this one.
at this point you've probably got enough mental firepower to handle some of the uni level machine
learning course notes that are floating about online. if you're keen to get a better foundation of the maths
side of things it'd be worth working through andrew ng's
lecture series on machine learning. (20 hours of a second year stanford course on machine learning) i also
found andrew moore's lecture slides really great.
(though they do require a reasonable understanding of the basics)
4) "foundations of statistical natural language processing" by manning and schutze
|not a machine learning book as such but great for learning to deal with one of the most common types of data
around: text. since most of machine learning theory is about maths (ie numbers) this is awesome in
helping you understand how to deal with text in a mathematical context.
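to give a feel for what "text in a mathematical context" can mean, here's a minimal python sketch of a bag-of-words representation, where each document becomes a vector of word counts. this is just my own illustration (not taken from the book), and the function names `tokenize`, `build_vocab` and `to_vector` are made up for the example. it assumes whitespace tokenisation, which is far cruder than anything you'd use for real.

```python
from collections import Counter


def tokenize(text):
    # assumption: lowercase + whitespace split is good enough for a toy example
    return text.lower().split()


def build_vocab(docs):
    # map every distinct token to a fixed column index (sorted for determinism)
    tokens = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def to_vector(doc, vocab):
    # count each token and place the count in that token's column;
    # tokens not in the vocabulary are ignored
    vec = [0] * len(vocab)
    for tok, count in Counter(tokenize(doc)).items():
        if tok in vocab:
            vec[vocab[tok]] = count
    return vec


docs = ["the cat sat on the mat", "the dog sat"]
vocab = build_vocab(docs)
vectors = [to_vector(d, vocab) for d in docs]
```

once text is in this form the documents are just points in a numeric space, so the usual numeric algorithms (distance measures, classifiers, clustering) can be applied to them.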
5) "introduction to machine learning" by ethem alpaydin
|covers generally the same sort of topics as the data mining books but with much more rigour and theory
(derivations, proofs, etc). i think this is a good thing though since understanding how things work at a low level
gives you the ability to tweak and modify as required. loads more formulas but again with appendices that introduce
the basics in enough detail to get by.
6) "all of statistics" by larry wasserman
|by this stage you'll probably have an appreciation of how important statistics is for this domain and it
might be worth focusing on it for a bit. personally i found this book to be a great read and though
i've only read certain sections in depth i'm looking forward to when i get a chance to work through
it cover to cover.
7) "the elements of statistical learning" by hastie, tibshirani and friedman.
|with a bit more stats under your belt you might have a chance of getting through this one; the most complex of the lot.
this book is absolutely beautifully presented and now that it's
FREE to download you've got no reason not
to have a crack at it. a remarkable piece of work and one i've yet to get through cover to cover.
it's quite hardcore and right on the border of my level of understanding ( which makes it perfect for me :P )
ps. books i haven't read that are in the mail
"machine learning" by tom mitchell
|have been wanting to read this one for a while, i'm a big fan of tom mitchell, but couldn't justify the cost.
however i just found out the other day the paperback is a third of the price of the hardback i was looking at!! the book's in the mail
"pattern recognition and machine learning" by chris bishop
|all of a sudden it seemed like everyone was reading this but me so it was time to jump on the bandwagon