brain of mat kelcey...

my updated list of cool machine learning books

November 01, 2020 at 09:40 PM | categories: Uncategorized

it's been ten years so it's probably time to update my list of cool machine learning books.

my most meta blog ever

June 09, 2019 at 01:40 PM | categories: Uncategorized

...

simple tensorboard visualisation for gradient norms

June 27, 2017 at 09:45 PM | categories: Uncategorized

some cook book examples of gradient norm visualisation in tensorboard.

after 2,350 days in america we are moving home...

June 14, 2017 at 10:00 PM | categories: Uncategorized

i'm leaving google brain and we're moving back to australia.

brutally short intro to theano word embeddings

March 28, 2015 at 01:00 PM | categories: Uncategorized

one thing in theano i couldn't immediately find examples for was a simple embedding lookup table, a critical component for anything with NLP. turns out that it's just one of those things that's so simple no one bothered writing it...

hallucinating softmaxs

March 15, 2015 at 10:00 PM | categories: Uncategorized

language modelling is a classic problem in NLP; given a sequence of words such as "my cat likes to ..." what's the next word? this problem is related to all sorts of things, everything from autocomplete to speech to text.the...

theano and the curse of GpuFromHost

February 22, 2015 at 10:00 PM | categories: Uncategorized

i've been reviving some old theano code recently and in case you haven't seen it theano is a pretty awesome python library that reads a lot like numpy but provides two particularly interesting features.symbolic differentiation; not something i'll talk about...

dead simple pymc

December 27, 2012 at 09:00 PM | categories: Uncategorized

PyMCis a python library for working withbayesian statistical models,primarily usingMCMCmethods. as a software engineer who has only just scratched the surface of statistics this whole MCMCbusiness is blowing my mind so i've got to share some examples.let's start with the...

smoothing low support cases using confidence intervals

December 08, 2012 at 10:50 PM | categories: Uncategorized

say you have three items; item1, item2 and item3 and you've somehow associated a count for each against one of five labels; A, B, C, D, E> data A ...

item similarity by bipartite graph dispersion

August 20, 2012 at 08:00 PM | categories: Uncategorized

the basis of most recommendation systems is the ability to rate similarity between items. there are lots of different ways to do this. one model is based the idea of an interest graph where the nodes of the graph are...

finding names in common crawl

August 18, 2012 at 08:00 PM | categories: Uncategorized

the central offering from common crawl is the raw bytes they've downloaded and, though this is useful for some people, a lot of us just wantthe visible text of web pages. luckily they've done this extraction as a part of...

fuzzy jaccard

July 31, 2012 at 08:00 PM | categories: Uncategorized

the jaccard coefficient is one of the fundamental measures for doing set similarity. ( recall jaccard(set1, set2) = |intersection| / |union|. when set1 == set2 this evaluates to 1.0 and when set1 and set2 have no intersection it evaluates to...

ggplot posixct cheat sheet

March 18, 2012 at 08:00 PM | categories: Uncategorized

after having to google this stuff three times in the last few months i'm writing it down here so i can just cut and paste next time...> d = read.delim('data.tsv',header=F,as.is=T,col.names=c('dts_str','freq'))> # YEAR MONTH DAY HOUR> head(d,3) ...

collocations in wikipedia, part 1

January 01, 2012 at 08:00 PM | categories: Uncategorized

hmmm. did you mean collocations in wikipedia?...

tokenising the visible english text of common crawl

December 10, 2011 at 04:00 PM | categories: Uncategorized

Common crawl is a publically available 30TB web crawl taken between September 2009 and September 2010. As a small project I decided to extract and tokenised the visible text of the web pages in this dataset. All the code to...

finding phrases with mutual information

November 15, 2011 at 11:00 PM | categories: Uncategorized

continuing on with my series of mutual information experiments how might we extend the technique to identity sequences longer than just two terms?one novel way is to identify the bigrams of interest, replace them with a single token and simply...

collocations in wikipedia, part 2

November 05, 2011 at 05:00 PM | categories: Uncategorized

in my last post we went through mutual information as a way of finding collocations.the astute reader may have noticed that for the list of top bigrams i onlyshowed ones that had a frequency above 5,000.why this cutoff? well it...

collocations in wikipedia, part 1

October 19, 2011 at 08:00 PM | categories: Uncategorized

collocations are combinations of terms that occur together more frequently thanyou'd expect by chance. they can include proper noun phrases like 'Darth Vader'stock/colloquial phrases like 'flora and fauna' or 'old as the hills'common adjectives/noun pairs (notice how 'strong coffee' sounds...

an exercise in handling mislabelled training data

October 03, 2011 at 08:00 PM | categories: Uncategorized

as part of my diy twitter client project i've been using the twitter sample streams as a sourceof unlabelled data for some mutual information analysis. these streams are a great source of random tweets but include a lot of non...

dimensionality reduction using random projections.

May 10, 2011 at 08:31 PM | categories: Uncategorized

previously i've discussed dimensionality reduction using SVD and PCA but another interesting technique is using a random projection.in a random projection we project A (a NxM matrix) to A' (a NxO, O < M) by the transform AP=A' where P...