January 28, 2010 at 08:18 PM | categories: linguistics, e12

i've been doing some reading for my statistical synonyms project and have uncovered a heap of cool papers. most of them revolve around an idea (from the 1950s!) called the distributional hypothesis, which states that words appearing in similar contexts often have similar meanings. the coolest paper so far is 'Web-Scale Distributional Similarity and Entity Set Expansion' by Pantel, Crestan, Borkovsky et al., which has introduced me to an area of research i didn't really know existed: entity set expansion. entity set expansion is a bit like thesaurus building for proper nouns; given a seed set of related items can you expand...
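to make the distributional hypothesis a bit more concrete, here's a minimal sketch of seed set expansion on a toy corpus. it is not the method from the paper; the corpus, the window size of 2, the raw co-occurrence counts, and the function names (`context_vectors`, `cosine`, `expand`) are all assumptions made up for illustration. each word is described by the words that appear near it, and candidates are ranked by cosine similarity to the combined context of the seeds.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(sentences, window=2):
    """Map each word to counts of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def expand(seeds, vectors, top_n=3):
    """Rank non-seed words by similarity to the summed context of the seeds."""
    centroid = Counter()
    for seed in seeds:
        centroid.update(vectors.get(seed, Counter()))
    scored = [
        (cosine(centroid, vector), word)
        for word, vector in vectors.items()
        if word not in seeds
    ]
    scored.sort(reverse=True)
    return [word for _, word in scored[:top_n]]


# hypothetical toy corpus, purely for illustration
sentences = [
    "i flew to paris last week",
    "i flew to london last week",
    "i flew to tokyo last year",
    "she ate an apple for lunch",
    "she ate an orange for lunch",
]

vectors = context_vectors(sentences)
print(expand({"paris", "london"}, vectors, top_n=1))
```

on this toy corpus the top candidate for the seeds paris and london is tokyo, since it shares most of their neighbouring words. with so little data, function words like "i" score almost as high, which hints at why real systems weight contexts (for example with pointwise mutual information) rather than using raw counts.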

old projects...

latent semantic analysis via the singular value decomposition (for dummies)
semi supervised naive bayes
statistical synonyms
round the world tweets
decomposing social graphs on twitter
do it yourself statistically improbable phrases
should i burn it?
the median of a trillion numbers
deduping with resemblance metrics
simple supervised learning / should i read it?
audioscrobbler experiments
chaoscope experiment

brain of mat kelcey

e12.2 entity set expansion