# brain of mat kelcey

## e10.6 community detection for my twitter network

April 04, 2010 at 12:58 PM | categories: e10, twitter, betweenness, social network, graph

last night i applied my network decomposition algorithm to a graph of some of the people near me on twitter.

first i built a friend graph for 100 people 'around' me (taken from a crawl i did last year). by 'friend' i mean that if alice follows bob then bob also follows alice.

here's the graph. some things to note though: it was an unfinished crawl (can a crawl of twitter EVER be finished?) and was done in october last year, so it's a bit out of date.

and here is the dendrogram decomposition.

some interesting clusterings come out. right at the bottom we have a...
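the 'friend' rule above (keep an edge only when the follow goes both ways) can be sketched roughly like this. the input format (a dict of user to the set of users they follow) and the function name `mutual_friend_graph` are assumptions made for illustration, not the format of the original crawl.

```python
def mutual_friend_graph(follows):
    """Build an undirected friend graph from directed follow data.

    follows: dict mapping user -> set of users they follow
             (hypothetical input format, assumed for this sketch).
    Returns a dict mapping user -> set of mutual friends.
    """
    graph = {user: set() for user in follows}
    for user, followed in follows.items():
        for other in followed:
            # keep the edge only if the follow is reciprocated
            if user in follows.get(other, set()):
                graph[user].add(other)
                graph[other].add(user)
    return graph


# made-up example data: carol is followed by alice but doesn't follow back
follows = {
    "alice": {"bob", "carol"},
    "bob": {"alice"},
    "carol": set(),
}
friends = mutual_friend_graph(follows)
```

in this toy example only alice and bob end up connected; carol stays in the graph as an isolated node.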

## e10.5 revisiting community detection

March 30, 2010 at 08:42 PM | categories: e10, betweenness, social network, graph

i've decided to switch back to some previous work i did on community detection in (social) graphs.

the last chunk of code i wrote, which tried to deal with weighted directed graphs, was terribly, terribly broken, but it seems that simplifying to undirected graphs is giving me much saner results. yay!

here's an example of my work in progress, generated from the new version of the code. consider the graph and its corresponding decomposition.

the results are reasonable; the initial breaking into clusters [1,2,3,4,5,6] and [7,8,9,10,11,12] is the most obvious, but some of the others are not as intuitive.

[1,2,5] and [7,8,10] remain as unbreakable cliques...

## e10.4 communities in social graphs

October 06, 2009 at 08:05 PM | categories: e10, twitter, social network, betweenness, algorithms, graph

social graphs, like twitter or facebook, often follow the pattern of having clusters of highly connected components with an occasional edge joining these clusters.

these connecting edges define the boundaries of communities in the social network and can be identified by algorithms that measure betweenness.

the girvan-newman algorithm can be used to decompose a graph hierarchically based on successive removal of the edges with the highest betweenness.

the algorithm is basically

- calculate the betweenness of each edge (using an all shortest paths algorithm)
- remove the edge(s) with the highest betweenness
- check for connected components (using tarjan's algorithm)
- repeat for graph or subgraphs if graph was split...
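the steps above can be sketched in python for a small, unweighted, undirected graph. this is a minimal illustration under a few assumptions, not the code behind these posts: the graph is a dict of node to neighbour set, edge betweenness is computed with a brandes-style breadth-first search, and the connectivity check is a plain depth-first traversal rather than tarjan's algorithm. all function names are made up for the sketch.

```python
from collections import deque


def edge_betweenness(graph):
    """Edge betweenness via BFS from every node (each pair is counted twice)."""
    scores = {frozenset((u, v)): 0.0 for u in graph for v in graph[u]}
    for source in graph:
        # forward pass: count shortest paths from source to every node
        dist, paths, preds = {source: 0}, {source: 1}, {source: []}
        order, queue = [], deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    paths[nbr] = 0
                    preds[nbr] = []
                    queue.append(nbr)
                if dist[nbr] == dist[node] + 1:
                    paths[nbr] += paths[node]
                    preds[nbr].append(node)
        # backward pass: share credit along shortest paths, farthest nodes first
        credit = {node: 0.0 for node in order}
        for node in reversed(order):
            for pred in preds[node]:
                share = paths[pred] / paths[node] * (1.0 + credit[node])
                scores[frozenset((pred, node))] += share
                credit[pred] += share
    return scores


def connected_components(graph):
    """Connected components by iterative depth-first traversal."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for nbr in graph[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        components.append(component)
    return components


def split(graph):
    """Remove highest-betweenness edges until the graph falls apart."""
    graph = {node: set(nbrs) for node, nbrs in graph.items()}  # work on a copy
    while True:
        components = connected_components(graph)
        if len(components) > 1:
            return components
        scores = edge_betweenness(graph)
        if not scores:  # no edges left to remove
            return components
        top = max(scores.values())
        for edge, score in scores.items():
            if score >= top - 1e-9:  # remove all edges tied for the top score
                u, v = tuple(edge)
                graph[u].discard(v)
                graph[v].discard(u)


def decompose(graph):
    """Recursively split; returns nested lists, with single nodes as leaves."""
    components = split(graph)
    if len(components) <= 1:
        return sorted(graph)
    return [decompose({n: graph[n] & comp for n in comp}) for comp in components]


# two triangles joined by a single bridging edge (3, 4)
barbell = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
parts = split(barbell)
```

on this toy graph the bridge between the two triangles has the highest betweenness, so it goes first and the graph splits into {1,2,3} and {4,5,6}. note that `decompose` as written keeps splitting all the way down to single nodes; where to stop is a design choice this sketch leaves open.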

old projects...

- latent semantic analysis via the singular value decomposition (for dummies)
- semi supervised naive bayes
- statistical synonyms
- round the world tweets
- decomposing social graphs on twitter
- do it yourself statistically improbable phrases
- should i burn it?
- the median of a trillion numbers
- deduping with resemblance metrics
- simple supervised learning / should i read it?
- audioscrobbler experiments
- chaoscope experiment