Posts Tagged ‘c++’

fastmap and the jaccard distance

Friday, October 31st, 2008

given a set of pairwise distances how do you determine what points correspond to those distances?

my latest experiment considers this problem in relation to jaccard distances, a dissimilarity measure that is simply one minus the jaccard coefficient used in a previous experiment

by using the fastmap algorithm we get points from distances and once you have points you have visualisation!
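to give a feel for how fastmap turns distances into points, here's a minimal sketch of a single fastmap axis. it picks two far apart pivots and places every other point along the line between them using the cosine law. the names (Matrix, furthestFrom, fastmapAxis) and the dense matrix input are my own assumptions for illustration, not the code from the experiment.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// symmetric n x n matrix of pairwise distances (assumed input format)
using Matrix = std::vector<std::vector<double>>;

// index of the point furthest from `from`
std::size_t furthestFrom(const Matrix& dist, std::size_t from) {
  std::size_t best = from;
  for (std::size_t i = 0; i < dist.size(); ++i)
    if (dist[from][i] > dist[from][best]) best = i;
  return best;
}

// one coordinate per point, measured along the line through pivots a and b
std::vector<double> fastmapAxis(const Matrix& dist) {
  assert(!dist.empty());
  // heuristic pivot choice: start at point 0, hop to the furthest point twice
  std::size_t b = furthestFrom(dist, 0);
  std::size_t a = furthestFrom(dist, b);
  double dab = dist[a][b];
  std::vector<double> x(dist.size(), 0.0);
  if (dab == 0.0) return x;  // every point coincides, nothing to project
  for (std::size_t i = 0; i < dist.size(); ++i) {
    double dai = dist[a][i];
    double dbi = dist[b][i];
    // cosine law: x_i = (d(a,i)^2 + d(a,b)^2 - d(b,i)^2) / (2 d(a,b))
    x[i] = (dai * dai + dab * dab - dbi * dbi) / (2.0 * dab);
  }
  return x;
}
```

for more axes the same step is repeated on residual distances, where the squared distance between i and j is reduced by (x[i] - x[j]) squared. two or three axes is enough to plot.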

openmp = easy multi threading

Monday, October 13th, 2008

openmp is a compiler extension plus a small runtime library, available in gcc since v4.2, for giving hints to a compiler about where code can be parallelized.

say we have some code

for(int i=0; i<HUGE_NUMBER; ++i)
  deadHardCalculation(i);

we can make this run multi threaded by simply adding some pragmas

#pragma omp parallel num_threads(4)
{
  #pragma omp for
  for(int i=0; i<HUGE_NUMBER; ++i)
    deadHardCalculation(i);
}

compiling with -fopenmp will generate an app that splits the work of the for loop across 4 threads.

there’s support for dynamic / static scheduling, accumulators (openmp calls them reductions), all sorts
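as a rough sketch of the scheduling and accumulator support, here's a loop that sums into a shared total. sumOfSquares and the chunk size of 1000 are made up for the example. the reduction and schedule clauses are standard openmp.

```cpp
#include <cassert>

// sum of i*i for i in [0, n), with the loop shared across threads
long long sumOfSquares(int n) {
  assert(n >= 0);
  long long total = 0;
  // reduction(+:total): each thread gets a private total, added together at the end
  // schedule(dynamic, 1000): threads grab chunks of 1000 iterations as they free up
  #pragma omp parallel for reduction(+:total) schedule(dynamic, 1000)
  for (int i = 0; i < n; ++i)
    total += static_cast<long long>(i) * i;
  return total;
}
```

without -fopenmp the pragma is ignored and the loop runs on one thread with the same result, which makes it easy to check the parallel version against the serial one.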

this tute is awesome.

it increased the speed of my shingling code by 350% on a quad core box with just the above two lines

shingling and the jaccard index

Monday, October 6th, 2008

on the weekend i did another experiment using shingling and the jaccard index to try to determine if two sets of data were “duplicates”
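for anyone who hasn't met the idea, here's a minimal sketch of it: break each string into overlapping k character shingles, then compare the two shingle sets with the jaccard index (size of the intersection over size of the union). the function names and the use of std::set are assumptions for illustration. the project code is different and uses bit operations instead.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// all distinct substrings of length k
std::set<std::string> shingles(const std::string& text, std::size_t k) {
  assert(k > 0);
  std::set<std::string> result;
  for (std::size_t i = 0; i + k <= text.size(); ++i)
    result.insert(text.substr(i, k));
  return result;
}

// jaccard index: |intersection| / |union|, from 0 (disjoint) to 1 (identical)
double jaccardIndex(const std::set<std::string>& a,
                    const std::set<std::string>& b) {
  if (a.empty() && b.empty()) return 1.0;  // convention: two empty sets match
  std::size_t common = 0;
  for (const auto& s : a) common += b.count(s);
  // |union| = |a| + |b| - |intersection|
  return static_cast<double>(common) / (a.size() + b.size() - common);
}
```

"abcd" and "bcde" with k=2 share the shingles bc and cd out of four distinct shingles overall, so their index is 0.5. what counts as a "duplicate" then comes down to picking a threshold.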

it works quite well and includes a ruby and c++ version with low level bit operations.

project page is www.matpalm.com/resemblance

code at github.com/matpalm/resemblance

i was going to put the discussion here but the page ended up too long, next time i’ll break it into chunks, maybe.

java is the new c++

Sunday, October 5th, 2008

this year would have been my ten year anniversary of commercially coding in java. it’s not going to be though since the last six months have been ruby. even with my huge investment in java i’d be quite happy to never write a line of it again.

i remember when java was first moving in. it was not as performant as c/c++ but it was much easier to write good clean code. and who really cares about performance? scalability is what matters and it’s decided by design and architecture, not language choice. as a new language java made sure it had good integration with c/c++ to make use of the many existing libraries available. java performance improved and with machines constantly getting faster it became less of a problem for the majority of business problems. java could always fall back to c libraries for those things that really were time critical, usually a well defined small section of code. java replaced c++ for most problems and the newly converted java programmers would be happy to never write c++ again. programs written in java were shorter (less boilerplate) and hence easier to write and, more importantly, read. smaller code also meant fewer places for bugs to hide.

now reread that last paragraph and substitute ruby for java and java for c++.

java is the new c++.