<< the base algorithm
index
generating test data >>

## distributing

can this algorithm be modified to work in a distributed environment?

yes! with only a minor change.

it turns out the moving of elements in the list to before the pivot doesn't actually have to be done.

all that we are actually interested in where the pivot *would* end up.

this can be derived by counting the number of elements less than the pivot and adding 1.

revisiting the previous examples then;

#### pivot to the right of the median postition

L1 = [6 3 1 1 5 __9__ 2 4 5 3 7]

if we count the number of elements < 6 we get 8

so 6 is the 9th order stat

since the 9th > 6th we need to discard all values >=6

L2 = [3 1 1 5 2 __4__ 5 3]

and continue looking for the 6th order stat
#### pivot to the left of the median postition

L1 = [2 3 1 1 5 __9__ 3 6 5 3 5]

if we count the number of elements < 2 we get 2

so 2 is the 3rd order stat

since 3rd < 6th we need to discard all values < 2

L2 = [2 3 5 __9__ 3 6 5 3 5]

in this case we need to rotate since the pivot ended up in the first position

L2 = [3 5 9 __3__ 6 5 3 5 2]

and continue looking for the 6-2 = 4th order stat
#### counting not moving

so we never had to actually move any numbers around.

what things do we end up doing against the list elements?

- count the total number of elements
- determine the min and max
- count number of elements less than a particular value
- discard all elements less than a particular value
- discard all elements greater than or equal to a particular value
- rotate the head of the list to the tail

all these things can be easily applied in a distributed environment where we are dealing with not just one list but a set of lists across machines.

the only non trivial thing is the rotation, and we'll soon see it's not that hard.

<< the base algorithm
index
generating test data >>

nov 2008