burn it!

# should i burn it?

## brute force

usually brute force, ie try every combo, is cause for a heartattack but it's actually not so bad in this case...
there are only 16 folders to consider on the backup and each is either included or it's not
gives a total of 216 or 65,536 combos to try.
well it's 216-1 really since the case of nothing on the disk is not that exciting

anyways, here's an impl of a brute_force.rb in ruby
and even though ruby is known for it's lack of performance it still gets the global optimum in under 1.5 seconds on my machine

it makes use of the binary representation of a number to denote whether a folder is on the disk or not
eg for 3 items or varying sizes...

 item size 1 4 2 6 3 3

...we can try all 23 combos

 n binary includeitem size 0 000 { } 0 1 001 { 1 } 4 2 010 { 2 } 6 3 011 { 2 1 } 6+4=10 4 100 { 3 } 3 5 101 { 3 1 } 3+4=7 6 110 { 3 2 } 3+6=9 7 111 { 3 2 1 } 3+6+4=13

but if we're going to resort to brute force we might as well add a little bit of elegance...

consider the order we try the combos and what calculations are actually done

 n binary includeitem description 0 000 { } empty the disk and add nothing 1 001 { 1 } empty the disk and add item 1 2 010 { 2 } empty the disk and add item 2 3 011 { 2 1 } empty the disk and add item 2 and item 1 4 100 { 3 } empty the disk and add item 3 5 101 { 3 1 } empty the disk and add item 3 and item 1 6 110 { 3 2 } empty the disk and add item 3 and item 2 7 111 { 3 2 1 } empty the disk and add item 3 and item 2 and item 1

if we were doing this manually we'd be too lazy to test the combos like this;
instead consider if we tried the combos in a slightly different order by changing only one bit at time
(analogous to removing or adding just one item at time)
this type of order is called a grey encoding and can be calculated easily using g(n)=n^(n>>1)

 n g(n) binary includeitem size 0 0 000 { } 0 1 1 001 { 1 } 4 2 3 011 { 2 1 } 6+4 = 10 3 2 010 { 2 } 6 4 6 110 { 3 2 } 3+6 = 9 5 7 111 { 3 2 1 } 3+6+4 = 13 6 5 101 { 3 1 } 3+4 = 7 7 4 100 { 3 } 3

using this order we can calculate the total size using a delta from the last size rather than doing the sum every time

 g(n) binary includeitem description size 0 000 { } empty the disk 0 1 001 { 1 } add item 1 0+4 = 4 3 011 { 2 1 } add item 2 4+6 = 10 2 010 { 2 } take item 1 out 10-4 = 6 6 110 { 3 2 } add item 3 6+3 = 9 7 111 { 3 2 1 } add item 1 9+4 = 13 5 101 { 3 1 } take item 2 out 13-6 = 7 4 100 { 3 } take item 1 out 7-4 = 3

this guarantee of only one add or subtract per combo really pays off when there are more items.
here's some code for brute_force_2.rb using grey coding
and an implementation in c too brute_force2.c

so brute force is awesome,
until we add a few more folders...

 # folders # combos brute_force.rbtime brute_force2.rbtime brute_force2.ctime 16 216 = 65,536 1.5 sec 0.5 sec 0.01 sec 20 220 = 1,048,576 25 sec 5 sec 0.1 sec 30 230 = 1,073,741,824 ? 129 min 2 min 19 sec 40 240 = 1,099,511,627,776 ? ~ 3 months ~ 1.5 days 50 250 = 1,125,899,906,842,624 ? ? ~~ 4.5 years

yikes! it's easy to see then why algorithms that are exponential in the input size are not the best.
in fact 270 folders is 1,897,137,590,064,188,545,819,787,018,382,342,682,267,975,428,761,855,001,222,473,056,385,648,716,020,711,424 combos
more combos than the number of atoms in the universe
(i've always wanted to say that)

let's try the other extreme of naive algorithms, the random walk!

Dec 2008