
generating test data

scripts

i've built two scripts to help generate test data

the first generates numbers between two values with a specific median value; it's cryptically called generate_test_data.rb

bash> ./generate_test_data.rb min_value median_value max_value number_of_values (optional_seed)
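
the script itself isn't listed here, so what follows is only a minimal sketch of one way to hit a target median, not the actual generate_test_data.rb. the assumption is that values are whole numbers and that alternating draws come from [min, median] and [median, max], so half the values land on each side of the median. the generate_values name and its arguments are made up for illustration.

```ruby
# hypothetical sketch, not the original generate_test_data.rb
# assumption: integer values, half drawn from min..median and half from median..max
def generate_values(min, median, max, count, seed = nil)
  # a fixed seed makes the output repeatable, like the optional_seed argument
  rng = seed ? Random.new(seed) : Random.new
  Array.new(count) do |i|
    # alternate sides so half the values fall at or below the median
    # and half at or above it
    i.even? ? rng.rand(min..median) : rng.rand(median..max)
  end
end

# a count written like 5e6 needs Float() first; "5e6".to_i is just 5
count = Float("1e3").to_i
values = generate_values(200, 275, 300, count, 42)
sorted = values.sort
puts "min=#{sorted.first} max=#{sorted.last} median=#{sorted[count / 2]}"
```

drawing uniformly on each side is only one choice; any distribution works as long as half the values sit on each side of the median.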

the second spreads values from stdin evenly over a number of files; it's even more cryptically called spread_across_files.rb

bash> ./spread_across_files.rb file_prefix number_of_files
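
again the real script isn't shown, so this is just a sketch of how the spreading could work, assuming "evenly" means dealing lines out round robin. the spread_across_files method name and its input argument are made up for illustration; the prefix.N file naming follows the num.0 to num.3 names used on this page.

```ruby
require "stringio"
require "tmpdir"

# hypothetical sketch, not the original spread_across_files.rb
# assumption: lines are dealt out round robin to prefix.0, prefix.1, ...
def spread_across_files(input, file_prefix, number_of_files)
  files = Array.new(number_of_files) { |i| File.open("#{file_prefix}.#{i}", "w") }
  begin
    # line i goes to file i mod number_of_files,
    # so file sizes differ by at most one line
    input.each_line.with_index do |line, i|
      files[i % number_of_files].write(line)
    end
  ensure
    files.each(&:close)
  end
end

# small demo in a temp dir; a real script would pass $stdin and ARGV values
Dir.mktmpdir do |dir|
  input = StringIO.new((1..10).map { |n| "#{n}\n" }.join)
  spread_across_files(input, File.join(dir, "num"), 4)
  4.times do |i|
    puts "num.#{i}: #{File.readlines(File.join(dir, "num.#{i}")).size} lines"
  end
end
```

round robin keeps the files balanced without knowing the input length up front, which suits reading from a pipe.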

eg

bash> ./generate_test_data.rb 200 275 300 5e6 | ./spread_across_files.rb num 4
generates files num.0, num.1, num.2 and num.3,
each with about 1,250,000 numbers
with values between 200 and 300
and an overall median of 275

to be honest though, spread_across_files.rb is a bit of reinventing the wheel; split is just as good.


nov 2008