here's a (semi-contrived) example, close to something i did the other day, to show how awesome it is
say you have a number of largish presorted files, run-00 to run-03, and you want to find the most frequent lines. you could do something like the following...
sort -m run-* | uniq -c | sort -nr | head
however, you'll know from previous posts that i just loooove keeping all my data compressed on disk, so instead i've got run-00.gz to run-03.gz
to avoid uncompressing the files to disk i'd have to do something like this...
zcat run*gz | sort | uniq -c | sort -nr | head
but this pains me since it results in completely re-sorting the stream. i know the input files are already sorted so i'd much prefer doing a sort -m (merge only) than a full sort
so how can i combine zcat with sort -m, when sort -m wants its multiple inputs as separate named files instead of a single STDIN?
well, mkfifo of course! it's a way of making a file that acts like a pipe ( a named pipe )
ls | sort
is sort-of, roughly, equivalent to
mkfifo bob
ls > bob &
sort < bob
rm bob
( you have to background the ls since opening the named pipe for writing blocks until something opens it for reading )
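if you want to poke at this without littering your working directory, here's a minimal sketch of the same idea wrapped in a function. the name fifo_demo and the temp directory are my own additions for illustration, not part of the example above.

```shell
# sketch only: fifo_demo is a made-up name for illustration
fifo_demo() {
  local dir
  dir=$(mktemp -d) || return 1
  mkfifo "$dir/bob" || return 1
  # -p is true only for named pipes
  [ -p "$dir/bob" ] && echo "bob is a named pipe"
  # writer goes in the background: it blocks until sort opens bob for reading
  printf 'pear\napple\n' > "$dir/bob" &
  sort < "$dir/bob"
  wait            # reap the background writer
  rm -r "$dir"
}
```

calling fifo_demo should report that bob is a named pipe and then print the two lines in sorted order.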
apart from being a cool way to get pipes working between totally separate processes on a box, this provides a solution for our original problem
mkfifo p0 p1 p2 p3
zcat run-00.gz > p0 &
zcat run-01.gz > p1 &
zcat run-02.gz > p2 &
zcat run-03.gz > p3 &
sort -m p* | uniq -c | sort -nr | head
rm p*
and all four zcats can burn cpu in parallel while we avoid the need to re-sort.
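the same trick generalises to any number of runs. below is a rough sketch, assuming bash (for arrays) and that every input was sorted under the same locale that sort -m runs in. the function name top_lines and the temp directory layout are hypothetical, made up for this example. it uses gzip -dc, which does the same job as zcat.

```shell
#!/usr/bin/env bash
# sketch only: top_lines is a made-up name, not something from the post
top_lines() {
  local dir f i=0 fifos=()
  dir=$(mktemp -d) || return 1
  for f in "$@"; do
    mkfifo "$dir/p$i" || return 1
    fifos+=("$dir/p$i")
    # one background decompressor per input; each blocks until sort opens its pipe
    gzip -dc "$f" > "$dir/p$i" &
    i=$((i + 1))
  done
  # merge the presorted streams, count duplicates, show the most frequent lines
  sort -m "${fifos[@]}" | uniq -c | sort -nr | head
  wait              # reap the background decompressors
  rm -rf "$dir"     # clean up the named pipes
}
```

usage would be something like top_lines run-*.gz. keeping the pipes in their own temp directory means the cleanup can't accidentally hit other files that happen to start with p.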