Sunday, July 22, 2012

xargs and race conditions

I tend to use xargs as a quick parallelizing tool. It works great. However, because several processes run at once, output from one process can overlap with output from another, creating an unreadable mess. As an example, say I want to ping 6 addresses. I can do it serially at 1 ping/sec for a total of 6 seconds, which is linear growth, or O(n):
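A minimal sketch of the serial case. The function name `ping_serial` and the loopback addresses are placeholders, not taken from the original run, and `-c 2` is an assumption chosen because ping waits about a second between packets by default, so each host costs roughly one second.

```shell
#!/bin/sh
# Hypothetical list of six addresses; substitute your own.
HOSTS="127.0.0.1 127.0.0.2 127.0.0.3 127.0.0.4 127.0.0.5 127.0.0.6"

# Ping each host in turn. Nothing overlaps, but the total time
# grows linearly with the number of hosts.
ping_serial() {
    for host in "$@"; do
        ping -c 2 "$host"
    done
}

# Usage (word splitting on $HOSTS is intentional):
#   time ping_serial $HOSTS
```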

How do you get around the linear run time? We can run everything in parallel using xargs.
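A rough sketch of the parallel version, assuming an xargs that supports `-P` (GNU and BSD xargs both do). `ping_parallel` is a made-up name, and `-P 6` simply matches the six addresses in this example.

```shell
#!/bin/sh
# Feed one host per line to xargs.
#   -n 1  pass a single host to each ping invocation
#   -P 6  run up to six invocations at the same time
ping_parallel() {
    printf '%s\n' "$@" | xargs -n 1 -P 6 ping -c 2
}

# Usage:
#   time ping_parallel 127.0.0.1 127.0.0.2 127.0.0.3
```

All the ping processes write to the same terminal, so nothing coordinates whose lines come out when.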

Note that the time taken is markedly reduced, but the output from each process is now interleaved with the output of the others.

How do you get around that? Well, locking. Luckily, there is a nifty locking utility you can call from the shell (it ships with util-linux) that is just perfect for this. Enter flock.
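One way this could look, as a sketch rather than the exact command from the original run: each worker pings in parallel and buffers its output, then takes the lock only while printing. The function name `ping_locked` and the idea of a hosts file with one address per line are assumptions for illustration. The file being read doubles as the lock file.

```shell
#!/bin/sh
# $1 is a file with one address per line (hypothetical, e.g. hosts.txt).
# Inside the sh -c script, $0 is that file and $1 is the host appended
# by xargs.
ping_locked() {
    xargs -n 1 -P 6 sh -c '
        # The pings still run concurrently; only their output is buffered.
        out=$(ping -c 2 "$1")
        # Hold an exclusive lock on the hosts file just long enough to print.
        flock "$0" printf "%s\n" "$out"
    ' "$1" < "$1"
}

# Usage:
#   time ping_locked hosts.txt
```

Wrapping ping itself in flock would also tidy the output, but it would serialize the pings and throw away the speedup, which is why the lock here covers only the print step.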

Still as fast, and much neater. See man flock for more awesomeness. The flock utility uses the flock(2) syscall to acquire a lock on an open file. Note that the lock file can be the very file you are reading your input from. The lock is advisory, so flock doesn't prevent you from conducting other file ops on your fd.