DMTCP is too long for a rap-star name

…but it’s not too long for an awesome acronym that can help big-time, power-hungry computational programs do their jobs better!

Gene Cooperman of the College of Computer and Information Sciences began tinkering with parallel computing over a decade ago, exploring the possibility of using 10 computers to do in 1 hour the job 1 computer can do in 10 hours. In 2009 he used this method to solve a Rubik’s Cube in a record 26 moves.

But he kept coming across the same problem: Say 10 computers have spent the last six days combining their power to run a program that sifts through thousands of molecules, searching for that needle-in-a-haystack cancer drug. One of those computers is mine, and I suddenly decide that it’s critical for me to play solitaire ASAP, so I pull my computer out of the network. Well… I just screwed everything up. The last six days’ worth of work goes down the drain.

Wouldn’t it be nice if I could just pause the program for a couple of minutes to get my solitaire fix, and then let the program continue molecule mining where it left off?

Programmers can write code that enables “checkpoint-restart” capabilities to do this job, but they need to do it every time a pause is anticipated. Before I play solitaire, I’d need to write some new code.
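For the curious, here is roughly what that hand-written checkpoint-restart code might look like. This is only a minimal sketch, not anything from Cooperman’s group: the molecule screen, the `looks_promising` test, the JSON checkpoint file and every function name below are made up for illustration. The idea is simply that the program saves its progress after each step, so a later run can pick up where the last one stopped.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write the state to a temp file, then swap it in, so a crash
    in the middle of a write cannot corrupt the last good checkpoint."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh one if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next_index": 0, "hits": []}


def looks_promising(molecule):
    """Hypothetical stand-in for the expensive drug-screening test."""
    return molecule % 7 == 0


def screen_molecules(molecules, path, stop_after=None):
    """Screen molecules in order, checkpointing after each one.

    stop_after simulates someone pulling the plug after that many
    molecules have been processed in this run.
    """
    state = load_checkpoint(path)
    processed = 0
    while state["next_index"] < len(molecules):
        if stop_after is not None and processed >= stop_after:
            return state  # paused; progress is already on disk
        molecule = molecules[state["next_index"]]
        if looks_promising(molecule):
            state["hits"].append(molecule)
        state["next_index"] += 1
        save_checkpoint(path, state)
        processed += 1
    return state


if __name__ == "__main__":
    checkpoint = os.path.join(tempfile.mkdtemp(), "ckpt.json")
    molecules = list(range(1, 30))
    # First run is interrupted (solitaire break) after 10 molecules.
    screen_molecules(molecules, checkpoint, stop_after=10)
    # Second run resumes from the checkpoint instead of starting over.
    final_state = screen_molecules(molecules, checkpoint)
    print(final_state)
```

Notice that the save-and-resume logic is tangled up with the program’s own data: a different program would need its own version of all of this.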

Enter DMTCP, which stands for Distributed MultiThreaded CheckPointing (DMC would have been an easier acronym to remember, but it was already taken). What the heck does that mean? I asked Cooperman a similar question yesterday.

The program hovers in the background of the molecule miner (or any other program) and stealthily checkpoints, or saves, the state of affairs. When I come in to play solitaire, DMTCP forces the molecule miner to stop, save everything and wait till I’m done. When I decide it’s time to stop procrastinating, the molecule miner starts up again from the checkpoint.

DMTCP is the most widely used checkpoint-restart program of its kind because it can run transparently in the background without interrupting anything. Also, said Cooperman yesterday, “Our main goal is to be completely general purpose.” It is flexible in that it can be used with a broad range of program types.

Five years ago, said Cooperman, there was no program like this, so researchers across a variety of fields simply couldn’t explore certain questions. “Now there’s a new tool and now you show it to people they say ‘oh, I have something else I can use this tool for,’ completely different from what you developed it for.”

So, essentially, DMTCP has the potential to enable completely novel investigations using a variety of computational techniques.