There was a time in ancient computer history when a computer had only one CPU. Today, your computer may still have only a single physical CPU, but that one CPU has multiple cores for data processing. When you run a command, you owe it to the brave sysadmins of the past to actually put all those cores to good use. One way to honour those who suffered on single-core machines is to use GNU Parallel, the seemingly magical command parser that can execute a task on several files *at once*.

Install parallel {#_install_parallel}

On CentOS, RHEL, and Fedora, you can install GNU Parallel from your software repository, for example with `sudo dnf install parallel` (or `yum` in place of `dnf` on older releases).

On CentOS and RHEL, you can sometimes find the latest version from EPEL instead.

First launch {#_first_launch}

The first time you use GNU Parallel, you're asked to agree to cite your use of `parallel` in scientific research.

Read the notice and follow the instructions to silence the reminder (typically by running `parallel --citation` once and confirming when prompted).

Piping output to parallel {#_piping_output_to_parallel}

Assuming you're already familiar with the `find` command, one of the easiest ways to get started with GNU Parallel is to feed it the results of `find`. For instance, suppose you want to manually archive some log files (ignore, for the moment, that you're using `logrotate` or a similar tool in real life).

You may already know how to find old files. For instance, this command finds files that haven't been modified in 24 hours times 30 (that's approximately a month):
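As a minimal sketch of such a search, the following builds a throwaway directory so it can be tried safely. The directory and file names are hypothetical stand-ins for a real log location such as `/var/log`, and the backdating relies on GNU `touch`.

```shell
# Throwaway sandbox standing in for a real log directory (hypothetical names).
logdir=$(mktemp -d)
touch -d '40 days ago' "$logdir/old.log"   # GNU touch: backdate the file
touch "$logdir/fresh.log"                  # modified just now

# -mtime +30 matches files whose age, counted in whole 24-hour periods,
# is greater than 30.
old_logs=$(find "$logdir" -type f -name '*.log' -mtime +30)
echo "$old_logs"
```

Only the backdated file should be listed. Against a real system, you would point `find` at your actual log directory instead of the sandbox.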

You could take each result of `find` and hand it to `tar`, either with `-exec` or through a pipe, to create an archive. But it's just as easy, and maybe even noticeably faster (depending on the size of the log files), to use `parallel` instead:
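One way this might look, again in a disposable sandbox with hypothetical file names. The `tar` options are an assumption about how you want the archives built, so adjust them to taste.

```shell
# Sandbox with two "old" log files (hypothetical names; GNU touch assumed).
logdir=$(mktemp -d)
touch -d '40 days ago' "$logdir/a.log" "$logdir/b.log"

# find prints one path per line; parallel starts one tar process per path,
# substituting the path wherever {} appears.
find "$logdir" -type f -name '*.log' -mtime +30 |
  parallel tar --create --gzip --file {}.tar.gz {}

ls "$logdir"
```

Each log file should end up with a matching `.tar.gz` next to it. `tar` may warn on stderr that it strips the leading `/` from member names, which is harmless here.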

In this code, the braces (`{}`) stand in for *each result of find*, one path per job.

Parallel syntax {#_parallel_syntax}

While `find` can act as a convenient "front end" for Parallel, you can also just use the `parallel` command to construct processes. The concept is simple, although the logic can sometimes get complex, depending on how many tasks you want to run. Starting out simple, here's a basic `parallel` command:
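A minimal form consistent with the description that follows, with the output captured in a variable (`out`, a name chosen here for illustration) so it can be inspected:

```shell
# Command on the left of :::, arguments on the right.
# Each argument becomes its own job: "echo hello" and "echo world".
out=$(parallel echo ::: hello world)
echo "$out"
```

If the order matters, adding `--keep-order` makes the output follow the order of the arguments rather than the order in which jobs finish.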

The instruction, as you may be able to tell, is separated by three colons (`:::`), with the command on the left and the arguments on the right. If you try that command, you might get `hello world` or `world hello` back, depending on which process manages to complete first.

Suppose you want to convert some large media files. Instead of encoding the files one after another, you can use GNU Parallel to launch separate instances of your encoder, each one targeting a different codec:
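A hedged sketch of the idea using `ffmpeg` as the encoder. The input file `input.mov`, the output naming, and the three codec names are all assumptions for illustration, and the codecs require an `ffmpeg` build that includes those encoders. The `--dry-run` option makes `parallel` print each command instead of running it, so nothing is encoded until you remove it.

```shell
# input.mov is a hypothetical source file; codec names are examples only.
# --dry-run prints the commands parallel would run, one per codec.
plan=$(parallel --dry-run ffmpeg -i input.mov -c:v {} input-{}.mkv ::: libx264 libx265 libvpx-vp9)
echo "$plan"
```

Once the printed commands look right, drop `--dry-run` to start the encoders for real.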

Multiple variables {#_multiple_variables}

Parallel isn't limited to just one `{}` variable. You can create several inputs, and then define them by an index number reflecting the order they're listed. Compare this output:
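A sketch with two input sources, each introduced by its own `:::`. The `--keep-order` option is added here only to make the output order predictable, and `out` is an illustrative variable name.

```shell
# Two input sources; parallel runs the command once per combination.
out=$(parallel --keep-order echo {1} {2} ::: hello Linux ::: world sysadmin)
echo "$out"
# With --keep-order this prints:
#   hello world
#   hello sysadmin
#   Linux world
#   Linux sysadmin
```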

In this code sample, `{1}` indicates the first "block" of input (`hello` and `Linux`), while `{2}` indicates the second "block" (`world` and `sysadmin`). They don't have to appear in that order, nor are they limited to a single use:
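For instance, this sketch puts `{2}` first and uses it twice. As before, `--keep-order` and the variable name `out` are choices made for illustration.

```shell
# {2} appears before {1} and is reused; the inputs are the same two blocks.
out=$(parallel --keep-order echo {2} {1} {2} ::: hello Linux ::: world sysadmin)
echo "$out"
# The first line printed is: world hello world
```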

Parallel processing {#_parallel_processing}

They say that with great power comes great responsibility, but ideally with great power also comes great parallelization. The computer in front of you right now is probably more powerful than what you need most of the time, so you may as well make your everyday commands faster by taking advantage of otherwise wasted cycles. Use GNU Parallel.

Proxied content from gemini://sdf.org/klaatu/geminifiles/sysadmin-parallel.gmi (external content)
