Preventing Shell Injections

Everyone and their dog knows about SQL injection. Shell injection is less popular but still worth talking about.

We’ll focus on Python, but the general concepts apply to most programming languages.

Anatomy of a Shell Injection

The idiom is really similar to SQL injection: untrusted input gets pasted straight into a command string. Like so:

#!/usr/bin/env python3
import os

name = input('Enter your name: ')
os.system('echo "Your name is %s"' % name)

All is well and good until Bobby Tables’ cousin Robert"; rm -rf ~; # comes along, then everything goes sideways.
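
To see why, it helps to print the command string instead of running it. This is a minimal sketch; build_command is a hypothetical helper that only mirrors the string formatting from the script above and never executes anything.

```python
def build_command(name):
    # Same formatting as the os.system() call above, minus the execution.
    return 'echo "Your name is %s"' % name

# An ordinary name produces the command we expect.
print(build_command('Alice'))

# The hostile "name" closes the double quote, ends the echo command with ';',
# adds a command of its own, and hides the leftover quote behind '#'.
print(build_command('Robert"; rm -rf ~; #'))
```

The second line printed is two commands as far as the shell is concerned: the echo, and whatever the attacker typed.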

Why Does This Matter?

SQL injections and XSS make headlines because there’s a big attack surface. Shell invocations, in contrast, tend to be disconnected from web or API input, if they’re on the same machine at all. You might write a script peppered with os.system() for personal use on your own laptop. Why worry about injection if you’re the only user?

The short answer: it’s bad practice. Good habits are like insurance: by the time you know you need them, it’s too late.

Preventions

№ 0: Use Internal Methods

When practical, this is the best solution – just don’t use external commands.

For our toy script this is super easy:

#!/usr/bin/env python3

name = input('Enter your name: ')
print("Your name is %s" % name)

№ 1: Ditch the Shell

Switching over to internal methods isn’t always so easy. Consider this function:

#!/usr/bin/env python3
import os
# ...
def save_url(url, dest):
    err = os.system("curl -sS -- '%s' >> '%s'" % (url, dest))
    if err:
        raise RuntimeError("Failed to download %s!" % url)

You could replace curl with a Python library like urllib or requests, but that could be a lot of work if, say, you need to perfectly replicate the old error behavior, stderr content and all. (Don’t you just love legacy code?)

A little shucking goes a long way:

#!/usr/bin/env python3
import subprocess
# ...
def save_url(url, dest):
    with open(dest, 'a') as wfh:
        err = subprocess.call(['curl', '-sS', '--', url], stdout=wfh)

    if err:
        raise RuntimeError("Failed to download %s!" % url)

This directly invokes curl without using a shell. Python handles command construction and output redirection itself.
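
A quick way to convince yourself: hand a hostile-looking string to a child process as a list element and check what it receives. This sketch uses the current Python interpreter as a harmless stand-in for curl, and echo_argument is a hypothetical helper made up for the demonstration.

```python
import subprocess
import sys

def echo_argument(arg):
    # Run a child Python process that writes its first argument back verbatim.
    # No shell is involved, so nothing in arg is parsed as shell syntax.
    child = 'import sys; sys.stdout.write(sys.argv[1])'
    result = subprocess.run([sys.executable, '-c', child, arg],
                            capture_output=True, text=True, check=True)
    return result.stdout

payload = "'; echo INJECTED; #"
print(echo_argument(payload) == payload)  # True
```

The payload arrives as one plain argument, quotes and semicolons included.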

In a lot of cases you can use -o <file> (or similar) to dispense with redirection entirely.
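
For curl, that might look like the sketch below. curl_argv is a hypothetical helper, split out so the argument list is easy to inspect. One caveat: -o overwrites the destination, so this is not a drop-in replacement for the appending >> used earlier.

```python
import subprocess

def curl_argv(url, dest):
    # -o names the output file; '--' ends option parsing, so a url that
    # starts with '-' is not mistaken for an option.
    return ['curl', '-sS', '-o', dest, '--', url]

def save_url(url, dest):
    # No shell and no Python-side file handle; curl writes dest itself.
    err = subprocess.call(curl_argv(url, dest))
    if err:
        raise RuntimeError("Failed to download %s!" % url)
```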

№ 2: Fix the Shell

While the above methods are elegant, converting existing os.system() calls is non-trivial. Consider this function:

#!/usr/bin/env python3
import os
# ...
def save_url(url, dest):
    cmd = "set -o pipefail; curl -sS -- '%s' | tr -d '\r' | gzip -c > '%s'"

    err = os.system(cmd % (url, dest))
    if err:
        raise RuntimeError("Failed to download %s!" % url)

Now how do we fix that?

In your mind you can start working on how to do this without a shell.

curl to a temp file is simple enough… but then gzip can’t start until curl is done… can I use python to pipe curl straight to tr and then to gzip? … maybe urllib or requests provides a read handle I can give to tr… or should I replicate tr internally with str.replace() and then push the result to gzip… do I need multiple threads for this? …

What. A. Headache.

Thankfully there’s a way to keep the shell but remove the injection vulnerability.

Consider the value of cmd in the function above. It’s not just a system command, it’s also a tiny program. We can rewrite it as such:

#!/bin/sh
set -o pipefail

curl -sS -- '%s' | tr -d '\r' | gzip -c > '%s'

Or more accurately, it’s a template for a tiny program. Each call to save_url() creates and runs a different variant based on the arguments given.

We can prevent injection by removing the %s formatting and making it the same every time:

#!/bin/sh
set -o pipefail

curl -sS -- "$1" | tr -d "\r" | gzip -c > "$2"

Then we use subprocess.call() to invoke this with arguments.

#!/usr/bin/env python3
import subprocess
# ...
def save_url(url, dest):
    cmd = 'set -o pipefail; curl -sS -- "$1" | tr -d "\r" | gzip -c > "$2"'

    err = subprocess.call([cmd, '', url, dest], shell=True)
    if err:
        raise RuntimeError("Failed to download %s!" % url)

From the shell’s perspective, $0 (unused) is empty, $1 is the url, and $2 is the dest.
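
The same mechanism can be tried out with a harmless script. This is a sketch assuming a POSIX system; shell_echo is a hypothetical helper, and printf stands in for the curl pipeline.

```python
import subprocess

def shell_echo(value):
    # The script text is constant; value only ever reaches the shell as "$1".
    script = 'printf "%s" "$1"'
    result = subprocess.run([script, 'label', value], shell=True,
                            capture_output=True, text=True, check=True)
    return result.stdout

payload = '"; echo INJECTED; #'
print(shell_echo(payload) == payload)  # True
```

The shell expands "$1" to the payload but never re-parses the result as code, so the quote and semicolons stay data.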

A Word of Caution

The above mitigations fix arbitrary shell injections, but there may still be other vulnerabilities.

For example: save_url() downloads an arbitrary url and writes it to an arbitrary file path, and it does this by design. dest could be ~/.bashrc or /etc/passwd and the function would (try to) write to it without a second thought.
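
If that matters for your program, one possible guard is to confine dest to a single directory before calling save_url(). The sketch below is only an illustration under that assumption; resolve_inside and the /tmp/downloads directory are hypothetical names, not part of the original function.

```python
import os

def resolve_inside(base_dir, dest):
    # Resolve symlinks and '..' first, then check the result is still under base_dir.
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, dest))
    if os.path.commonpath([base, target]) != base:
        raise ValueError('Destination escapes %s: %s' % (base_dir, dest))
    return target

for candidate in ('page.html', '../../etc/passwd', '/etc/passwd'):
    try:
        print('ok:', resolve_inside('/tmp/downloads', candidate))
    except ValueError as exc:
        print('rejected:', exc)
```

This only narrows where the function can write. Whether the url itself should be trusted is a separate question.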

Bad Ideas

These can work if you have no other choice but each has its own issues.

Using Escapes

Here’s an example using shlex:

#!/usr/bin/env python3
import os, shlex
# ...
def save_url(url, dest):
    ## escape url and dest
    url  = shlex.quote(url)
    dest = shlex.quote(dest)

    err = os.system("curl -sS -- '%s' >> '%s'" % (url, dest))
    if err:
        raise RuntimeError("Failed to download %s!" % url)

The problem is, it’s just too easy to do it wrong. The above looks ok at first glance, and will work 90% of the time, but it’s still vulnerable to injection.

This is because we used '%s' instead of %s. shlex.quote() works largely by wrapping things in single quotes, which our single quotes cancel out.

If url is example.com or blog.haxxor.io then this doesn’t matter. But if url is ; rm -rf ~ # then shlex.quote(url) is '; rm -rf ~ #' and our shell command becomes

curl -sS -- ''; rm -rf ~ #'' >> 'some-dest'

In a way this is a skill issue – do it right and the problem won’t exist – but it’s too precarious. One mistake and you have a largely-silent vulnerability.
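
For comparison, here are the broken and the working templates side by side. Both functions only build the command string and run nothing; broken_command and quoted_command are hypothetical names for this sketch.

```python
import shlex

def broken_command(url, dest):
    # Quoted values dropped into a template that adds quotes of its own.
    return "curl -sS -- '%s' >> '%s'" % (shlex.quote(url), shlex.quote(dest))

def quoted_command(url, dest):
    # Quoted values dropped into a template with no quotes of its own.
    return "curl -sS -- %s >> %s" % (shlex.quote(url), shlex.quote(dest))

hostile = '; rm -rf ~ #'
print(broken_command(hostile, 'some-dest'))
print(quoted_command(hostile, 'some-dest'))

# shlex.split() approximates how a POSIX shell would tokenize the second string:
# the hostile text stays a single argument.
print(shlex.split(quoted_command(hostile, 'some-dest'))[3] == hostile)
```

The only difference between the two is a pair of quote characters in the template, which is exactly why this approach is so easy to get wrong.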

Disallowing Characters

You can do a surprising amount of injection defense just by disallowing single-quotes in the arguments.

#!/usr/bin/env python3
import os
# ...
def save_url(url, dest):
    if "'" in url:
        raise RuntimeError('Invalid URL "%s"!' % url)

    if "'" in dest:
        raise RuntimeError('Invalid destination "%s"!' % dest)

    err = os.system("curl -sS -- '%s' >> '%s'" % (url, dest))
    if err:
        raise RuntimeError("Failed to download %s!" % url)

This works because shells generally don’t perform parsing within single-quotes. The only parsed character is the second single-quote that ends the string. Not even backslashes are special.
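
You can check the single-quote behavior with a harmless command. This sketch assumes a POSIX /bin/sh; sh_output is a hypothetical helper, and the script is built from a constant string, never from user input.

```python
import subprocess

def sh_output(script):
    # Run a constant script under /bin/sh and return what it printed.
    result = subprocess.run(['/bin/sh', '-c', script],
                            capture_output=True, text=True, check=True)
    return result.stdout

# Variables, backticks, $(...), backslashes and double quotes, all inside single quotes.
literal = r'$HOME `id` $(id) \n "quoted"'
print(sh_output("printf '%s' '" + literal + "'") == literal)  # True
```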

Two reasons this is a bad idea:

You may need to have single-quotes in your arguments at some point. It’s unlikely for a URL or file path, but there are plenty of other scenarios out there. Just ask Python core developer Steven D’Aprano.
The shell may have a bug which lets you break out of single quotes, e.g. with an exceedingly long string. You could argue it’s not your responsibility to compensate for a faulty shell – and you may be right – but why tempt fate?

Using Environment Variables

This is a shocking one (heh). What’s so unsafe here?

#!/usr/bin/env python3
import os, subprocess
# ...
def save_url(url, dest):
    cmd = 'set -o pipefail; curl -sS -- "$URL" | tr -d "\r" | gzip -c > "$DEST"'

    err = subprocess.call(cmd, shell=True, env={**os.environ, 'URL': url, 'DEST': dest})
    if err:
        raise RuntimeError("Failed to download %s!" % url)

The answer is that older versions of bash put a little too much trust in environment variables. In particular, bash could parse them as function definitions… and any code trailing the function definition was immediately executed.

This is Shellshock. It was introduced in 1989 and finally discovered and patched in 2014.

In the above code, Shellshock can be invoked on a vulnerable system by setting url or dest to () { :;}; rm -rf ~ or similar.

The real solution is to stay up-to-date on security patches. But it’s also wise to avoid invoking old vulnerabilities if possible. Calling the shell command with arguments (№ 2: Fix the Shell) provides similar functionality without any shocking surprises.

Choosing Your Shell

Across various languages and protocols, “running a shell command” involves invoking /bin/sh -c <cmdline> or similar.

As a result, these two are usually equivalent:

# run a command with shell
subprocess.call([cmdline, label, *args], shell=True)

# invoke the shell yourself
subprocess.call(['/bin/sh', '-c', cmdline, label, *args])
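
Here is a small check of that equivalence, assuming a POSIX system where Python’s shell=True uses /bin/sh. The cmdline, label and args values are placeholders for the demonstration.

```python
import subprocess

# A harmless command line that reports what the shell sees as $0 and $1.
cmdline = 'printf "%s|%s" "$0" "$1"'
label, args = 'label', ['first arg']

with_shell = subprocess.run([cmdline, label, *args], shell=True,
                            capture_output=True, text=True)
explicit = subprocess.run(['/bin/sh', '-c', cmdline, label, *args],
                          capture_output=True, text=True)

print(with_shell.stdout == explicit.stdout)  # True
```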

Side note: Ruby automatically disables the shell if you provide multiple arguments, so it only supports the second form:

# invoke the shell yourself in ruby
system('/bin/sh', '-c', cmdline, label, *args)

On some systems (Debian and Ubuntu, for example) /bin/sh is a small POSIX shell such as dash, while on others (macOS by default, and several Linux distros) it’s bash running in sh-compatibility mode. As a result, bashisms like <(...) will work on some systems but not others. This is extra confusing if both systems have bash installed.

But with the second form, you’re in control of which shell is used. If you need to use bashisms you can use /bin/bash rather than relying on /bin/sh being bash in disguise.
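
A sketch of that second form with bash picked explicitly. run_bash is a hypothetical helper; it looks bash up on PATH with shutil.which() rather than hard-coding /bin/bash, which is an assumption about your setup.

```python
import shutil
import subprocess

def run_bash(cmdline, *args):
    # Find bash explicitly instead of hoping /bin/sh happens to be bash.
    bash = shutil.which('bash')
    if bash is None:
        raise RuntimeError('bash is not installed')
    return subprocess.run([bash, '-c', cmdline, 'label', *args],
                          capture_output=True, text=True)

if shutil.which('bash'):
    # Arrays are a bashism; a strict POSIX sh would reject this script.
    result = run_bash('parts=("$@"); echo "${#parts[@]}"', 'a b', 'c')
    print(result.stdout.strip())  # 2
```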

Final Thoughts

Shell commands make their way into lots of programs, be they Python, Ruby, PHP, etc. Sometimes it’s just the way the programmer knows how to do it; other times it’s really the only practical way to do something.

This makes me slightly uneasy because shells were not originally made to deal with untrusted input – at least, not at the level we see today. Shellshock exists because developers in the 80s did not anticipate malicious environment variables.

Compounding this, it seems many languages want you to do it the wrong way. Python has os.system(), Ruby has %x(), PHP has system(), all of which run a shell command with no arguments. If you want to disable the shell, or run a shell command with arguments, you suddenly have to use a different set of commands, and conventions across languages vary.

Shells are old and powerful. Use with care.

Created:	2020-12-21
Updated:	2023-08-06
Tags:	bash, python, security