
Why is printf better than echo in Shell Scripting?

Basically, it’s a portability (and reliability) issue.

Initially, echo didn’t accept any option and didn’t expand anything. All it was doing was outputting its arguments separated by a space character and terminated by a newline character.

Now, someone thought it would be nice if we could do things like echo "\n\t" to output newline or tab characters, or have an option not to output the trailing newline character.

They then thought harder but instead of adding that functionality to the shell (like perl where inside double quotes, \t actually means a tab character), they added it to echo.

David Korn realized the mistake and introduced a new form of shell quotes: $'...' which was later copied by bash and zsh but it was far too late by that time.

Now when a standard Unix echo receives an argument which contains the two characters \ and t, instead of outputting them, it outputs a tab character. And as soon as it sees \c in an argument, it stops outputting (so the trailing newline is not output either).

Other shells/Unix vendors chose to do it differently: they added a -e option to expand escape sequences, and a -n option to not output the trailing newline. Some have a -E to disable escape sequences, some have -n but not -e, and the list of escape sequences supported by one echo implementation is not necessarily the same as that supported by another.

Sven Mascheck has a nice page that shows the extent of the problem.

On those echo implementations that support options, there’s generally no support for a -- to mark the end of options (zsh and possibly others support - for that though), so for instance, it’s difficult to output "-n" in many shells.
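To illustrate the contrast, here is a minimal sketch of printing option-like strings with printf, where the data is passed as an argument to a fixed format and so is never parsed as an option. The helper name print_line is hypothetical and only used for this example:

```shell
# print_line is a hypothetical helper: it prints its single argument
# verbatim, followed by a newline, even when it looks like an option.
print_line() {
  printf '%s\n' "$1"
}

print_line -n   # outputs the two characters -n and a newline
print_line -e   # outputs -e and a newline
print_line --   # outputs -- and a newline
```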

On some shells like bash (see note 1) or ksh93 (see note 2), the behavior even depends on how the shell was compiled or the environment. So two bash echos, even from the same version of bash, are not guaranteed to behave the same.

POSIX says: if the first argument is -n or any argument contains backslashes, then the behavior is unspecified. bash’s echo is not POSIX-conformant in that regard: for instance, echo -e does not output -e<newline> as POSIX requires. The Unix specification is stricter: it prohibits -n and requires expansion of some escape sequences, including the \c one to stop outputting.

Those specifications don’t really come to the rescue here given that many implementations are not compliant.

All in all, you don’t know what echo "$var" will output unless you can make sure that $var doesn’t contain backslash characters and doesn’t start with -. The POSIX specification actually does tell us to use printf instead in that case.

So what that means is that you can’t use echo to display uncontrolled data. In other words, if you’re writing a script and it is taking external input (from the user as arguments, or file names from the file system…), you can’t use echo to display it.

This is OK:

echo >&2 Invalid file.

This is not:

echo >&2 "Invalid file: $file"

(Though it will work OK with some (non-Unix) echo implementations like bash’s when the xpg_echo option has not been enabled in one way or another, such as at compilation time or via the environment.)

printf, on the other hand, is more reliable, at least when it’s limited to the basic usage of echo.

 printf '%s\n' "$var"

Will output the content of $var followed by a newline character regardless of what character it may contain.

 printf %s "$var"

Will output it without the trailing newline character.
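Applied to the error message from earlier, this could look like the following sketch. The function name warn_invalid and the sample file name are hypothetical; the point is only that the file name goes in as an argument to %s and never into the format:

```shell
# warn_invalid is a hypothetical helper name used for illustration.
# The file name is an argument to %s, so backslashes and leading
# dashes in it are written to stderr unchanged.
warn_invalid() {
  printf >&2 'Invalid file: %s\n' "$1"
}

# A deliberately awkward (made-up) file name.
warn_invalid '-n\tweird\cname'
```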

Now, there are also differences between printf implementations. There’s a core of features that is specified by POSIX, but then there are a lot of extensions. For instance, some support a %q to quote the arguments but how it’s done is shell specific, and some support \uxxxx for Unicode characters. The behavior varies for printf '%10s\n' "$var" in multibyte locales, and there are at least three different outcomes of printf %b '\123'.

But in the end, if you stick to the POSIX feature set of printf and don’t try doing anything fancy with it, you’re out of trouble.

But remember the first argument is the format, so it shouldn’t contain variable/uncontrolled data.
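A small sketch of why that matters, using made-up sample data that happens to contain format directives. The names unsafe_print and safe_print are hypothetical:

```shell
# Sample data (hypothetical) containing a conversion and an escape.
var='100%s done\n'

# unsafe_print uses the data as the format: %s and \n are interpreted.
unsafe_print() {
  printf "$1"
}

# safe_print keeps the format fixed and passes the data as an argument.
safe_print() {
  printf '%s\n' "$1"
}

unsafe_print "$var"   # the %s is consumed and the \n becomes a newline
safe_print "$var"     # the data is printed exactly as stored
```

With the fixed format, whatever ends up in the variable cannot change how printf interprets its arguments.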

A more reliable echo can be implemented using printf, like:

echo() ( # forking for local scope for $IFS
  IFS=" " # needed for "$*"
  printf '%s\n' "$*"
)

echo_n() (
  IFS=" "
  printf %s "$*"
)

echo_e() (
  IFS=" "
  printf '%b\n' "$*"
)

The fork can be avoided using local IFS on Linux (the LSB specification mandates local for Linux sh), or by writing it like:

echo() {
  if [ "$#" -gt 0 ]; then
     printf %s "$1"
     shift
  fi
  if [ "$#" -gt 0 ]; then
     printf ' %s' "$@"
  fi
  printf '\n'
}
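For completeness, the local IFS variant mentioned above might look like the sketch below. It assumes a shell that provides the non-POSIX local builtin (bash and dash do), and the name echo_local is hypothetical, chosen so as not to shadow the builtin echo:

```shell
# Assumes the shell supports "local" (not specified by POSIX).
# echo_local is a hypothetical name for this illustration.
echo_local() {
  local IFS=" "        # scoped to the function, so no subshell is needed
  printf '%s\n' "$*"   # join all arguments with a single space
}

echo_local -n 'a\tb' c   # arguments are joined and printed verbatim
```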

Notes

1. How bash’s echo behaviour can be altered.

With bash, at run time, there are two things that control the behaviour of echo: the xpg_echo bash option and whether bash is in posix mode. posix mode can be enabled if bash is called as sh, if POSIXLY_CORRECT is in the environment, or with the posix option:

Default behaviour on most systems:

$ bash -c 'echo -n "\0101"'
\0101% # the % here denotes the absence of newline character

xpg_echo expands sequences as Unix requires:

$ BASHOPTS=xpg_echo bash -c 'echo "\0101"'
A

It still honours -n and -e (and -E):

$ BASHOPTS=xpg_echo bash -c 'echo -n "\0101"'
A%

With xpg_echo and POSIX mode:

$ BASHOPTS=xpg_echo POSIXLY_CORRECT=1 bash -c 'echo -n "\0101"'
-n A
$ BASHOPTS=xpg_echo ARGV0=sh bash -c 'echo -n "\0101"' # The ARGV0=sh is to pass argv[0] here
-n A
$ BASHOPTS=xpg_echo SHELLOPTS=posix ARGV0=sh bash -c 'echo -n "\0101"'
-n A
$ BASHOPTS=xpg_echo SHELLOPTS=posix bash -c 'echo -n "\0101"'
-n A

This time, bash is both POSIX and Unix conformant. Note that in POSIX mode, bash is still not POSIX conformant as it doesn’t output -e in:

 $ SHELLOPTS=posix bash -c 'echo -e'

 $

The default values for xpg_echo and posix can be defined at compilation time with the --enable-xpg-echo-default and --enable-strict-posix-default options to the configure script. That’s typically what recent versions of OS/X do to build their /bin/sh. No Unix/Linux implementation/distribution in their right mind would typically do that for /bin/bash though.

2. How ksh93’s echo behaviour can be altered.

In ksh93, whether echo expands escape sequences or not and recognises options depends on the content of $PATH.

If $PATH contains a component that contains /5bin or /xpg before the /bin or /usr/bin component, then it behaves the SysV/Unix way (expands sequences, doesn’t accept options). If it finds /ucb or /bsd first, then it behaves the BSD way (-e to enable expansion, recognises -n). The default is system dependent, BSD on Debian:

$ ksh93 -c 'echo -n' # default -> BSD (on Debian)
$ PATH=/foo/xpgbar:$PATH ksh93 -c 'echo -n' # /xpg before /bin or /usr/bin -> XPG
-n
$ PATH=/5binary:$PATH ksh93 -c 'echo -n' # /5bin before /bin or /usr/bin -> XPG
-n
$ PATH=/ucb:/foo/xpgbar:$PATH ksh93 -c 'echo -n' # /ucb first -> BSD
$ PATH=/bin:/foo/xpgbar:$PATH ksh93 -c 'echo -n' # /bin before /xpg -> default -> BSD

 

Originally written by Stéphane Chazelas
Original post

 
