[prog] bash: string comparisons in conditionals

Tue Apr 20 11:20:51 EST 2004

On Tue, 20 Apr 2004, Riccarda Cassini wrote:
>> >  null == undefined == not initialized == not existing :=
>> >                                 variable doesn't exist at all
>> >                                 or hasn't been assigned anything yet
>> >
>> >  empty := string that doesn't contain anything (zero-length)
>> >
>> >  zero  := numeric value '0'
>> >
>> >Is that correct?
>>
>> There's no one true definition for any of these.  In some contexts, one
>> can be substituted for another.
>
>I see... you mean one can say things like "null" to mean "null-string"
>and so on... Doesn't necessarily make it easier to know what precisely
>people are talking about, but I guess that's natural languages...

Yes.  Another example is null and 0.  In C++, the following can be read
as a test for a null pointer:

	if (ptr != 0) // Test for a null pointer.
		;

Also, I've just re-read the question.  My response applies to null, empty
and 0.  I noticed you also had 'undefined', 'not initialized' and 'not
existing'.  These latter terms have very specific definitions, especially
in C and C++, and most other languages usually define them specifically
as well.  The last term is usually spelled 'non-existing', rather than
'not existing'.  You should never interchange these words.  Otherwise,
you'll likely get confusion as a result.

One other term I should mention is 'not defined'.  At least in C/C++,
this is different from 'undefined'.  Both describe the behavior of some
language construct.  'Not defined' says the construct's behavior was not
specified by the standard, while 'undefined' says the construct's behavior
can be anything (and the standard explicitly says so).

To the programmer, these two terms are basically the same.  But to a
compiler writer, their meaning is very different.

>I thought the quotes primarily serve to prevent the argument from
>disappearing altogether (syntactically) if it's empty.

You're right.  I've just tried some tests and [ -z $a ] works as expected
without the quotes.  As a matter of fact, I've also tried [ x$a == x ],
and this works as well.  So for an empty or unset $a, at least, the quotes
apparently don't do anything except maybe act as a delimiter
(i.e. [ "$a"x == x ]).
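
Here is a rough sketch of those tests, assuming a POSIX-style shell such
as bash.  The helper name `try` is made up for illustration, and I've
written `=` instead of `==` because it is the portable spelling.  The last
two lines go one step beyond what I tried above: they use a value that
contains a space, which is where the quoted and unquoted forms seem to
part ways.

```shell
# try: hypothetical helper that runs a command, hides its error text,
# and prints its exit status (0 = true, 1 = false, >1 = error).
try() {
    if "$@" 2>/dev/null; then echo 0; else echo $?; fi
}

a=""                            # an unset a behaves the same in these tests
r_plain=$(try [ -z $a ])        # [ receives only: -z
r_prefix=$(try [ x$a = x ])     # [ receives: x = x
r_quoted=$(try [ -z "$a" ])     # [ receives: -z ""
echo "empty: plain=$r_plain prefix=$r_prefix quoted=$r_quoted"

a="two words"
unquoted=$(try [ -z $a ])       # [ receives: -z two words (one word too many)
quoted=$(try [ -z "$a" ])       # [ receives: -z "two words"
echo "two words: unquoted=$unquoted quoted=$quoted"
```

For the empty value all three forms report true.  For "two words" the
quoted form is a plain false, while the unquoted form is an error status
(bash complains "binary operator expected").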

>What had me stumped particularly was, that if you write [ -z $a ] and $a
>is empty or undefined, you don't get a syntax error as in [ $a = "" ].

You might want to use the term 'unset', instead of 'undefined'.
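
To make the unset/empty distinction concrete, here is a minimal sketch
using the standard ${a+word} expansion, which yields word only when a is
set (even if set to the empty string).  The function name `state` is
hypothetical.

```shell
# state: hypothetical helper that classifies the variable a.
state() {
    if [ -z "${a+set}" ]; then
        echo unset              # a has never been assigned (or was unset)
    elif [ -z "$a" ]; then
        echo empty              # a exists but holds a zero-length string
    else
        echo non-empty          # a holds at least one character
    fi
}

unset a;  s1=$(state)
a="";     s2=$(state)
a=0;      s3=$(state)           # the string "0" is not empty
echo "$s1 / $s2 / $s3"          # prints: unset / empty / non-empty
```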

As for why -z accepted an unset variable, I can only assume it's because
of the implementation.  With a binary operator, a missing operand makes
the test look like either a unary operator or an unset operand.  Without
knowing which case it is, the shell shouldn't make any assumptions, so it
errors out.  With a unary operator, a missing operand still leaves it a
unary operator, so the shell can safely assume the operand was unset.
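
For what it's worth, POSIX describes test by argument count: with exactly
one argument, the result is true if that argument is not the null string.
If that is the rule in play, then [ -z $a ] with an unset a is really just
testing the word "-z".  The sketch below is one way to check this; `try`
is a hypothetical helper, and the -n case is my addition, not something
from the thread.

```shell
# try: hypothetical helper that runs a command, hides its error text,
# and prints its exit status (0 = true, 1 = false, >1 = error).
try() {
    if "$@" 2>/dev/null; then echo 0; else echo $?; fi
}

unset a
z=$(try [ -z $a ])        # [ sees only "-z": a non-empty word, so true
n=$(try [ -n $a ])        # [ sees only "-n": also true, which is misleading
nq=$(try [ -n "$a" ])     # quoted: [ sees -n "", so false
b=$(try [ $a = "" ])      # [ sees: = ""  (the left operand has vanished)
echo "z=$z n=$n nq=$nq b=$b"
```

Here z and n both come out as 0 (true) and nq as 1 (false), so the
unquoted -z gives the right answer only by coincidence.  The binary case b
is a non-zero error status; bash reports "unary operator expected".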

>In perl, for example, you can write 'if ($a eq "")' without there being
>a problem with $a being empty (sorry for bringing up perl all the time,

I haven't read up on how Perl processes a script.  But if it's anything
like Tcl, then the reference to $var also defines it.  I.e. you don't have
to explicitly define the variable.  The act of using it implicitly defines
it.

>In bash, on the other hand, it seems to me, that variable expansion
>happens *before* the code is passed to the parser.  So, assuming that $a
>holds the value 'test', the parser would see the literal string 'if [
>test = "" ]' (for the above example).  Or, if $a is empty: 'if [ = "" ]'.
>At least, that's what I figured from the error messages I got. I could be
>totally wrong, though.

Sounds right.
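
One way to see what [ actually receives is to count the words after
expansion.  This is only a sketch; `count_args` is a made-up helper that
stands in for [ and prints how many arguments it was given.

```shell
# count_args: hypothetical stand-in for [ that prints its argument count.
count_args() {
    echo "$#"
}

a=""
n_unquoted=$(count_args $a = "")     # empty expansion disappears: = ""
n_quoted=$(count_args "$a" = "")     # quotes keep an empty word: "" = ""
a=test
n_value=$(count_args $a = "")        # test = ""
echo "$n_unquoted $n_quoted $n_value"   # prints: 2 3 3
```

With only two words left, [ has no left operand for =, which matches the
error seen with [ $a = "" ].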

>        Sorry for being repetitive and long-winded [*] - I'm always
>trying to understand the basic principle behind things, so I can work
>out the solution myself next time (ideally :-)

This is actually a trait of a good programmer.  So don't be sorry.

--jc
-- 
Jimen Ching (WH6BRR)      jching at flex.com     wh6brr at uhm.ampr.org