[prog] Strings in C

Mary mary-linuxchix at puzzling.org
Sat Oct 19 07:21:50 EST 2002


On Fri, Oct 18, 2002, k.clair wrote:
> Hello,
> 
> I don't have an answer, but I do have another question :-D
> 
> I've only learned a tiny bit of C, but I was under the impression that
> character arrays had to have the \0 character at the end of them in
> order to be treated like a string.  Well, this confuses me ... do they
> *have* to end with \0, or is it just convenient to have at the end for
> working with them?

Many of the string library functions use the \0 character to detect the
end of the string. So strlen is implemented something like

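#include <stddef.h> /* for size_t */

/* called my_strlen so it does not clash with the library's strlen */
size_t my_strlen(const char *string) {
        size_t length = 0;

        while (*string != '\0') { // *string is the current character
                string++; // advance the pointer to the next character
                length++;
        }

        return length;
}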

// apologies if this code has an off-by-one error :(

or with array notation

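#include <stddef.h> /* for size_t */

/* same idea, indexing the array instead of moving the pointer */
size_t my_strlen_indexed(const char *string) {
        size_t length = 0;

        while (string[length] != '\0') {
                length++;
        }

        return length;
}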
                

[There are shorter implementations, but these have fewer side effects
and should be easy to follow.]
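
For comparison, here is a rough sketch of what one of those shorter
versions might look like. The name short_strlen is made up for this
example, and this is not how any particular C library does it. Note how
the ++ inside the loop condition steps one place past the '\0', which is
exactly the kind of side effect that invites off-by-one mistakes:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical compact variant: walk a second pointer to the end and
   subtract, instead of keeping a separate counter. */
size_t short_strlen(const char *string)
{
        const char *end = string;

        assert(string != NULL);

        while (*end++)  /* test the current char, then step past it */
                ;

        /* end now points one past the '\0', hence the - 1 */
        return (size_t)(end - string) - 1;
}
```
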

In C, constructions like string++ will quite happily run right off the
end of an allocated piece of memory. Well, quite happily until you get a
segfault, which may happen when you attempt to read from or write to
memory you don't own. (Strictly speaking the behaviour is undefined, so a
crash is likely but not guaranteed, and a stray write may silently
corrupt other data instead.)

So '\0' is the accepted signal for the end of a string, partly to avoid
segfaults, and partly because the string "hello" might be sitting in 512
bytes of allocated memory, and the terminating '\0' tells us that it's
only a 5-character string (occupying 6 bytes once you count the '\0').
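
To make the 512-byte case concrete, here is a small sketch. The helper
name length_of_hello_in is invented for the example, and the 'x' fill is
only there to stand in for whatever stale data the buffer might hold.
The point is that strlen stops at the first '\0' and never looks at the
rest of the buffer:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fills the buffer with junk, stores "hello" at
   the front, and reports the length that strlen sees. */
size_t length_of_hello_in(char *buffer, size_t capacity)
{
        /* need room for 5 characters plus the terminating '\0' */
        assert(capacity >= sizeof "hello");

        memset(buffer, 'x', capacity);           /* pretend stale data */
        memcpy(buffer, "hello", sizeof "hello"); /* 6 bytes, '\0' included */

        return strlen(buffer);
}
```

Called with a char buffer[512], this returns 5 even though the buffer
holds 512 bytes, and buffer[5] is the '\0' that marks the end.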

You don't *have* to use it, but you'd have to reimplement whatever
string functions you want to use, and come up with a similar system
anyway (storing the length alongside the characters, for example) to
solve the same problems.
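
As an illustration of what such a "similar system" could be, here is a
minimal sketch of a string that stores its length explicitly instead of
relying on '\0'. Everything in it (struct counted_string, counted_set,
counted_length, the 512-byte capacity) is made up for this example and
is not part of the standard library:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counted string: the length lives in a field, so no
   terminating '\0' is needed. */
struct counted_string {
        size_t length;   /* number of characters in use */
        char data[512];  /* fixed capacity, arbitrary for this sketch */
};

/* Stores count bytes from source. Returns 0 on success, or -1 if the
   bytes would not fit. */
int counted_set(struct counted_string *s, const char *source, size_t count)
{
        assert(s != NULL && source != NULL);

        if (count > sizeof s->data)
                return -1;

        memcpy(s->data, source, count);
        s->length = count;
        return 0;
}

/* Finding the length is a field read, not a scan for '\0'. */
size_t counted_length(const struct counted_string *s)
{
        assert(s != NULL);
        return s->length;
}
```

The catch is the one mentioned above: none of the standard string
functions understand this layout, so printing, comparing, and copying
would all need their own versions too.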

-Mary


