[Techtalk] umlauts

Shirrell shirrell at pstat.com
Tue Sep 2 13:09:34 EST 2003



We were sent a file that contains umlauts.
Their character codes are above 127, so they fall outside 7-bit ASCII.

A-umlaut is character code 196 (octal 304).
O-umlaut is character code 214 (octal 326).
U-umlaut is character code 220 (octal 334), etc.
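
These three values match the ISO 8859-1 (Latin-1) codes for the capital umlauts, although the post does not say which encoding the file uses, so treat that as an assumption. A minimal Python sketch that checks the decimal/octal pairs and names the character each byte would be under Latin-1 (the helper name `describe` is made up for illustration):

```python
import unicodedata

# Assumption: the file is ISO 8859-1 (Latin-1); the original post does not confirm this.
CODES = {"A-umlaut": 196, "O-umlaut": 214, "U-umlaut": 220}

def describe(code):
    """Return (octal digits, Unicode name) for one byte value decoded as Latin-1."""
    char = bytes([code]).decode("latin-1")
    return format(code, "o"), unicodedata.name(char)

for label, code in CODES.items():
    octal, name = describe(code)
    print(f"{label}: decimal {code}, octal {octal}, {name}")
```

If the file turned out to use a different 8-bit code page, the same byte values could stand for other characters, so the names printed here only hold under the Latin-1 assumption.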

There seems to be little consistency in the way they are
represented on our 3 different platforms: Solaris 8,
RedHat 8, and Windows XP.

Questions:
(1) Can you find such a character in vi, or using grep?
    In RedHat vi the umlauts appear as the proper German
    characters.  In Solaris vi they appear as a backslash
    followed by the three octal digits.

(2) Do the Fortran CHAR and ICHAR functions object to
    values over 127?  In our tests so far this seems to be safe.
    We have not run this on RedHat, as the program is written in
    Fortran 90 and all I have is f77 on my Linux machine.

(3) We read the file using a Fortran read in A format,
    using f90 on a Sun Blade.
    A record containing O-umlaut SOMETIMES comes in
    as a single character with code 214, as one would expect,
    and sometimes as four characters, \326.

    The A-umlaut and U-umlaut always come in as 1 character.
    Has anyone seen effects like this?
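
One way to check whether the four-character form \326 is already present in the file itself, rather than introduced by the read, is to scan the raw bytes outside of vi and Fortran. The following is only a sketch in Python: the helper names are hypothetical, and the sample record is invented for illustration, not taken from the real file.

```python
import re

# Literal backslash followed by three octal digits (values up to octal 377).
ESCAPE = re.compile(rb"\\([0-3][0-7]{2})")

def high_byte_positions(data):
    """Return (offset, byte value) for every byte above 127."""
    return [(i, b) for i, b in enumerate(data) if b > 127]

def collapse_octal_escapes(data):
    """Replace each literal backslash-octal escape with the single byte it denotes."""
    return ESCAPE.sub(lambda m: bytes([int(m.group(1), 8)]), data)

# Invented sample: one real byte 214 and one four-character escape.
record = b"K\xd6LN and K\\326LN"
print(high_byte_positions(record))
print(collapse_octal_escapes(record))
```

To run it on a real file, read the data with `open(path, "rb").read()`. Note that `collapse_octal_escapes` would also rewrite any genuine backslash-plus-digits text in the data, so it is a workaround to apply with care, not an explanation of why the read behaves this way.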

Are there any accepted conventions for these characters?

Shirrell


