[Techtalk] umlauts

Malcolm Tredinnick malcolm at commsecure.com.au
Wed Sep 3 15:47:31 EST 2003


On Wed, 2003-09-03 at 08:21, Telsa Gwynne wrote:
> I am skipping the original questions because I don't know off-hand.
> Sorry.
[...]
> I generate these characters in vim on RH (9, but I think this
> worked with RH 8 too) with ^K followed by two characters. For all
> of your umlauted examples, the first character is the letter and
> the second character is a ":" mark.
>
> So ^Ku: produces ü, ^Ka: produces ä, and ^Ko: produces ö.
>
> Hit escape, hit / to start a search, and you can use the control-K,
> char1 char2 thing to generate your umlauted character and then
> search for it. And keep hitting 'n' to find the next one.
>
> I only started using vim because it understood utf8 and my old
> preferred editor didn't. I am glad vim understands utf8, because
> I don't.
>
> Other neat thing I discovered: put your cursor over one of these
> accented characters and hit g then 8. It tells you something about
> the character in the status line. I haven't quite worked out what
> it tells you, but I am sure I shall one day :)

It prints the UTF-8 encoding of that character. So the result when your
cursor is on top of ü is "c3 bc", which are the (hex) bytes of the UTF-8
representation.
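
To see the same bytes outside vim, here is a minimal Python sketch. The
helper name utf8_hex is made up for illustration, and it only mimics the
kind of output g8 shows; it is not how vim does it internally.

```python
def utf8_hex(ch: str) -> str:
    """Return the UTF-8 bytes of ch as space-separated hex pairs."""
    return " ".join(f"{b:02x}" for b in ch.encode("utf-8"))


# The three umlauted characters from the digraph examples above.
for ch in "üäö":
    print(ch, utf8_hex(ch))
```

For ü this gives "c3 bc", matching what g8 reports.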

> > www.unicode.org for lengthier explanations.  This mail, BTW, is coded
> > in UTF-8.
>
> Mutt thinks mine is in 8859-1. How do I make it claim UTF-8, or
> is that an editor issue?

Look at the send_charset variable, which can be set in mutt's
configuration file. You can set it to a colon-separated list of character
set encodings, and mutt will choose the first one that can encode your
mail without losing information. By default it tries us-ascii, then
iso-8859-1, then utf-8, which covers all bases without resorting to UTF-8
when a simpler encoding would do.
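
In the configuration file, the default corresponds to a line like
set send_charset="us-ascii:iso-8859-1:utf-8". The "first one that fits"
rule can be sketched in Python as follows. This is only an illustration
of the selection idea, not mutt's actual code, and pick_charset is a
hypothetical name.

```python
def pick_charset(text, candidates=("us-ascii", "iso-8859-1", "utf-8")):
    """Return the first charset in candidates that can encode text losslessly."""
    for name in candidates:
        try:
            text.encode(name)  # raises UnicodeEncodeError if a character does not fit
        except UnicodeEncodeError:
            continue
        return name
    return None  # nothing in the list could represent the text


print(pick_charset("plain text"))
print(pick_charset("umlaut: \u00fc"))
print(pick_charset("euro sign: \u20ac"))
```

A message containing only ü-style characters fits in iso-8859-1, so it
would go out with that label even if the editor saved it as UTF-8. Putting
utf-8 first in the list would change that.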

Cheers,
Malcolm


