The client-server interaction should look as follows. A client, when
initiating the request, should send the Accept-Language and
Accept-Charset headers (seen by CGI programs as the
HTTP_ACCEPT_LANGUAGE and HTTP_ACCEPT_CHARSET variables) to the server.
The server then sends one of the following:
Content-Type: text/html; charset=koi8-r
Content-Type: text/html; charset=windows-1251
Content-Type: text/html; charset=x-mac-cyrillic
Content-Type: text/html; charset=cp866
Content-Type: text/html; charset=iso-8859-5
This allows "smart" browsers (for example, later versions of Netscape
Communicator) to switch fonts automatically. Read more about this on
Andrei Chernov's pages.
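The negotiation described above can be sketched in a few lines of
Python. This is only an illustration, not the code of any particular
server: the names SUPPORTED, choose_charset and content_type_header are
made up for this example, the fallback to koi8-r is an assumption, and
the "*" wildcard is simply ignored.

```python
# Charsets this hypothetical server can produce (the five listed above).
SUPPORTED = ["koi8-r", "windows-1251", "x-mac-cyrillic", "cp866", "iso-8859-5"]


def choose_charset(accept_charset, supported=SUPPORTED, default="koi8-r"):
    """Pick the supported charset with the highest q-value in Accept-Charset.

    Falls back to `default` (an assumption of this sketch) when the header
    is missing or names nothing the server supports.
    """
    if not accept_charset:
        return default
    best, best_q = None, 0.0
    for item in accept_charset.split(","):
        name, _, params = item.strip().partition(";")
        name = name.strip().lower()
        params = params.strip()
        q = 1.0  # a charset listed without a q-value counts as q=1
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0
        # Strictly greater: on a tie the charset listed first wins.
        if name in supported and q > best_q:
            best, best_q = name, q
    return best or default


def content_type_header(accept_charset):
    """Build the Content-Type line the server would send back."""
    return "Content-Type: text/html; charset=" + choose_charset(accept_charset)
```

For example, content_type_header("windows-1251, koi8-r;q=0.5") yields
the windows-1251 line from the list above, and a request without the
header gets the koi8-r line.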
Having words like "please choose an appropriate encoding" on your
pages is a really BAD idea; it drives people crazy. When I see these
words, I get really mad at those who cannot comply with simple rules
that would eliminate all this encoding mess. This is especially true
for those who run HTTP servers made by Microsoft. This company
apparently thinks it can do whatever is convenient for it, not for
others.
Here is my advice. Get the latest version of Apache and the FLY plug-in
module written by Igor Sereda (sereda@spb.runnet.ru).
The module recodes documents on the fly from one character set to
another. It chooses the target on the basis of HTTP_ACCEPT_CHARSET or,
if that is not set, it scans the "User-Agent" field and tries to figure
out what platform and OS you are on. I am archiving it on sunsite as
well.
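To make the idea concrete, here is a rough Python sketch of that kind
of recoding. It is not the FLY module's code or its actual rules: the
CODECS table, the PLATFORM_GUESSES heuristics and both function names
are assumptions made up for this illustration, and q-values in the
header are ignored for brevity.

```python
# HTTP charset names mapped to Python codec names. The mapping is spelled
# out because "x-mac-cyrillic" is not a codec name Python knows by itself.
CODECS = {
    "koi8-r": "koi8_r",
    "windows-1251": "cp1251",
    "x-mac-cyrillic": "mac_cyrillic",
    "cp866": "cp866",
    "iso-8859-5": "iso8859_5",
}

# Hypothetical User-Agent heuristics, checked in order.
PLATFORM_GUESSES = [
    ("Win", "windows-1251"),
    ("Mac", "x-mac-cyrillic"),
    ("X11", "koi8-r"),
]


def target_charset(accept_charset, user_agent, default="koi8-r"):
    """Use Accept-Charset if it names a known charset, else guess from User-Agent."""
    if accept_charset:
        for item in accept_charset.split(","):
            name = item.split(";")[0].strip().lower()
            if name in CODECS:
                return name
    for marker, charset in PLATFORM_GUESSES:
        if marker in (user_agent or ""):
            return charset
    return default


def recode(body, source, target):
    """Recode a byte string from one Cyrillic charset to another."""
    text = body.decode(CODECS[source])
    # Characters missing from the target charset become "?".
    return text.encode(CODECS[target], errors="replace")
```

A document stored in KOI8-R and requested by a browser that sends no
Accept-Charset but identifies itself as running on Windows would then
be served as recode(body, "koi8-r", "windows-1251").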
If you are using some other HTTP server, you are on your own. I would
advise you to get some real software, especially if you are using
Microsoft stuff.
There may be different opinions about recoding, but until Unicode is
universally supported, I believe we have to live with it. My own
opinion, though, is that KOI8-R should be the ONLY encoding for the
Russian web, since e-mail and news are transported in this encoding.
If your ISP cannot provide you with the correct server configuration,
you might try to use the HTML META tag. This will also tell
a smart browser to switch fonts. Here is an example:
<Meta HTTP-EQUIV="Content-Type"
Content="text/html; charset=koi8-r">
However, this solution is very, very undesirable: it can interfere
with caching proxy servers, so you lose potential corporate clients
that sit behind a firewall. If the proxy does recoding as well, the
document gets recoded but the tag stays, so you end up with a document
that is impossible to read unless you save it to the local disk,
delete the tag, and load it into the browser again. It is very unlikely
that anyone would want to do that.
This is especially painful when the charset is something other than
KOI8-R, CP1251 for example. Because the Unix version of Netscape (4.04,
at least) has a bug in its CP1251 handling, you will cut those users
off completely if you write CP1251 in the tag. Many Windows-based HTML
editors are stupid enough to write those tags, so PLEASE, PLEASE,
PLEASE, always check the code after you have created it!
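Checking by hand gets tedious, so here is a small Python sketch that
looks for such tags in a page and can strip them. It is an assumption
of mine about how one might automate the check, not an existing tool:
the function names are invented, and the regular expressions only cover
the usual form of the tag, not every possible HTML quirk.

```python
import re

# Matches a whole <meta ... http-equiv="Content-Type" ...> tag, in any
# letter case and even when the tag is split across several lines.
META_CHARSET = re.compile(
    r"""<meta\b[^>]*http-equiv\s*=\s*["']?content-type["']?[^>]*>""",
    re.IGNORECASE,
)
# Pulls the charset name out of the tag's content attribute.
CHARSET_VALUE = re.compile(r"charset\s*=\s*([\w.:-]+)", re.IGNORECASE)


def find_meta_charsets(html):
    """Return the charset named by each Content-Type META tag in html.

    A tag that carries no charset is reported as None.
    """
    found = []
    for tag in META_CHARSET.finditer(html):
        m = CHARSET_VALUE.search(tag.group(0))
        found.append(m.group(1).lower() if m else None)
    return found


def strip_meta_charsets(html):
    """Return html with every Content-Type META tag removed."""
    return META_CHARSET.sub("", html)
```

Run find_meta_charsets over the text of each page your editor
produces; an empty list means there is no such tag to worry about.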