Incorrectly designed, unreadable Russian pages

Paul Gorodyansky 'Cyrillic (Russian): instructions for Windows and Internet'

Sometimes you simply cannot read a Russian page, even though you have set up your browser correctly.
Such a page is NOT readable at all on your PC in any browser: MS Internet Explorer, Netscape, WebSurfer, etc.

It means that you have found a page whose author made a mistake during development: he explicitly named the font and/or font size to be used for displaying the Russian text.
But on your system, for example Windows 3.1, the font with the specified name contains no Russian letters (you have Russian letters in another font), so you see only gibberish symbols on such a page.
If the author does not specify a font name, the page is readable with your own Russian fonts.

The author either added this font information himself or used software that did it for him, for example, MS Front Page.
That program inserts such font information automatically, and the author did not remove it from the final version of the page.

We are talking about the "FONT" element of the HTML language.
If you look at the HTML source of such a page by selecting from the menu
View / Document Source,
you will see, for example, the following line before the Russian text:

  FONT FACE=Verdana    or   FONT FACE=Tahoma Size=1

As WWW professionals have stated, using the FACE= and SIZE= attributes is bad HTML style, especially for non-Latin1 texts (languages outside the Western European group).
See the WWW developers' Newsgroup comp.infosystems.www.authoring.html and 2 well-known articles about it:
"What's Wrong With FONT?"
and
"FONT FACE considered harmful".

The main point is that a user already works with Russian successfully using his own Russian fonts (in word processors, in browsers, etc.), while the author forces the browser to use the font he specified in the "FONT FACE=" element!

MS Front Page inserts such an HTML element automatically...

Therefore, if a Web developer wants his page to be readable by everyone (the Internet is a network for everyone, right?), then s/he can spend 15 minutes in a text editor (an HTML file is just a plain text file, similar to .TXT files) and remove the "FONT FACE=" elements surrounding the Russian text.
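
For a site with many pages, the same clean-up can be scripted instead of done by hand. Below is a minimal Python sketch, not a full HTML parser: the function name strip_font_face is hypothetical, and the pattern assumes the FACE= value is either quoted or a single unquoted word. It removes only the FACE= part and leaves the rest of the tag alone.

```python
import re

# A FACE attribute inside a FONT start tag; the value may be quoted or bare.
FACE_ATTR = re.compile(
    r'''(<font\b[^>]*?)\s+face\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)''',
    re.IGNORECASE,
)


def strip_font_face(html: str) -> str:
    """Remove FACE=... from FONT start tags, keeping any other attributes."""
    return FACE_ATTR.sub(r"\1", html)


print(strip_font_face('<FONT FACE="Arial" SIZE=2>text</FONT>'))
```

A tag that had only FACE= becomes a bare FONT tag, which changes nothing on screen, so the reader's own Russian font is used again.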

Beginners in WWW development complain that they need it, because they want a specific look and feel for their page.
But this can be done in a safe way! Instead of the outdated "FONT FACE=", they can use a newer mechanism called CSS (Cascading Style Sheets).
In CSS it is possible to declare a style of font (for instance, an Arial-like family of fonts) instead of a hard-coded font name and/or font size.
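
To make the idea concrete, here is a small Python sketch that builds a CSS rule asking for a generic family (serif, sans-serif, monospace) instead of a named font. The mapping table and the name css_rule are assumptions made for this illustration only; the table is not exhaustive, and the fallback for unknown names is an arbitrary choice.

```python
# Illustrative mapping from hard-coded font names to CSS generic families.
# This table is an assumption for the sketch, not an exhaustive list.
GENERIC_FAMILY = {
    "arial": "sans-serif",
    "verdana": "sans-serif",
    "tahoma": "sans-serif",
    "ms sans serif": "sans-serif",
    "times new roman": "serif",
    "courier new": "monospace",
}


def css_rule(selector: str, face: str) -> str:
    """Build a CSS rule that names a generic family instead of a specific font."""
    # Unknown names fall back to "serif" (an arbitrary choice for this sketch).
    family = GENERIC_FAMILY.get(face.strip().lower(), "serif")
    return f"{selector} {{ font-family: {family}; }}"


print(css_rule("P", "Arial"))  # P { font-family: sans-serif; }
```

With a rule like this the browser chooses the font itself from the reader's own fonts of that style.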

Below I list the cases in which the HTML tag "FONT" makes a Russian page unreadable.

1. "FONT FACE=".

The simplest example: the author surrounded his Russian text with
"FONT FACE=MS Sans Serif"   or   "FONT FACE=Verdana",
and in his Russian version of MS Windows everything is okay.
But a reader works under English Windows, where the listed fonts do not contain Russian letters, even if the reader has installed the "MS Multilanguage Support" package.
(This package offers only 3 fonts: "Arial (Cyrillic)", "Times New Roman (Cyrillic)", and "Courier New (Cyrillic)".)
The Russian text on such a page will be unreadable.

Another example, not so obvious:
The author surrounded the Russian text with a "FONT FACE=Arial" tag.
Some readers may not have Russian letters in this font.

In the following situations a Russian page will be unreadable if its HTML code contains, for example,
  <FONT FACE="Arial">
surrounding the Russian text (again, without this HTML tag the page is readable, because the reader gets a chance to use his own Russian font when the author does not insist on the author's font):

  1. UNIX or Macintosh
    The user does not have Russian letters in the font listed by the author in "FONT FACE=...", for example, "Arial".
    The user has Russian letters in another font and successfully browses correctly designed Russian Web sites.

    Some Web developers think that if they also include the name of a common UNIX Cyrillic font, the page will be readable on both UNIX and Windows. That is not correct. The Russian Newsgroups Relcom.* and FIDO7* are full of complaints about unreadable pages, and here is one of them from a UNIX user (I translated it into English):

    > From - Tue Mar 10 10:45:00 1998
    > Newsgroups: fido7.ru.internet.www
    > Subject: Re: FONT FACE
    > 
    >> 2All ...
    >> ...
    >> Will you be able to read my Russian text if I add the names
    >> of some common UNIX Russian fonts, for example:
    >> "FONT FACE = Arial,  Helvetica, Geneva"?
    >> 
    >> I think, at least one of Russian fonts of the set of
    >>  Arial,Helvetica,Geneva is present in UNIX by default...
    >
    > Ok, I have on my UNIX machine Russian Helvetica and English Arial.
    > What do you think I will be able to see on your page that has
    >            font face="arial,helvetica"? :-)
    > Correct guess - only gibberish symbols.
        
  2. Netscape 3 under Windows NT 4.0.
    There are Russian letters in "Arial", but NT 4.0 fonts are UNICODE fonts, and Netscape 3 does not work with them correctly: it will not find the Russian letters in "Arial".
    The page will be unreadable.
    Netscape 3 users usually work with non-UNICODE Russian fonts such as "ER Bukinist 1251", and if a page does not have "FONT FACE=...", they can read Russian normally.

    Also, users who live outside Russia can run into another problem, with Russian in forms, if they decide to fix Netscape (it is described in the chapter of my article mentioned above).

  3. English Windows 3.1 or 3.11
    There is a font "Arial", but it does not contain Russian letters!
    There is no such thing as Script-Cyrillic in Windows 3.x fonts.

    Users of Windows 3.1/3.11 often work with non-Microsoft fonts, for example, "ER Bukinist 1251", and if a page does not have "FONT FACE=...", they can read Russian normally.
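
The three situations above share one mechanism, which can be sketched in a few lines of Python. This is only a simplified model and not the algorithm of any real browser: it assumes the browser takes the first name from FACE= that is installed and never checks whether that font has usable Russian letters. The names pick_font and can_read_russian and the sample font table are hypothetical.

```python
def pick_font(face_list, installed, default_font):
    """Return the font used for a text run.

    face_list    -- value of FACE=, e.g. "arial,helvetica", or None if absent
    installed    -- dict: lowercase font name -> True if it has Russian letters
    default_font -- the font the reader configured for Russian in the browser
    """
    if face_list:
        for name in face_list.split(","):
            name = name.strip().lower()
            if name in installed:
                return name  # first installed name wins, Cyrillic or not
    return default_font


def can_read_russian(face_list, installed, default_font):
    """True if the font that gets chosen contains Russian letters."""
    return installed[pick_font(face_list, installed, default_font)]


# The UNIX reader from the quoted message: English Arial, Russian Helvetica.
unix_fonts = {"arial": False, "helvetica": True}
print(can_read_russian("arial,helvetica", unix_fonts, "helvetica"))  # False
print(can_read_russian(None, unix_fonts, "helvetica"))               # True
```

In this model the page becomes readable as soon as FACE= is absent, because the choice falls back to the font the reader has already set up for Russian.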



That is, such developers narrow their customer base a lot and lose many potential readers/clients.
It is especially funny to see a commercial page of this kind: it should invite more customers instead of limiting their number, right?
See, if you can :), an example of such a page: "ITAR TASS information agency". At the time of writing, the problem on their page is still present...

You see, the Web is for everyone, so a developer should never make assumptions about a user's computer and fonts (for example, whether a given font contains Cyrillic).

NOTE. In Netscape 4 there is an option that lets a user read such a Russian page (but again, people who are not computer professionals may never find it; they just see an unreadable page and leave):
  Edit/Preferences/Appearance/Fonts
and there you need to select the option
  "use my default fonts, overriding document-specified font"

But this setting also disables all styles (CSS) present on good, correctly designed Web pages, which is not right.



I have seen even stranger things when the author uses a program that helps him build a Web page (e.g. MS Front Page).
The program automatically inserts the HTML tags "FONT FACE=Arial", and then the author creates a new page by converting the text from Russian CP-1251(win) to Russian KOI8-R, that is, he prepares a KOI8-R variant of the same page.
If the author does not remove these tags, the new page will be unreadable, say, in Netscape 3 even under Russian versions of Windows, where the font "Arial" does contain Russian letters (nothing will help).
That is because this font belongs to the Russian CP-1251(win) encoding, while the text of the new page contains KOI8-R letters!
None of the standard MS Windows fonts contain Russian letters in the KOI8-R encoding.
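
The effect of showing KOI8-R text through a CP-1251 font can be reproduced with Python's standard codecs. This is only an illustration of the byte mismatch between the two encodings, not of what any particular browser does internally; the function name misread_koi8_as_cp1251 is hypothetical.

```python
def misread_koi8_as_cp1251(text: str) -> str:
    """Encode text as KOI8-R, then read the same bytes as if they were CP-1251."""
    return text.encode("koi8_r").decode("cp1251")


original = "Привет"
garbled = misread_koi8_as_cp1251(original)
print(garbled)              # рТЙЧЕФ
print(garbled == original)  # False

# Reading the bytes with the right encoding gives the text back.
print(original.encode("koi8_r").decode("koi8_r") == original)  # True
```

The letters are still Cyrillic, but they are the wrong ones, because the same byte values stand for different letters in the two encodings.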



2. "FONT SIZE=".

Sometimes the author did not give a font name for the Russian text but did include a font size. As I mentioned, this is considered bad HTML style, and such a page is often unreadable. On the author's PC everything is fine, because with his font the page is readable with, for example,
FONT SIZE=1 or FONT SIZE=-2.
But on a reader's PC the reader's font may NOT be able to provide that size, and the Russian text is not readable.
The point is the same: a developer must remember that every user has his own set of fonts, and it is a mistake to make assumptions about a potential user's environment.
The articles mentioned above suggest using the HTML elements SMALL and BIG instead of SIZE=. They let the author control the size of text on the screen.
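
As an illustration of that suggestion, here is a rough Python sketch that rewrites simple FONT SIZE= runs as SMALL or BIG. It is deliberately limited: it handles only FONT tags whose sole attribute is a one-digit SIZE and that are not nested, and it assumes the usual default base size of 3. The name font_size_to_small_big is hypothetical.

```python
import re

# <FONT SIZE=n>...</FONT> where SIZE is the only attribute (no nesting).
SIZED_FONT = re.compile(
    r'<font\s+size\s*=\s*"?([+-]?\d)"?\s*>(.*?)</font\s*>',
    re.IGNORECASE | re.DOTALL,
)


def _replacement(match):
    size, body = match.group(1), match.group(2)
    if size.startswith("-") or (size[0].isdigit() and int(size) < 3):
        tag = "SMALL"
    elif size.startswith("+") or int(size) > 3:
        tag = "BIG"
    else:
        return body  # SIZE=3 is assumed to be the default size
    return f"<{tag}>{body}</{tag}>"


def font_size_to_small_big(html: str) -> str:
    """Replace simple FONT SIZE= runs with SMALL or BIG elements."""
    return SIZED_FONT.sub(_replacement, html)


print(font_size_to_small_big("<FONT SIZE=1>текст</FONT>"))
```

Tags that also carry FACE= or other attributes are left untouched by this sketch and would need to be handled separately.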

