TC Notes: A Message from the Editor #2

A routine check of the Netscape home page on Thursday, August 1, yielded an unexpected, though not unanticipated, discovery. The latest beta version of the Netscape Navigator browser now supports the display of non-Roman characters on Mac, Windows, and Unix platforms. In other words, you can now read TC articles with Hebrew, Greek, and Syriac words in their original scripts!

Netscape has once again supplemented the HTML standard, this time by adding a face attribute to its non-standard font tag. Those of us who were sorely disappointed when the original version of the font tag included only the ability to enlarge, reduce, and specify the vertical position of characters now have reason to sing the praises of the developers at Netscape. While font sizes, superscripts, and subscripts are indeed useful in an electronic journal that deals with textual criticism, these features pale in comparison with the ability to display the languages of the biblical texts in their original scripts.

For those of you who are eager to use this tag (hopefully in submissions to TC!), this is how it works:
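
A minimal sketch of the markup, assuming the SPIonic font named in the description that follows is installed on the reader's machine:

```html
<!-- Text inside the font element is drawn with the named typeface.
     With SPIonic installed, the Latin keystrokes "Alfa" appear as Greek letters. -->
<font face="SPIonic">Alfa</font>
```

If the named font is not installed, the browser falls back to its default font and the reader sees the Latin letters "Alfa" instead.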

The face attribute of the font tag names the font used to display the text that follows. All text up to the closing `</font>` tag is displayed in that font. In the example, the single word "Alfa" is displayed in Greek characters using the SPIonic font.

While the ability to display non-Roman characters is a tremendous advance, particularly for those of us interested in reading texts in their original languages rather than in transliteration, Netscape's implementation does not solve all of the problems faced by textual scholars. One remaining problem is the lack of a standard among the fonts for a given language. Greek fonts, for example, are plentiful, but where Greek abounds, diversity abounds even more. There is no standard Greek character map. In other words, while the character "b" probably represents the Greek letter beta in most fonts, how do you represent a smooth breathing mark combined with a circumflex accent? There are almost as many answers to this question as there are Greek fonts in existence.

Unicode, a sixteen-bit encoding scheme, offers a possible solution to this problem. Because it assigns a standard code point to each Greek character, accent, and breathing mark, all Unicode fonts will share the same character map. Unicode-compliant browsers will be able to read the HTML file, including any non-Roman characters, and display everything in the proper script, without the use of the font tag. One Unicode browser, Accent Mosaic, is currently on the market in beta form, although at present it works only on Windows machines.

To work around the problem of multiple character maps, TC will use the public domain fonts available from the Scholars Press FTP site for all its articles. If you want to read the articles in the original scripts, you will have to download and install the proper fonts.
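
As a rough illustration of the Unicode approach described above, the fragment below writes two Greek characters as numeric character references, so that no font-specific character map is involved. It assumes a browser that interprets such references as Unicode code points and has access to a font containing these characters. The choice of alpha as the base letter is only for illustration.

```html
<!-- &#946;  = U+03B2, Greek small letter beta -->
<!-- &#7942; = U+1F06, Greek small letter alpha with smooth breathing (psili)
     and circumflex (perispomeni) -->
<p>&#946; &#7942;</p>
```

Because the code points are fixed by the standard, the same markup should display the same characters whichever Unicode font the reader has installed.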

Another shortcoming of the current Netscape implementation is that right-to-left languages like Hebrew do not display properly across line breaks unless special precautions are taken. For example, if a two-word Hebrew phrase is broken over two lines, the second word (the one on the left) will appear on the first line, while the first word will be on the second line. This problem can be overcome in one of two ways. For short phrases of only a few words, non-breaking spaces (`&nbsp;`) can be used to join words together. For example, the first three words of Genesis would be encoded as follows:
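
A sketch of that encoding, not the exact markup: the font name SPTiberian and the placeholders word1, word2, and word3 are assumptions for illustration, since the actual keystrokes depend on the character map of the Hebrew font in use. Because the browser lays the characters out from left to right, the third word is entered first so that it appears on the left.

```html
<!-- Hypothetical placeholders: word1..word3 stand for the font-specific
     keystrokes of the first..third words in reading order.
     SPTiberian is an assumed font name.
     &nbsp; keeps the browser from breaking the line between words. -->
<font face="SPTiberian">word3&nbsp;word2&nbsp;word1</font>
```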

The non-breaking spaces force all three words to appear on the same line. This solution works fine for short phrases, but longer selections of Hebrew words, particularly multi-line quotations, must be handled differently. The way to display long quotations is to use a pre (preformatted) tag, as follows:
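
A sketch of the approach, again with an assumed font name (SPTiberian) and hypothetical placeholder words. Inside pre the browser keeps the author's own line breaks, so each line of the quotation is entered as a separate line, with its words in left-to-right display order.

```html
<!-- Each source line is one displayed line; the browser does not rewrap it.
     l1w1 = line 1, word 1 in reading order, and so on (placeholders only). -->
<pre><font face="SPTiberian">l1w3 l1w2 l1w1
l2w3 l2w2 l2w1</font></pre>
```

The author, not the browser, decides where each line ends, which is why the words can never be split across lines in the wrong order.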

One obvious drawback to using either non-breaking spaces or pre tags is that long gaps may be left at the ends of lines. The pre tag is particularly problematic, since all the words between the opening and closing tags appear on the screen as though they were a separate paragraph. Here again Unicode provides at least a partial solution: right-to-left language processing is built into the standard, so any Unicode-compliant browser will display Hebrew words properly. Unfortunately, the Unicode standard for Hebrew is not yet complete.

Unicode will solve some of the problems that remain with the display of non-Roman characters on the Web, but not all of them. SGML (Standard Generalized Markup Language) and Java offer other possible solutions, but there are no current implementations of either that do everything that textual scholars would like them to do. The time is coming, however, when these and other display problems will be solved.

While awaiting Utopia, readers of TC can enjoy the benefits of Netscape's latest contribution to the Web community. For more than two years I myself have been impatiently waiting for the ability to display Hebrew and Greek (and Syriac, and Coptic) on the Web. I may not be in Eden yet, but I feel like I've at least reached the land of Nod, somewhere in the immediate vicinity.

Revision History

Date	Description
2010-11-12	Unicode URL changed from `http://www.stonehand.com/unicode.html` to `http://unicode.org/`; link to the Scholars Press FTP site (`ftp://scholar.cc.emory.edu/pub/fonts`) redirected to `http://www.sbl-site.org/educational/biblicalfonts.aspx`.


Message 2 - August 5, 1996
