Reading and Creating Khmer Unicode Web Pages

VIEWING KHMER UNICODE WEB PAGES

Khmer Unicode text on Web pages looks very nice (much better than legacy Khmer text, because it is fully ligatured and positioned) when viewed in a browser with the proper software. For practical and/or legal reasons, however, client machines viewing such pages currently need the following:

(1) Windows XP operating system (the latest USP10.DLL [version 1.471.4030.0] reportedly works best with this). Lin Chear's work will probably make Khmer Unicode rendering available on Linux.

(2) KhmerOS.ttf (or another font from Danh Hong, Om Mony, or others) needs to be installed: Control Panel -> Fonts.

(3) Microsoft Office 2003 and Internet Explorer. Some Office applications (especially Microsoft Publisher) have been optimised to accept Khmer, and Office is the chief means of getting an up-to-date version of USP10.DLL; another is to sign a non-disclosure agreement and join the VOLT user community.

(4) USP10.DLL needs to be placed either in the same directory as the application that will be using it (in which case it overrides any other USP10.DLL on the system) or in C:/WINDOWS/system32/. The second option is quite difficult to do (see http://www.bauhahnm.clara.net/Khmer/usp10.html), but it is useful for all applications that could use this Uniscribe DLL.

(5) If you see Khmer consonants but no subscripts or ligatures, the likely problem is that a recent version of USP10.DLL is not available to your browser (or other application). The table below shows what to expect.

Raw Data (compare against columns on right) | Appearance with Khmer OS font but without USP10.DLL | Appearance with Khmer OS font & USP10.DLL
ថ្ងៃ​សៅរ៍, កុមារីខ្មែរ, កម្ពុជា | (not shown) | (not shown)
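To see why a missing or old USP10.DLL leaves you with bare consonants, it can help to look at what is actually stored in the raw data. The sketch below is only an illustration in Python (not something the setup above requires); it assumes the first word of the sample row is stored as U+1790 U+17D2 U+1784 U+17C3, and the helper name describe is my own. The subscript is not a separate glyph in the data: it is the invisible sign U+17D2 (COENG) followed by an ordinary consonant, and it is the shaping engine that turns that pair into a subscript form.

```python
import unicodedata

def describe(text):
    """Return a (code point, Unicode name) pair for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]

# First word of the sample row, written with escapes so that the
# encoding of this source file does not matter (assumed code points).
word = "\u1790\u17D2\u1784\u17C3"

for code, name in describe(word):
    print(code, name)
```

Any character whose name contains COENG marks the consonant after it as a subscript; without shaping support, the two are simply drawn one after the other.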

CREATING KHMER UNICODE WEB PAGES

(1) Khmer Unicode characters may be entered into Web pages as NCR (ampersand # x hex_number ; without the spaces), as UCR (ampersand # decimal_number ; without the spaces), or directly as UTF-8. In HTML terms both of the first two forms are numeric character references; one is hexadecimal and the other decimal. So the first Khmer consonant, ក (U+1780), could be entered as &#x1780; or as &#6016;. (One could use the Calculator in Windows in Scientific view to convert between hex and decimal.) The hex numbers for Khmer are available at http://www.unicode.org/charts/PDF/U1780.pdf and http://www.unicode.org/charts/PDF/U19E0.pdf. I would suggest creating your Web page in the freeware BabelPad at the moment, as it can convert the UTF-16 transformation format of Unicode common on Windows to the UTF-8 transformation format of Unicode common on the Web when you save the document. My Adobe Pagemill 3.0 does not handle the NCRs well.
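If you would rather not convert by hand, the hex and decimal reference forms and the UTF-16 to UTF-8 conversion can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names to_hex_ncr and to_decimal_ncr are made up for this example, and it is not how BabelPad does the job internally.

```python
def to_hex_ncr(text):
    """Write each character as a hexadecimal reference: &#x....;"""
    return "".join(f"&#x{ord(ch):X};" for ch in text)

def to_decimal_ncr(text):
    """Write each character as a decimal reference: &#....;"""
    return "".join(f"&#{ord(ch)};" for ch in text)

ka = "\u1780"  # the first Khmer consonant

print(to_hex_ncr(ka))      # &#x1780;
print(to_decimal_ncr(ka))  # &#6016;

# UTF-16 bytes (as commonly saved on Windows) converted to UTF-8 for the Web
utf16_bytes = ka.encode("utf-16")
utf8_bytes = utf16_bytes.decode("utf-16").encode("utf-8")
print(utf8_bytes)          # b'\xe1\x9e\x80'
```

The reference forms are plain ASCII, so they survive any page encoding; raw UTF-8 is more compact and easier to read in the source, but the page must then be declared and served as UTF-8.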

(2) To key in Khmer Unicode text (UTF-16 on Windows) you can use Keyman (http://www.tavultesoft.com/keyman/) and one of the layouts available on this site (KhmerKeymanKeyboard.zip).