UTF-8 SAMPLER
¥ · £ · € · $ · ¢ · ₡ · ₢ · ₣ · ₤ · ₥ · ₦ · ₧ · ₨ · ₩ · ₪ · ₫ · ₭ · ₮ · ₯
Frank da Cruz
The Kermit Project - Columbia University <index.html>
New York City
hide@address.com <mailto:hide@address.com>
/Last update:/ Sun Jun 12 20:24:10 2005
------------------------------------------------------------------------
[ PEACE <http://www.columbia.edu/~fdc/pace/> ] [ Poetry <#poetry> ] [ I
Can Eat Glass <#glass> ] [ The Quick Brown Fox <#quickbrownfox> ] [ HTML
Features <#html> ] [ Credits, Tools, Commentary <#credits> ]
UTF-8 is an ASCII-preserving encoding method for Unicode <unicode.html>
(ISO 10646), the Universal Character Set (UCS). The UCS encodes most of
the world's writing systems in a single character set, allowing you to
mix languages and scripts within a document without needing any tricks
for switching character sets. This web page is encoded directly in UTF-8.
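As a rough illustration of what "ASCII-preserving" means, here is a minimal Python sketch. Python is chosen only for illustration (this page itself involves no code), and the helper name utf8_hex is made up. It prints the UTF-8 bytes of a few characters: ASCII characters keep their single-byte values, while every other character becomes a sequence of two to four bytes, each 0x80 or above.

```python
# Illustrative sketch only; utf8_hex is a hypothetical helper name.

def utf8_hex(ch):
    """Return the UTF-8 encoding of ch as space-separated hex bytes."""
    return " ".join(f"{b:02X}" for b in ch.encode("utf-8"))

# One ASCII letter, one Latin-1 symbol, the euro sign, and a rune.
for ch in ["A", "\u00a3", "\u20ac", "\u16a0"]:
    print(f"U+{ord(ch):04X} -> {utf8_hex(ch)}")

# Pure ASCII text is byte-for-byte identical in ASCII and UTF-8.
text = "The quick brown fox"
print(text.encode("ascii") == text.encode("utf-8"))
```

Because no byte of a multi-byte sequence falls in the ASCII range, software that only looks for ASCII delimiters (newlines, commas, slashes) keeps working on UTF-8 text.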
As shown HERE <glass.html>, Columbia University's Kermit 95 <k95.html>
terminal emulation software can display UTF-8 plain text in Windows 95,
98, ME, NT, XP, or 2000 when using a monospace Unicode font like Andale
Mono WT J <http://www.monotype.com> or Everson Mono Terminal
<http://www.evertype.com/emono/>, or the less fully populated Courier New,
Lucida Console, or Andale Mono. C-Kermit <ckermit.html> can handle it
too, if you have a Unicode display
<http://www.cl.cam.ac.uk/~mgk25/unicode.html>. As many languages as are
representable in your font can be seen on the screen at the same time.
This, however, is a Web page. Some Web browsers can handle UTF-8, some
can't. And those that can might not have a sufficiently populated font
to work with (some browsers might pick glyphs dynamically from multiple
fonts; Netscape 6 seems to do this). CLICK HERE
<http://www.alanwood.net/unicode/fonts.html> for a survey of Unicode
fonts for Windows.
The subtitle above shows currency symbols of many lands. If they don't
appear as blobs, we're off to a good start!
------------------------------------------------------------------------
Poetry
From the Anglo-Saxon Rune Poem <http://www.ragweedforge.com/poems.html>
(Rune version):
ᚠᛇᚻ᛫ᛒᛦᚦ᛫ᚠᚱᚩᚠᚢᚱ᛫ᚠᛁᚱᚪ᛫ᚷᛖᚻᚹᛦᛚᚳᚢᛗ
ᛋᚳᛖᚪᛚ᛫ᚦᛖᚪᚻ᛫ᛗᚪᚾᚾᚪ᛫ᚷᛖᚻᚹᛦᛚᚳ᛫ᛗᛁᚳᛚᚢᚾ᛫ᚻᛦᛏ᛫ᛞᚫᛚᚪᚾ
ᚷᛁᚠ᛫ᚻᛖ᛫ᚹᛁᛚᛖ᛫ᚠᚩᚱ᛫ᛞᚱᛁᚻᛏᚾᛖ᛫ᛞᚩᛗᛖᛋ᛫ᚻᛚᛇᛏᚪᚾ᛬
From Laȝamon's /Brut/ <http://mesl.itd.umich.edu/b/brut/> (/The
Chronicles of England/, Middle English, West Midlands):
An preost wes on leoden, Laȝamon was ihoten
He wes Leovenaðes sone -- liðe him be Drihten.
He wonede at Ernleȝe at æðelen are chirechen,
Uppen Sevarne staþe, sel þar him þuhte,
Onfest Radestone, þer he bock radde.
(The third letter in the author's name is Yogh, missing from many fonts;
CLICK HERE <st-erkenwald.html> for another Middle English sample with
some explanation of letters and encoding).
From the Tagelied of *Wolfram von Eschenbach*
<http://gutenberg.spiegel.de/autoren/eschenba.htm> (Middle High German):
Sîne klâwen durh die wolken sint geslagen,
er stîget ûf mit grôzer kraft,
ich sih in grâwen tägelîch als er wil tagen,
den tac, der im geselleschaft
erwenden wil, dem werden man,
den ich mit sorgen în verliez.
ich bringe in hinnen, ob ich kan.
sîn vil manegiu tugent michz leisten hiez.
Some lines of *Odysseus Elytis*
<http://users.hol.gr/~artemis/odysseas_elytis.htm> (Greek):
Τη γλώσσα μου έδωσαν ελληνική
το σπίτι φτωχικό στις αμμουδιές του Ομήρου.
Μονάχη έγνοια η γλώσσα μου στις αμμουδιές του Ομήρου.
από το Άξιον Εστί
του Οδυσσέα Ελύτη
The first stanza of *Pushkin*
<http://www.ocf.berkeley.edu/%7Eleong/Russkaya%20Literatura/Aleksandr%20Sergeevich%20Pushkin.htm>'s
Bronze Horseman (Russian):
На берегу пустынных волн
Стоял он, дум великих полн,
И вдаль глядел. Пред ним широко
Река неслася; бедный чёлн
По ней стремился одиноко.
По мшистым, топким берегам
Чернели избы здесь и там,
Приют убогого чухонца;
И лес, неведомый лучам
В тумане спрятанного солнца,
Кругом шумел.
*Šota Rustaveli*
<http://www.compling.hu-berlin.de/~johannes/mxedruli/>'s VepÌxis
TÌ£qÌaosani, ̣︡Th, The Knight in the Tiger's Skin (Georgian):
ვეპხის ტყაოსანი შოთა რუსთაველი
ღმერთსი შემვედრე, ნუთუ კვლა დამხსნას სოფლისა შრომასა, ცეცხლს, წყალსა
და მიწასა, ჰაერთა თანა მრომასა; მომცნეს ფრთენი და აღვფრინდე,
მივჰხვდე მას ჩემსა ნდომასა, დღისით და ღამით ვჰხედვიდე მზისა ელვათა
კრთომაასა.
Tamil poetry of Cupiramaniya Paarathiyar, சுப்ரமணிய பாரதியார் (1882-1921):
யாமறிந்த மொழிகளிலே தமிழ்மொழி போல் இனிதாவது எங்கும் காணோம்,
பாமரராய் விலங்குகளாய், உலகனைத்தும் இகழ்ச்சிசொலப் பான்மை கெட்டு,
நாமமது தமிழரெனக் கொண்டு இங்கு வாழ்ந்திடுதல் நன்றோ? சொல்லீர்!
------------------------------------------------------------------------
I Can Eat Glass
And from the sublime to the ridiculous, here is a certain phrase¹
<#notes> in an assortment of languages:
1. *Sanskrit*: काचं शक्नोम्यत्तुम् । नोपहिनस्ति माम् ॥
2. *Sanskrit* /(standard transcription):/ kācaṃ śaknomyattum;
nopahinasti mām.
3. *Classical Greek*: ὕαλον φαγεῖν δύναμαι· τοῦτο οὔ με βλάπτει.
4. *Greek*: Μπορώ να φάω σπασμένα γυαλιά χωρίς να πάθω τίποτα.
*Etruscan*: (NEEDED)
5. *Latin*: Vitrum edere possum; mihi non nocet.
6. *Old French*: Je puis mangier del voirre. Ne me nuit.
7. *French*: Je peux manger du verre, ça ne me fait pas de mal.
8. *Provençal / Occitan*: Pòdi manjar de veire, me nafrariá pas.
9. *Québécois*: J'peux manger d'la vitre, ça m'fa pas mal.
10. *Walloon*: Dji pou magnî do vêre, çoula m' freut nén må.
*Champenois*: (NEEDED)
*Lorrain*: (NEEDED)
11. *Picard*: Ch'peux mingi du verre, cha m'foé mie n'ma.
*Corsican*: (NEEDED)
12. *Kreyòl Ayisyen*: Mwen kap manje vè, li pa blese'm.
13. *Basque*: Kristala jan dezaket, ez dit minik ematen.
14. *Catalan*: Puc menjar vidre que no em fa mal.
15. *Spanish*: Puedo comer vidrio, no me hace daño.
16. *Aragones*: Puedo minchar beire, no me'n fa mal .
17. *Galician*: Eu podo xantar cristais e non cortarme.
18. *Portuguese*: Posso comer vidro, não me faz mal.
19. *Brazilian Portuguese* (7 <#notes>): Posso comer vidro, não me
machuca.
20. *Caboverdiano*: M' podê cumê vidru, ca ta maguâ-m'.
21. *Papiamentu*: Ami por kome glas anto e no ta hasimi daño.
22. *Italian*: Posso mangiare il vetro e non mi fa male.
23. *Milanese*: Sôn bôn de magnà el véder, el me fa minga mal.
24. *Roman*: Me posso magna' er vetro, e nun me fa male.
25. *Napoletano*: M' pozz magna' o'vetr, e nun m' fa mal.
26. *Sicilian*: Puotsu mangiari u vitru, nun mi fa mali.
27. *Venetian*: Mi posso magnare el vetro, no'l me fa mae.
28. *Zeneise* /(Genovese):/ Pòsso mangiâ o veddro e o no me fà mâ.
*Rheto-Romance / Romansch*: (NEEDED)
*Romany / Tsigane*: (NEEDED)
29. *Romanian*: Pot să mănânc sticlă și ea nu mă rănește.
30. *Esperanto*: Mi povas manĝi vitron, ĝi ne damaĝas min.
*Pictish*: (NEEDED)
*Breton*: (NEEDED)
31. *Cornish*: Mý a yl dybry gwéder hag éf ny wra ow ankenya.
32. *Welsh*: Dw i'n gallu bwyta gwydr, 'dyw e ddim yn gwneud dolur i mi.
33. *Manx Gaelic*: Foddym gee glonney agh cha jean eh gortaghey mee.
34. *Old Irish* /(Ogham):/ ᚛᚛ᚉᚑᚅᚔᚉᚉᚔᚋ ᚔᚈᚔ ᚍᚂᚐᚅᚑ ᚅᚔᚋᚌᚓᚅᚐ᚜
35. *Old Irish* /(Latin):/ Con·iccim ithi nglano. Ním·géna.
36. *Irish*: Is féidir liom gloinne a ithe. Ní dhéanann sí dochar ar
bith dom.
37. *Scottish Gaelic*: S urrainn dhomh gloinne ithe; cha ghoirtich i mi.
38. *Anglo-Saxon* /(Runes):/ ᛁᚳ᛫ᛗᚨᚷ᛫ᚷᛚᚨᛋ᛫ᛖᚩᛏᚪᚾ᛫ᚩᚾᛞ᛫ᚻᛁᛏ᛫ᚾᛖ᛫ᚻᛖᚪᚱᛗᛁᚪᚧ᛫ᛗᛖ᛬
39. *Anglo-Saxon* /(Latin):/ Ic mæg glæs eotan ond hit ne hearmiað me.
40. *Middle English*: Ich canne glas eten and hit hirtiþ me nouȝt.
41. *English*: I can eat glass and it doesn't hurt me.
42. *English* /(IPA):/ [aɪ kæn iːt glɑːs ænd ɪt dɐz nɒt hɜːt miː]
(Received Pronunciation)
43. *English* /(Braille):/ ⠊⠀⠉⠁⠝⠀⠑⠁⠞⠀⠛⠇⠁⠎⠎⠀⠁⠝⠙⠀⠊⠞⠀⠙⠕⠑⠎⠝⠞⠀⠓⠥⠗⠞⠀⠍⠑
44. *Lalland Scots / Doric*: Ah can eat gless, it disnae hurt us.
*Glaswegian*: (NEEDED)
45. *Gothic* (4 <#notes>): 𐌼𐌰𐌲 𐌲𐌻𐌴𐍃 𐌹̈𐍄𐌰𐌽, 𐌽𐌹 𐌼𐌹𐍃 𐍅𐌿 𐌽𐌳𐌰𐌽 𐌱𐍂𐌹𐌲𐌲𐌹𐌸.
46. *Old Norse* /(Runes):/ ᛖᚴ ᚷᛖᛏ ᛖᛏᛁ ᚧ ᚷᛚᛖᚱ ᛘᚾ ᚦᛖᛋᛋ ᚨᚧ ᚡᛖ ᚱᚧᚨ ᛋᚨᚱ
47. *Old Norse* /(Latin):/ Ek get etið gler án þess að verða sár.
48. *Norsk / Norwegian (Nynorsk):* Eg kan eta glas utan å skada meg.
49. *Norsk / Norwegian (Bokmål):* Jeg kan spise glass uten å skade meg.
*Føroyskt / Faroese*: (NEEDED)
50. *Íslenska / Icelandic*: Ég get etið gler án þess að meiða mig.
51. *Svenska / Swedish*: Jag kan äta glas utan att skada mig.
52. *Dansk / Danish*: Jeg kan spise glas, det gør ikke ondt på mig.
53. *Soenderjysk*: Æ ka æe glass uhen at det go mæ naue.
54. *Frysk / Frisian*: Ik kin glês ite, it docht me net sear.
55. *Nederlands / Dutch*: Ik kan glas eten, het doet mij geen kwaad.
56. *Kirchröadsj/Bôchesserplat*: Iech ken glaas èèse, mer 't deet
miech jing pieng.
57. *Afrikaans*: Ek kan glas eet, maar dit doen my nie skade nie.
58. *Lëtzebuergescht / Luxemburgish*: Ech kan Glas iessen, daat deet
mir nët wei.
59. *Deutsch / German*: Ich kann Glas essen, ohne mir weh zu tun.
60. *Ruhrdeutsch*: Ich kann Glas verkasematuckeln, ohne dattet mich
wat jucken tut.
61. *Lausitzer Mundart* ("Lusatian"): Ich koann Gloos assn und doas
dudd merr ni wii.
62. *Odenwälderisch*: Iech konn glaasch voschbachteln ohne dass es mir
ebbs daun doun dud.
63. *Sächsisch / Saxon*: 'sch kann Glos essn, ohne dass'sch mer wehtue.
64. *Pfälzisch*: Isch konn Glass fresse ohne dasses mer ebbes ausmache
dud.
65. *Schwäbisch / Swabian*: I kå Glas frässa, ond des macht mr nix!
66. *Bayrisch / Bavarian*: I koh Glos esa, und es duard ma ned wei.
67. *Allemannisch*: I kaun Gloos essen, es tuat ma ned weh.
68. *Schwyzerdütsch*: Ich chan Glaas ässe, das tuet mir nöd weeh.
69. *Hungarian*: Meg tudom enni az üveget, nem lesz tőle bajom.
70. *Suomi / Finnish*: Voin syödä lasia, se ei vahingoita minua.
71. *Sami (Northern)*: Sáhtán borrat lása, dat ii leat bávččas.
72. *Erzian*: Мон ярсан суликадо, ды зыян эйстэнзэ а ули.
*Karelian*: (NEEDED)
*Vepsian*: (NEEDED)
*Votian*: (NEEDED)
*Livonian*: (NEEDED)
73. *Estonian*: Ma võin klaasi süüa, see ei tee mulle midagi.
74. *Latvian*: Es varu ēst stiklu, tas man nekaitē.
75. *Lithuanian*: Aš galiu valgyti stiklą ir jis manęs nežeidžia
*Old Prussian*: (NEEDED)
*Sorbian* (Wendish): (NEEDED)
76. *Czech*: Mohu jíst sklo, neublíží mi.
77. *Slovak*: Môžem jesť sklo. Nezraní ma.
78. *Polska / Polish*: Mogę jeść szkło i mi nie szkodzi.
79. *Slovenian:* Lahko jem steklo, ne da bi mi škodovalo.
80. *Croatian*: Ja mogu jesti staklo i ne boli me.
81. *Serbian* /(Latin):/ Mogu jesti staklo a da mi ne škodi.
82. *Serbian* /(Cyrillic):/ Могу јести стакло а да ми не шкоди.
83. *Macedonian:* Можам да јадам стакло, а не ме штета.
84. *Russian*: Я могу есть стекло, оно мне не вредит.
85. *Belarusian* /(Cyrillic):/ Я магу есці шкло, яно мне не шкодзіць.
86. *Belarusian* /(Lacinka):/ Ja mahu jeści škło, jano mne ne škodzić.
87. *Ukrainian*: Я можу їсти скло, й воно мені не пошкодить.
88. *Bulgarian*: Мога да ям стъкло, то не ми вреди.
89. *Georgian*: მინას ვჭამ და არა მტკივა.
90. *Armenian*: Կրնամ ապակի ուտել և ինծի անհանգիստ չըներ։
91. *Albanian*: Unë mund të ha qelq dhe nuk më gjen gjë.
92. *Turkish*: Cam yiyebilirim, bana zararı dokunmaz.
93. *Turkish* /(Ottoman):/ جام ييه بلورم بڭا ضررى طوقونمز
94. *Bangla / Bengali*: আমি কাঁচ খেতে পারি, তাতে আমার কোনো ক্ষতি হয় না।
95. *Marathi*: मी काच खाऊ शकतो, मला ते दुखत नाही.
96. *Hindi*: मैं काँच खा सकता हूँ, मुझे उस से कोई पीडा नहीं होती.
97. *Tamil*: நான் கண்ணாடி சாப்பிடுவேன், அதனால் எனக்கு ஒரு கேடும் வராது.
98. *Urdu*(2) <#notes>: میں کانچ کھا سکتا ہوں اور مجھے تکلیف نہیں ہوتی ۔
99. *Pashto*(2) <#notes>: زه شيشه خوړلې شم، هغه ما نه خوږوي
100. *Farsi / Persian*: .من می توانم بدونِ احساس درد شيشه بخورم
101. *Arabic*(2) <#notes>: أنا قادر على أكل الزجاج و هذا لا يؤلمني.
*Aramaic*: (NEEDED)
102. *Hebrew*(2) <#notes>: אני יכול לאכול זכוכית וזה לא מזיק לי.
103. *Yiddish*(2) <#notes>: איך קען עסן גלאָז און עס טוט מיר נישט װײ.
*Judeo-Arabic*: (NEEDED)
*Ladino*: (NEEDED)
*Gǝʼǝz*: (NEEDED)
*Amharic*: (NEEDED)
104. *Twi*: Metumi awe tumpan, ɜnyɜ me hwee.
105. *Hausa* (/Latin/): Inā iya taunar gilāshi kuma in gamā lāfiyā.
106. *Hausa* (/Ajami/) (2) <#notes>: إِنا إِىَ تَونَر غِلَاشِ كُمَ إِن غَمَا لَافِىَا
107. *Yoruba*(3) <#notes>: Mo lè je̩ dígí, kò ní pa mí lára.
108. *(Ki)Swahili*: Naweza kula bilauri na sikunyui.
109. *Malay*: Saya boleh makan kaca dan ia tidak mencederakan saya.
110. *Tagalog*: Kaya kong kumain nang bubog at hindi ako masaktan.
111. *Chamorro*: Siña yo' chumocho krestat, ti ha na'lalamen yo'.
112. *Javanese*: Aku isa mangan beling tanpa lara.
*Burmese*: (NEEDED)
113. *Vietnamese (quốc ngữ)*: Tôi có thể ăn thủy tinh mà không hại gì.
114. *Vietnamese (nôm)* (4 <#notes>): 些 𣎏 世 咹 水 晶 𦓡 空 𣎏 害 咦
*Khmer*: (NEEDED)
*Lao*: (NEEDED)
115. *Thai*: ฉันกินกระจกได้ แต่มันไม่ทำให้ฉันเจ็บ
116. *Mongolian* /(Cyrillic):/ Би шил идэй чадна, надад хортой биш
117. *Mongolian* /(Classic) (5 <#notes>):/ ᠪᠢ ᠰᠢᠯᠢ ᠢᠳᠡᠶᠦ ᠴᠢᠳᠠᠨᠠ ᠂ ᠨᠠᠳᠤᠷ
ᠬᠣᠤᠷᠠᠳᠠᠢ ᠪᠢᠰᠢ
*Dzongkha*: (NEEDED)
*Nepali*: (NEEDED)
118. *Tibetan*: ཤེལ་སྒོ་ཟ་ནས་ང་ན་གི་མ་རེད།
119. *Chinese*: 我能吞下玻璃而不伤身体。
120. *Chinese* (Traditional): 我能吞下玻璃而不傷身體。
121. *Taiwanese*(6) <#notes>: Góa ē-tàng chia̍h po-lê, mā bē tio̍h-siong.
122. *Japanese*: 私はガラスを食べられます。それは私を傷つけません。
123. *Korean*: 나는 유리를 먹을 수 있어요. 그래도 아프지 않아요
124. *Bislama*: Mi save kakae glas, hemi no save katem mi.
125. *Hawaiian*: Hiki iaʻu ke ʻai i ke aniani; ʻaʻole nō lā au e ʻeha.
126. *Marquesan*: E koʻana e kai i te karahi, mea ʻā, ʻaʻe hauhau.
127. *Chinook Jargon:* Naika məkmək kakshət labutay, pi weyk ukuk
munk-sik nay.
128. *Navajo*: Tsésǫʼ yishą́ągo bííníshghah dóó doo shił neezgai da.
*Cherokee* /(and Cree, Ojibwa, Inuktitut, and other Native
American languages):/ (NEEDED)
*Garifuna*: (NEEDED)
*Gullah*: (NEEDED)
129. *Lojban*: mi kakne le nu citka le blaci .iku'i le se go'i na xrani mi
130. *Nórdicg*: Ljœr ye caudran créneþ ý jor cẃran.
/(Additions, corrections, completions,/ /gratefully accepted/
<mailto:hide@address.com>/.)/
For testing purposes, some of these are repeated in a *monospace
font* . . .
1. Euro Symbol: €.
2. Greek: Μπορώ να φάω σπασμένα γυαλιά χωρίς να πάθω τίποτα.
3. Íslenska / Icelandic: Ég get etið gler án þess að meiða mig.
4. Polish: Mogę jeść szkło, i mi nie szkodzi.
5. Romanian: Pot să mănânc sticlă și ea nu mă rănește.
6. Ukrainian: Я можу їсти скло, й воно мені не пошкодить.
7. Armenian: Կրնամ ապակի ուտել և ինծի անհանգիստ չըներ։
8. Georgian: მინას ვჭამ და არა მტკივა.
9. Hindi: मैं काँच खा सकता हूँ, मुझे उस से कोई पीडा नहीं होती.
10. Hebrew(2) <#notes>: אני יכול לאכול זכוכית וזה לא מזיק לי.
11. Yiddish(2) <#notes>: איך קען עסן גלאָז און עס טוט מיר נישט װײ.
12. Arabic(2) <#notes>: أنا قادر على أكل الزجاج و هذا لا يؤلمني.
13. Japanese: 私はガラスを食べられます。それは私を傷つけません。
14. Thai: ฉันกินกระจกได้ แต่มันไม่ทำให้ฉันเจ็บ
*Notes:*
1. The "I can eat glass" phrase and initial translations (about 30 of
them) were borrowed from Ethan Mollick's I Can Eat Glass
<http://hcs.harvard.edu/~igp/glass.html> page (which disappeared
on or about June 2004) and converted to UTF-8. Since Ethan's
original page is gone, I should mention that his purpose was to offer
travelers a phrase they could use in any country that would
command a certain kind of respect, or at least get attention. See
Credits <#credits> for the many additional contributions since
then. When submitting new entries, note that the word "hurt" (if you have a
choice) is used in the sense of "cause harm", "do damage", or
"bother", rather than "inflict pain" or "make sad". In this vein
Otto Stolz comments (as do others further down; personally I think
it's better for the purpose of this page to have extra entries
and/or to show a greater repertoire of characters than it is to
enforce a strict interpretation of the word "hurt"!):
This is the meaning I have translated to the Swabian dialect.
However, I just have noticed that most of the German variants
translate the "inflict pain" meaning. The German example
should rather read:
"Ich kann Glas essen ohne mir zu schaden."
(The comma fell victim to the 1996 orthographic reform, cf.
http://www.ids-mannheim.de/reform/e3-1.html#P76.)
You may wish to contact the contributors of the following
translations to correct them:
* Lëtzebuergescht / Luxemburgish: Ech kan Glas iessen,
daat deet mir nët wei.
* Lausitzer Mundart ("Lusatian"): Ich koann Gloos assn und
doas dudd merr ni wii.
* Sächsisch / Saxon: 'sch kann Glos essn, ohne dass'sch
mer wehtue.
* Bayrisch / Bavarian: I koh Glos esa, und es duard ma ned
wei.
* Allemannisch: I kaun Gloos essen, es tuat ma ned weh.
* Schwyzerdütsch: Ich chan Glaas ässe, das tuet mir nöd weeh.
In contrast, I deem the following translations *alright*:
* Ruhrdeutsch: Ich kann Glas verkasematuckeln, ohne dattet
mich wat jucken tut.
* Pfälzisch: Isch konn Glass fresse ohne dasses mer ebbes
ausmache dud.
* Schwäbisch / Swabian: I kå Glas frässa, ond des macht mr
nix!
(However, you could remove the commas, on account of
http://www.ids-mannheim.de/reform/e3-1.html#P76 and
http://www.ids-mannheim.de/reform/e3-1.html#P72, respectively.)
I guess, also these examples translate the /wrong/ sense of
"hurt", though I do not know these languages well enough to
assert them definitely:
* Nederlands / Dutch: Ik kan glas eten; het doet mij geen
pijn. /(This one has been changed)/
* Kirchröadsj/Bôchesserplat: Iech ken glaas èèse, mer 't
deet miech jing pieng.
In the Romanic languages, the variations on "fa male" (it) are
probably wrong, whilst the variations on "hace daño" (es) and
"damaĝas" (Esperanto) are probably correct; "nocet" (la) is
definitely right.
The northern Germanic variants of "skada" are probably right,
as are the Slavic variants of "škodi/шкоди" (se); however the
Slavic variants of " boli" (hv) are probably wrong, as
"bolena" means "pain/ache", IIRC.
The numbering of the samples is arbitrary, done only to keep track
of how many there are, and can change any time a new entry is
added. The arrangement is also arbitrary but with some attempt to
group related examples together. Note: All languages not listed
are wanted, not just the ones that say (NEEDED).
2. Correct right-to-left display of these languages depends on the
capabilities of your browser. The period should appear on the
left. In the monospace Yiddish example, the Yiddish digraphs
should occupy one character cell.
3. Yoruba: The third word is Latin letter small 'j' followed by small
'e' with U+0329, Combining Vertical Line Below. This displays
correctly only if your Unicode font includes the U+0329 glyph and
your browser supports combining diacritical marks. The Indic
examples also include combining sequences.
4. Includes Unicode 3.1 (or later) characters beyond Plane 0.
5. The Classic Mongolian example should be vertical, top-to-bottom
and left-to-right. But such display is almost impossible. Also no
font yet exists which provides the proper ligatures and positional
variants for the characters of this script, which works somewhat
like Arabic.
6. Taiwanese is also known as Holo or Hoklo, and is related to
Southern Min dialects such as Amoy. Contributed by Henry H.
Tan-Tenn, who comments, "The above is the romanized version, in a
script current among Taiwanese Christians since the mid-19th
century. It was invented by British missionaries and saw use in
hundreds of published works, mostly of a religious nature. Most
Taiwanese did not know Chinese characters then, or at least not
well enough to read. More to the point, though, a written standard
using Chinese characters has never developed, so a significant
minority of words are represented with different candidate
characters, depending on one's personal preference or etymological
theory. In this sentence, for example, "-tàng", "chia̍h", "mā" and
"bē" are problematic using Chinese characters. "Góa" (I/me) and
"po-lê" (glass) are as written in other Sinitic languages (e.g.
Mandarin, Hakka)."
7. Wagner Amaral of Pinese & Amaral Associados notes that the
Brazilian Portuguese sentence for "I can eat glass" should be
identical to the Portuguese one, as the word "machuca" means
"inflict pain", or rather "injuries". The words "faz mal" would
more correctly translate as "cause harm".
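Notes 3 and 4 can be checked mechanically. The following is only a sketch in Python using the standard unicodedata module; the function names describe and is_beyond_plane_0 are invented for this illustration. It lists the code points of the Yoruba word from note 3 (showing that the mark is a separate combining character) and tests whether a character lies outside Plane 0, using Gothic letter manna, the first letter of the Gothic sample.

```python
import unicodedata

def describe(text):
    """Return one 'U+XXXX NAME (ccc=N)' string per code point in text."""
    return [
        "U+%04X %s (ccc=%d)"
        % (ord(c), unicodedata.name(c, "<unnamed>"), unicodedata.combining(c))
        for c in text
    ]

def is_beyond_plane_0(ch):
    """True if ch lies outside the Basic Multilingual Plane (Plane 0)."""
    return ord(ch) > 0xFFFF

yoruba_word = "je\u0329"       # j, e, COMBINING VERTICAL LINE BELOW
gothic_manna = "\U0001033C"    # GOTHIC LETTER MANNA

for line in describe(yoruba_word):
    print(line)

print(is_beyond_plane_0(gothic_manna))    # True
print(len(gothic_manna.encode("utf-8")))  # 4 bytes in UTF-8
```

A nonzero combining class (ccc) marks a character that attaches to the preceding base letter; whether it is drawn in the right place is still up to the font and browser, as note 3 says.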
------------------------------------------------------------------------
The Quick Brown Fox
The "I can eat glass" sentences do not necessarily show off the
orthography of each language to best advantage. In many alphabetic
written languages it is possible to include all (or most) letters (or
"special" characters) in a single (often nonsense) /pangram/. These were
traditionally used in typewriter instruction; now they are useful for
stress-testing computer fonts and keyboard input methods. Here are a few
examples (SEND MORE):
1. *English:* The quick brown fox jumps over the lazy dog.
2. *Irish:* "An ḃfuil do ċroí ag bualaḋ ó ḟaitíos an ġrá a ṁeall lena
ṗóg éada ó ṡlí do leasa ṫú?" "D'ḟuascail Íosa Úrṁac na hÓiġe
Beannaiṫe pór Éava agus Áḋaiṁ."
3. *Dutch:* Pa's wijze lynx bezag vroom het fikse aquaduct.
4. *German:* Falsches Üben von Xylophonmusik quält jeden größeren
Zwerg. (1)
5. *German:* Im finſteren Jagdſchloß am offenen Felsquellwaſſer
patzte der affig-flatterhafte kauzig-höf‌liche Bäcker über ſeinem
verſifften kniffligen C-Xylophon. (2)
6. *Swedish:* Flygande bäckasiner söka strax hwila på mjuka tuvor.
7. *Czech:* Příliš žluťoučký kůň úpěl ďábelské kódy.
8. *Slovak:* Starý kôň na hŕbe kníh žuje tíško povädnuté ruže, na
stĺpe sa ďateľ učí kvákať novú ódu o živote.
9. *Russian:* В чащах юга жил-был цитрус? Да, но фальшивый экземпляр!
ёъ.
10. *Bulgarian:* Жълтата дюля беше щастлива, че пухът, който цъфна,
замръзна като гьон.
11. *Sami (Northern):* Vuol Ruoŧa geđggiid leat máŋga luosa ja čuovžža.
12. *Hungarian:* Árvíztűrő tükörfúrógép.
13. *Spanish:* El pingüino Wenceslao hizo kilómetros bajo exhaustiva
lluvia y frío, añoraba a su querido cachorro.
14. *Portuguese:* O próximo vôo à noite sobre o Atlântico, põe
freqüentemente o único médico. (3)
15. *French:* Les naïfs ægithales hâtifs pondant à Noël où il gèle
sont sûrs d'être déçus et de voir leurs drôles d'œufs abîmés.
16. *Esperanto:* Eĥoŝanĝo ĉiuĵaŭde.
17. *Hebrew:* זה כיף סתם לשמוע איך תנצח קרפד עץ טוב בגן.
18. *Japanese* (Hiragana):
いろはにほへど　ちりぬるを
わがよたれぞ　つねならむ
うゐのおくやま　けふこえて
あさきゆめみじ　ゑひもせず (4)
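Whether a sentence really is a pangram is easy to check by program. Here is a minimal sketch in Python; the function name missing_letters and the alphabet string are assumptions made for this example, and each language would need its own alphabet. It reports which letters of a given alphabet never occur in a sentence.

```python
def missing_letters(sentence, alphabet):
    """Return the sorted letters of alphabet that never occur in sentence.

    Comparison is case-insensitive. Precomposed and combining forms of
    the same letter are NOT unified here; normalize first if that matters.
    """
    seen = set(sentence.casefold())
    return sorted(set(alphabet.casefold()) - seen)

ENGLISH = "abcdefghijklmnopqrstuvwxyz"

# A true pangram leaves nothing missing.
print(missing_letters("The quick brown fox jumps over the lazy dog.", ENGLISH))

# The glass sentence, by contrast, omits a good part of the alphabet.
print(missing_letters("I can eat glass and it doesn't hurt me.", ENGLISH))
```

An empty list means every letter of the alphabet is present, which is all a pangram promises; it says nothing about accented or "special" characters unless they are included in the alphabet passed in.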
*Notes:*
1. Other phrases commonly used in Germany include: "Ein wackerer
Bayer vertilgt ja bequem zwo Pfund Kalbshaxe" and, more recently,
"Franz jagt im komplett verwahrlosten Taxi quer durch Bayern", but
both lack umlauts and eszett. Previously, going for the shortest
sentence that has all the umlauts and special characters, I had
"Grüße aus Bärenhöfe (und Óechtringen)!" Acute accents are not
used in native German words, so I was surprised to discover
"Óechtringen" in the Deutsche Bundespost Postleitzahlenbuch
<http://www.columbia.edu/~fdc/misc/oechtringen.jpg> (Vorsicht!
2.8MB JPG image). It's a small village in eastern Lower Saxony.
The "oe" in this case turns out to be the Lower Saxon "lengthening
e" (Dehnungs-e), which makes the previous vowel long (used in a
number of Lower Saxon place names such as Soest and Itzehoe), not
the "e" that indicates umlaut of the preceding vowel. Many thanks
to the Óechtringen-Namenschreibungsuntersuchungskomitee (Alex
Bochannek, Manfred Erren, Asmus Freytag, Christoph Päper, plus
Werner Lemberg who serves as the
Óechtringen-Namenschreibungsuntersuchungskomiteerechtschreibungsprüfer)
for their relentless pursuit of the facts in this case.
Conclusion: the accent almost certainly does not belong on this
(or any other native German) word, but neither can it be dismissed
as dirt on the page. To add to the mystery, it has been reported
that other copies of the same edition of the PLZB do not show the
accent!
2. From Karl Pentzlin (Kochel am See, Bavaria, Germany): "This German
phrase is suited for display by a Fraktur (broken letter) font. It
contains: all common three-letter ligatures: ffi ffl fft and all
two-letter ligatures required by the Duden for Fraktur
typesetting: ch ck ff fi fl ft ll ſch ſi ſſ ſt tz (all in a manner
such that they are not part of a three-letter ligature), one example of
f-l where German typesetting rules prohibit ligating (marked by a
ZWNJ), and all German letters a...z, ä,ö,ü,ß, ſ [long s] (all in a
manner such that they are not part of a two-letter Fraktur
ligature)." Otto Stolz notes that "'Schloß' is now spelled
'Schloss', in contrast to 'größer' (example 4) which has kept its
'ß'. Fraktur has been banned from general use, in 1942, and long-s
(ſ) has ceased to be used with Antiqua (Roman) even earlier (the
latest Antiqua-ſ I have seen is from 1913, but then I am no
expert, so there may well be a later instance)." Later Otto
confirms the latter theory, "Now I've run across a book “Deutsche
Rechtschreibung” (edited by Lutz Mackensen) from 1954 (my reprint
is from 1956) that has kept the Antiqua-ſ in its dictionary part
(but neither in the preface nor in the appendix)."
3. Diaeresis is not used in Iberian Portuguese.
4. From Yurio Miyazawa: "This poetry contains all the sounds in the
Japanese language and used to be the first thing for children to
learn in their Japanese class. The Hiragana version is
particularly neat because it covers every character in the
phonetic Hiragana character set." Yurio also sent the Kanji version:
色は匂へど 散りぬるを
我が世誰ぞ 常ならむ
有為の奥山 今日越えて
浅き夢見じ 酔ひもせず
*Accented Cyrillic:*
/(This section contributed by Vladimir Marinov.)/
In Bulgarian it is desirable, customary, or in some cases required to
write accents over vowels. Unfortunately, no computer character sets
contain the full repertoire of accented Cyrillic letters. With Unicode,
however, it is possible to combine any Cyrillic letter with any
combining accent. The appearance of the result depends on the font and
the rendering engine. Here are two examples.
1. Той Ð²Ð¸Ð´Ñ Ð±ÑлаÑа коÑÐ°Ì Ð¿Ð¾ главаÑа Ð¸Ì Ð¸ коÌÑа на ÑамоÑо иÌ, и ÑеÌÑе да иÌ
ÑеÑеÌ: "ÐаÑаÌÑа Ð¿Ð¾Ì Ð¿Ð°ÌÑи Ð¾Ñ Ð¿Ð°ÌÑаÑа, не Ñа паÑиÌ!", но Ñи помиÌÑли:
"Хей, помиÑÐ»Ð¸Ì Ñи! ÐÌ Ð¸Ì Ñека, Ð°Ì Ðµ ÑкоÑила в Ñази Ñека, коÑÑо ÑеÑе да
ÑеÑеÌ, а не ÑеÌÑе."
2. По пъ́тя пъту́ват къ́рди и югославя́ни.
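To make the point about combining concrete, here is a small sketch in Python (the helper name accent is invented for this example). It attaches U+0301, Combining Acute Accent, to a Cyrillic and to a Latin letter and then asks for the composed (NFC) form. The Latin pair collapses to the single precomposed character á; the Cyrillic pair stays as two code points because Unicode has no precomposed accented Cyrillic а, which is exactly why the combining mechanism is needed here.

```python
import unicodedata

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT

def accent(letter):
    """Return letter followed by a combining acute accent."""
    return letter + ACUTE

cyrillic = unicodedata.normalize("NFC", accent("\u0430"))  # CYRILLIC SMALL LETTER A
latin = unicodedata.normalize("NFC", accent("a"))          # LATIN SMALL LETTER A

print(len(cyrillic))  # 2: base letter plus combining mark
print(len(latin))     # 1: precomposed U+00E1
```

How the two-code-point sequence is drawn (accent centered over the vowel, or floating beside it) is still up to the font and rendering engine.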
------------------------------------------------------------------------
HTML Features
Here is the Russian alphabet (uppercase only) coded in three different
ways, which should look identical:
1. АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ /(Literal UTF-8)/
2. АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ /(Decimal numeric character
reference)/
3. АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ /(Hexadecimal numeric character
reference)/
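In a plain-text rendering all three lines come out the same, so the difference is only visible in the HTML source. As a sketch of what the second and third codings look like, the Python below builds decimal and hexadecimal numeric character references for a string and decodes them again with the standard html module. The function names decimal_refs and hex_refs are made up for this example.

```python
import html

def decimal_refs(text):
    """Encode each character as a decimal reference such as &#1040;"""
    return "".join("&#%d;" % ord(c) for c in text)

def hex_refs(text):
    """Encode each character as a hexadecimal reference such as &#x410;"""
    return "".join("&#x%X;" % ord(c) for c in text)

sample = "\u0410\u0411\u0412"  # the first three letters of the alphabet above

print(decimal_refs(sample))  # &#1040;&#1041;&#1042;
print(hex_refs(sample))      # &#x410;&#x411;&#x412;

# A browser (or html.unescape) turns both forms back into the same text.
print(html.unescape(decimal_refs(sample)) == html.unescape(hex_refs(sample)) == sample)
```

The numeric forms are pure ASCII, so they survive transports and editors that would mangle literal UTF-8, at the cost of being unreadable in the source.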
In another test, we use HTML language tags to distinguish Bulgarian,
Russian, and Serbian
<http://www.tiro.com/transfer/Serbian_Rendering.pdf>, which have
different italic forms for lowercase б, г, д, п, and/or т:
*Bulgarian*: [ бгдпт ] [ /бгдпт/ ] /Мога да ям стъкло и не
ме боли./
*Russian*: [ бгдпт ] [ /бгдпт/ ] /Я могу есть стекло, это мне
не вредит./
*Serbian*: [ бгдпт ] [ /бгдпт/ ] /Могу јести стакло а да ми
не шкоди./
------------------------------------------------------------------------
Credits, Tools, and Commentary
*Credits:*
The "I can eat glass" phrase and the initial collection of
translations: Ethan Mollick
<http://hcs.harvard.edu/~igp/glass.html>. Transcription / conversion
to UTF-8: Frank da Cruz. *Albanian:* Sindi Keesan. *Afrikaans:*
Johan Fourie, Kevin Poalses. *Anglo Saxon:* Frank da Cruz. *Arabic:*
Najib Tounsi. *Armenian:* Vaçe Kundakçı. *Belarusian:* Alexey
Chernyak. *Bengali:* Somnath Purkayastha, Deepayan Sarkar.
*Bislama:* Dan McGarry. *Braille:* Frank da Cruz. *Bulgarian:* Sindi
Keesan, Guentcho Skordev, Vladimir Marinov. *Cabo Verde Creole:*
Cláudio Alexandre Duarte. *Chinese:* Jack Soo, Wong Pui Lam.
*Chinook Jargon:* David Robertson. *Cornish:* Chris Stephens.
*Croatian:* Marjan Baće. *Czech:* Stanislav Pecha, Radovan Garabík.
*Dutch:* Peter Gotink, Pim Blokland, Rob Daniel, Rob de Wit.
*Erzian:* Jack Rueter. *Esperanto:* Franko Luin, Radovan Garabík.
*Estonian:* Meelis Roos. *Farsi/Persian:* Payam Elahi. *Finnish:*
Sampsa Toivanen. *French:* Luc Carissimo, Anne Colin du Terrail,
Sean M. Burke. *Galician:* Laura Probaos. *Georgian:* Giorgi
Lebanidze. *German:* Christoph Päper, Otto Stolz, Karl Pentzlin,
Frank da Cruz. *Gothic:* Aurélien Coudurier. *Greek:* Ariel Glenn,
Constantine Stathopoulos, Siva Nataraja. *Hebrew:* Jonathan Rosenne,
Tal Barnea. *Hausa:* Malami Buba, Tom Gewecke. *Hawaiian:* na
Hauʻoli Motta, Anela de Rego, Kaliko Trapp. *Hindi:* Shirish Kalele.
*Hungarian:* András Rácz, Mark Holczhammer. *Icelandic:* Andrés
Magnússon. *International Phonetic Alphabet (IPA):* Siva Nataraja /
Vincent Ramos. *Irish:* Michael Everson, Marion Gunn, James Kass,
Curtis Clark. *Italian:* Thomas De Bellis. *Japanese:* Makoto
Takahashi, Yurio Miyazawa. *Kirchröadsj:* Roger Stoffers. *Kreyòl:*
Sean M. Burke. *Korean:* Jungshik Shin. *Lëtzebuergescht:* Stefaan
Eeckels. *Lithuanian:* Gediminas Grigas. *Lojban:* Edward Cherlin.
*Lusatian:* Ronald Schaffhirt. *Macedonian:* Sindi Keesan. *Malay:*
Zarina Mustapha. *Manx:* Éanna Ó Brádaigh. *Marathi:* Shirish
Kalele. *Marquesan:* Kaliko Trapp. *Middle English:* Frank da Cruz.
*Milanese:* Marco Cimarosti. *Mongolian:* Tom Gewecke. *Napoletano:*
Diego Quintano. *Navajo:* Tom Gewecke. *Nórdicg*
<http://www.langmaker.com/db/mdl_nordicg.htm>: Yẃlyan Rott.
*Norwegian:* Herman Ranes. *Odenwälderisch:* Alexander Heß. *Old
Irish:* Michael Everson. *Old Norse:* Andrés Magnússon.
*Papiamentu:* Bianca and Denise Zanardi. *Pashto:* N.R. Liwal.
*Pfälzisch:* Dr. Johannes Sander. *Picard:* Philippe Mennecier.
*Polish:* Juliusz Chroboczek. *Portuguese:* "Cláudio" Alexandre
Duarte, Bianca and Denise Zanardi, Pedro Palhoto Matos, Wagner
Amaral. *Québécois:* Laurent Detillieux. *Roman:* Pierpaolo
Bernardi. *Romanian:* Juliusz Chroboczek, Ionel Mugurel.
*Ruhrdeutsch:* "Timwi". *Russian:* Alexey Chernyak, Serge
Nesterovitch. *Sami:* Anne Colin du Terrail, Luc Carissimo.
*Sanskrit:* Siva Nataraja / Vincent Ramos. *Sächsisch:* André
Müller. *Schwäbisch:* Otto Stolz. *Scots:* Jonathan Riddell.
*Serbian:* Sindi Keesan, Ranko Narancic, Boris Daljevic, Szilvia
Csorba. *Slovak:* G. Adam Stanislav, Radovan Garabík. *Slovenian:*
Albert Kolar. *Spanish:* Aleida Muñoz
<http://www.panix.com/~aleida>, Laura Probaos. *Swahili:* Ronald
Schaffhirt. *Swedish:* Christian Rose, Bengt Larsson. *Taiwanese:*
Henry H. Tan-Tenn. *Tagalog:* Jim Soliven. *Tamil:* Vasee
Vaseeharan. *Tibetan:* D. Germano, Tom Gewecke. *Thai:* Alan Wood's
wife. *Turkish:* Vaçe Kundakçı, Tom Gewecke, Merlign Olnon.
*Ukrainian:* Michael Zajac. *Urdu:* Mustafa Ali. *Vietnamese*
<http://nomfoundation.org/>: Dixon Au, [James] Đỗ Bá Phước 杜 伯 福.
*Walloon:* Pablo Saratxaga. *Welsh:* Geiriadur Prifysgol Cymru
(Andrew). *Yiddish:* Mark David. *Zeneise:* Angelo Pavese.
*Tools Used to Create This Web Page:*
The UTF8-aware Kermit 95 <k95.html> terminal emulator on Windows, to
a Unix host with the EMACS <http://www.gnu.org/directory/emacs.html>
text editor. Kermit 95 displays UTF-8 and also allows keyboard entry
of arbitrary Unicode BMP characters as 4 hex digits, as shown HERE
<glass.html>. Hex codes for Unicode values can be found in The
Unicode Standard <http://www.unicode.org/unicode/uni2book/u2.html>
(recommended) and the online code charts
<http://www.unicode.org/charts/>. When submissions arrive by email
encoded in some other character set (Latin-1, Latin-2, KOI, various
PC code pages, JEUC, etc.), I use the TRANSLATE command of C-Kermit
<ckermit.html> on the Unix host (where I read my mail <safe.html>)
to convert the character set to UTF-8 (I could also use Kermit 95
for this; it has the same TRANSLATE command). That's it -- no "Web
authoring" tools, no locales, no "smart" anything. It's just plain
text, nothing more. By the way, there's nothing special about EMACS
-- any text editor will do, provided it allows entry of arbitrary
8-bit bytes as text, including the 0x80-0x9F "C1" range. EMACS 21.1
actually supports UTF-8; earlier versions don't know about it and
display the octal codes; either way is OK for this purpose.
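For readers without Kermit, the two steps described above (converting a
submission from another character set to UTF-8, and entering a BMP
character by its 4 hex digits) can be approximated in a few lines of
Python. This is only a sketch, not the method used to build this page,
which was C-Kermit's TRANSLATE command. The function names to_utf8 and
bmp_char are made up for illustration, and the encoding names are
Python's codec names, not Kermit's.

```python
import string


def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode raw bytes from source_encoding and re-encode them as UTF-8."""
    # A wrong guess at source_encoding may raise UnicodeDecodeError or,
    # for single-byte sets such as Latin-1, silently give wrong letters.
    return raw.decode(source_encoding).encode("utf-8")


def bmp_char(hex4: str) -> str:
    """Return the BMP character named by exactly four hex digits."""
    if len(hex4) != 4 or any(c not in string.hexdigits for c in hex4):
        raise ValueError("expected exactly 4 hex digits, e.g. '00DF'")
    return chr(int(hex4, 16))


if __name__ == "__main__":
    latin1_bytes = b"M\xfcller"              # "Müller" as Latin-1
    print(to_utf8(latin1_bytes, "latin-1"))  # b'M\xc3\xbcller'
    print(bmp_char("00DF"))                  # ß
```

The source character set still has to be known or guessed from the mail
headers. Python codec names such as "latin-1", "iso8859_2", "koi8_r",
"cp437", and "euc_jp" roughly correspond to the sets mentioned above.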
*Commentary:*
Date: Wed, 27 Feb 2002 13:21:59 +0100
From: "Bruno DEDOMINICIS" <hide@address.com>
Subject: Je peux manger du verre, cela ne me fait pas mal.
I just found out your website and it makes me feel like proposing an
interpretation of the choice of this peculiar phrase.
Glass is transparent and can hurt as everyone knows. The relation
between people and civilisations is sometimes effusional and more
often rude. The concept of breaking frontiers through globalization,
in a way, is also an attempt to deny any difference. Isn't
"transparency" the flag of modernity? Nothing should be hidden any
more, authority is obsolete, and the new powers are supposed to
reign through loving and smiling and no more through coercion...
Eating glass without pain sounds like a very nice metaphor of this
attempt. That is, frontiers should become glass transparent first,
and be denied by incorporating them. On the reverse, it shows that
through globalization, frontiers undergo a process of displacement,
that is, when they are not any more speakable, they become repressed
from the speech and are therefore incorporated and might become
painful symptoms, as for example what happens when one tries to eat
glass.
The frontiers that used to separate bodies one from another tend to
divide bodies from within and make them suffer.... The chosen phrase
then appears as a denial of the symptom that might result from the
destitution of traditional frontiers.
Best,
Bruno De Dominicis, Paris, France
*Other Unicode pages onsite:*
* Peace in All Languages <http://www.columbia.edu/~fdc/pace/>
* Frank's Compulsive Guide to Postal Addresses <postal.html>
(especially the Index <postal.html#index>)
* Representing Middle English on the Web with UTF-8 <st-erkenwald.html>
* The Kermit Bibliography <biblio.html> (in UTF-8)
* Interchange of Non-English Computer Text <accents.html> (UTF-8
math and box-drawing)
* Unicode Table <utf8-t1.html> (in UTF-8)
*Unicode samplers offsite:*
* Michael Everson's Bibliography of Typography and Scripts
<http://www.evertype.com/scriptbib.html>
* Sample Unicode Test Pages and Script Links
<http://home.att.net/~jameskass/scriptlinks.htm>
* I don't know, I only work here <http://crism.maden.org/dunno.html>
* Anyone can be provincial!
<http://www.trigeminal.com/samples/provincial.html>
* Transcriptions of "Unicode"
<http://www.macchiato.com/unicode/Unicode_transcriptions.html>
* Example Unicode Usage for Business Applications
<http://www.i18nguy.com/unicode-example.html>
* UTF-8 and Unicode FAQ for Unix/Linux
<http://www.cl.cam.ac.uk/~mgk25/unicode.html#apps>
*Unicode fonts:*
* Unicode Fonts for Windows Computers
<http://www.alanwood.net/unicode/fonts.html> (Alan Wood)
* Unicode Fonts and Tools for X11
<http://www.cl.cam.ac.uk/~mgk25/ucs-fonts.html> (Markus Kuhn)
* Everson Mono <http://www.evertype.com/emono/> (Michael Everson)
* Agfa Monotype <http://www.monotype.com>
[ Kermit 95 <k95.html> ] [ K95 Screen Shots <glass.html> ] [ C-Kermit
<ckermit.html> ] [ Kermit Home <index.html> ] [ Display Problems?
<http://www.unicode.org/help/display_problems.html> ] [ The Unicode
Consortium <http://www.unicode.org> ]
------------------------------------------------------------------------
UTF-8 Sampler / The Kermit Project <index.html> / Columbia University
<http://www.columbia.edu> / hide@address.com
<mailto:hide@address.com>