Introduction to the Japanese Character Set

Presenter(s): Yoshinori Matsunobu, Senior Consultant, MySQL AB
Where: MySQL Conference and Expo 2007
When: April 26, 2007
Topics: MySQL, Internationalization, Unicode
Download: PDF

Description

A presentation on the challenges developers face when implementing support for the Japanese language, and on the problems with Unicode support for Japanese character sets in MySQL.

Japanese character sets

  • JIS X 0208
  • Vendor Defined Kanji
    • NEC Kanji
    • IBM Kanji
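
As a rough illustration of why vendor-defined kanji matter, the sketch below checks whether a few characters can be encoded at all. It assumes that Python's "shift_jis" codec is limited to the JIS X 0208 repertoire while its "cp932" codec adds the NEC and IBM extensions; the helper name can_encode and the sample characters are chosen here for illustration and are not taken from the presentation.

```python
# Sketch: vendor-defined characters are missing from the plain JIS X 0208 repertoire.
# Assumption: Python's "shift_jis" codec covers JIS X 0208 only, while "cp932"
# (the Microsoft variant of Shift_JIS) also includes the NEC and IBM extensions.

def can_encode(ch: str, encoding: str) -> bool:
    """Return True if `ch` is representable in `encoding` (hypothetical helper)."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

samples = {
    "\u6f22": "standard kanji in JIS X 0208",      # 漢
    "\u2460": "circled digit one, NEC extension",  # ①
    "\u9ad9": "vendor kanji, IBM extension",       # 髙
}

for ch, label in samples.items():
    print(
        f"U+{ord(ch):04X} ({label}): "
        f"shift_jis={can_encode(ch, 'shift_jis')}, cp932={can_encode(ch, 'cp932')}"
    )
```

Data containing such characters can therefore fail or be altered when it passes through a stricter variant of the same encoding family.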

Many encodings

  • Shift_JIS (sjis, cp932)
  • EUC-JP (ujis, eucjpms)
  • Unicode (utf8)
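
To make the difference between these encodings concrete, the following sketch encodes one short string three ways. The codec names are Python's ("shift_jis", "euc_jp", "utf-8"), used here as stand-ins for the MySQL character sets listed above; the sample text is an assumption of this example.

```python
# Sketch: the same three characters produce different byte sequences
# in each encoding. Codec names are Python's, not MySQL's.
text = "\u65e5\u672c\u8a9e"  # 日本語

encoded = {name: text.encode(name) for name in ("shift_jis", "euc_jp", "utf-8")}

for name, raw in encoded.items():
    # Print the byte length and a hex dump for each encoding.
    print(f"{name:10} {len(raw)} bytes  {raw.hex(' ')}")
    # Decoding with the matching codec recovers the original text.
    assert raw.decode(name) == text
```

Decoding bytes with the wrong codec either fails or yields garbled text, which is why the character set has to be agreed on by the client, the connection, and the table.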

Issues with MySQL support

  • 4-Byte UTF-8 support is needed
    • Some Japanese characters lie outside the Basic Multilingual Plane, so UCS-2 and 3-byte UTF-8 cannot represent them.
  • Shift_JIS is dangerous, but widely used
    • 0x5C problem: the second byte of some double-byte characters is 0x5C, the ASCII backslash
    • Widely used for historical reasons
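
Both issues in the list above can be reproduced in a few lines. This is a minimal sketch using Python's codecs rather than MySQL itself; the sample characters and the function naive_escape are hypothetical, chosen only to show the effect of byte-wise processing that is unaware of multi-byte characters.

```python
# Sketch of the two issues, using Python's codecs as stand-ins for MySQL.

# 1. Characters outside the Basic Multilingual Plane need four bytes in UTF-8.
kanji = "\U00020BB7"  # a kanji outside the BMP, seen in Japanese names
utf8_len = len(kanji.encode("utf-8"))
fits_ucs2 = ord(kanji) <= 0xFFFF
print(f"U+{ord(kanji):05X}: {utf8_len} UTF-8 bytes, fits in UCS-2: {fits_ucs2}")

# 2. The 0x5C problem: in Shift_JIS the second byte of some characters
#    equals 0x5C, the ASCII backslash.
hyo = "\u8868"  # 表
sjis = hyo.encode("shift_jis")
print("Shift_JIS bytes:", sjis.hex(" "))

# The same character contains no 0x5C byte in EUC-JP or UTF-8.
has_backslash = {
    enc: b"\\" in hyo.encode(enc) for enc in ("shift_jis", "euc_jp", "utf-8")
}
print(has_backslash)

def naive_escape(raw: bytes) -> bytes:
    """Hypothetical byte-wise escaper that doubles every 0x5C byte."""
    return raw.replace(b"\\", b"\\\\")

# Escaping at the byte level inserts a stray byte after the character,
# so the decoded text no longer matches the input.
corrupted = naive_escape(sjis).decode("shift_jis")
print(len(hyo), "character in,", len(corrupted), "characters out")
```

An escaping routine that knows the connection character set avoids the second problem, because it treats the two bytes of the character as one unit.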