Converting from half-width katakana to full-width katakana

I need to convert halfwidth katakana characters to fullwidth characters.

Example:

  • String "ｶﾀｶﾅ" (U+FF76 U+FF80 U+FF76 U+FF85)
  • Convert to "カタカナ" (U+30AB U+30BF U+30AB U+30CA)

How can I do this in Java?

I think you are talking about converting "half-width Katakana" Unicode code-points to their regular equivalents.

See here for info, including a listing of the relevant code point values:

I don't know if there is a recommended way to do it (e.g. a standard API or a third-party library), but you could easily write some code to:

  • put the text into a StringBuilder
  • iterate over the character positions in the builder
    • fetch the character from the builder
    • identify the half-width katakana characters (by doing a range check)
    • map them to their full-width equivalents (using a lookup table or a Map; a single fixed offset will not work, because the half-width block is not laid out in the same order as the full-width katakana block)
    • update the character in the builder
  • use the contents of the updated builder, e.g. turn it into a String.
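
The steps above could be sketched roughly as follows. This is only an illustration, not a tested library: the class name HalfwidthKatakana, the method toFullWidth and the FULL lookup string are hypothetical names chosen here. The sketch assumes that only the letters U+FF66 to U+FF9D need converting; half-width punctuation and the voiced sound marks (U+FF9E, U+FF9F) are left unchanged. Unicode escapes are used so the source compiles regardless of file encoding.

```java
final class HalfwidthKatakana {
    // Full-width equivalents of U+FF66..U+FF9D, listed in half-width code point order.
    private static final String FULL =
          "\u30F2"                          // wo
        + "\u30A1\u30A3\u30A5\u30A7\u30A9"  // small a i u e o
        + "\u30E3\u30E5\u30E7\u30C3\u30FC"  // small ya yu yo, small tsu, long vowel mark
        + "\u30A2\u30A4\u30A6\u30A8\u30AA"  // a i u e o
        + "\u30AB\u30AD\u30AF\u30B1\u30B3"  // ka ki ku ke ko
        + "\u30B5\u30B7\u30B9\u30BB\u30BD"  // sa shi su se so
        + "\u30BF\u30C1\u30C4\u30C6\u30C8"  // ta chi tsu te to
        + "\u30CA\u30CB\u30CC\u30CD\u30CE"  // na ni nu ne no
        + "\u30CF\u30D2\u30D5\u30D8\u30DB"  // ha hi fu he ho
        + "\u30DE\u30DF\u30E0\u30E1\u30E2"  // ma mi mu me mo
        + "\u30E4\u30E6\u30E8"              // ya yu yo
        + "\u30E9\u30EA\u30EB\u30EC\u30ED"  // ra ri ru re ro
        + "\u30EF\u30F3";                   // wa n

    private static final char FIRST = '\uFF66'; // half-width wo
    private static final char LAST = '\uFF9D';  // half-width n

    static String toFullWidth(String text) {
        StringBuilder sb = new StringBuilder(text);
        for (int i = 0; i < sb.length(); i++) {
            char c = sb.charAt(i);
            // Range check, then table lookup by distance from the first code point.
            if (c >= FIRST && c <= LAST) {
                sb.setCharAt(i, FULL.charAt(c - FIRST));
            }
        }
        return sb.toString();
    }
}
```

Calling HalfwidthKatakana.toFullWidth on the example string from the question should give U+30AB U+30BF U+30AB U+30CA. Combining a base letter with a following voiced mark into a single character would need extra logic that this sketch does not attempt.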

You could try the icu4j library. It supports a number of transliterations, including Halfwidth-Fullwidth and Fullwidth-Halfwidth.

Example:

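import com.ibm.icu.text.Transliterator;

// "original" is the input String containing half-width characters
Transliterator t = Transliterator.getInstance("Halfwidth-Fullwidth");
String result = t.transliterate(original);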

Just use the standard API (java.text.Normalizer). There is no need for a third-party library or dependency.

NFKC normalization converts full-width alphanumeric characters to half-width and half-width katakana to full-width katakana.

String result = Normalizer.normalize(text, Normalizer.Form.NFKC);
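
To make the behaviour concrete, here is a small self-contained sketch around that call. The class name NfkcDemo and the method toFullWidthKatakana are hypothetical names used only for illustration; the only real API involved is java.text.Normalizer. Keep in mind that NFKC is a general compatibility normalization, so it will also rewrite other compatibility characters in the text, not just katakana.

```java
import java.text.Normalizer;

final class NfkcDemo {
    // NFKC replaces compatibility characters (such as half-width katakana)
    // with their canonical equivalents and then composes the result.
    static String toFullWidthKatakana(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFKC);
    }
}
```

With this approach a half-width letter followed by a half-width voiced mark (for example U+FF76 U+FF9E) should come out as the single composed character (U+30AC), and full-width Latin letters and digits become plain ASCII, which may or may not be what you want.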

