
Java: slugify a string with non-English characters

I need to create slug strings (human-readable URL slugs from any string) for both English and non-English text, for example Chinese, Japanese, Cyrillic, and any other script.

So every string, in any language, must be converted to the ASCII characters a-z and 0-9, for example: java-slugify-string-for-non-english-characters

How can I achieve this in Java?

You can use Slugify, which is written in Java: https://github.com/slugify/slugify
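For Latin-script input, the basic steps of slugification can be sketched with the JDK alone. This is only an illustration, not how the library above is implemented: `SlugSketch` and `slugify` are hypothetical names, and the sketch assumes that dropping accents is acceptable. It does not transliterate Chinese, Japanese, or Cyrillic; such characters are simply removed, so a transliteration table (for example from ICU4J) is still needed for those scripts.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper class for illustration; source kept ASCII-only via \\u escapes.
class SlugSketch {

    static String slugify(String input) {
        // NFD splits accented letters into a base letter plus combining marks.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // Remove the combining marks, leaving the plain base letters.
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        return stripped.toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9]+", "-")   // collapse everything else into one hyphen
                .replaceAll("^-+|-+$", "");      // trim leading and trailing hyphens
    }

    public static void main(String[] args) {
        // prints java-slugify-string-for-non-english-characters
        System.out.println(slugify("Java slugify string for non English characters"));
        // "Creme Brulee!" written with accents; prints creme-brulee
        System.out.println(slugify("Cr\u00e8me Br\u00fbl\u00e9e!"));
    }
}
```

Note that a purely non-Latin input such as the Chinese example in the next answer yields an empty string with this sketch, which is exactly the gap a transliterating library fills.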

Convert each character into its integer representation, and concatenate:

    String foo = "中国";
    StringBuilder result = new StringBuilder();
    for (int i=0; i<foo.length(); i++) {
        result.append("\\").append((int)foo.charAt(i));
    }
    System.out.println(result);

Produces (each `"\\"` in the source is a single backslash in the output):

    \20013\22269

...which is pretty easy to split and convert back to a string. You can also pad the numbers, convert them to hex, and add exclusions so that ASCII/English characters aren't converted, if you'd like. You could also have a look at other, more standard ways of doing this sort of encoding.
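A minimal sketch of that variant, under some assumptions of mine: each escaped UTF-16 code unit is written as a backslash plus exactly four hex digits (the fixed width keeps decoding unambiguous once ASCII passes through), and only ASCII letters, digits, and hyphens are left as-is. `CodePointSlug`, `encode`, and `decode` are hypothetical names.

```java
// Hypothetical helper class for illustration; source kept ASCII-only via \\u escapes.
class CodePointSlug {

    // Escape every char except ASCII letters, digits, and '-' as backslash + 4 hex digits.
    static String encode(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c < 128 && (Character.isLetterOrDigit(c) || c == '-')) {
                out.append(c);
            } else {
                out.append('\\').append(String.format("%04x", (int) c));
            }
        }
        return out.toString();
    }

    // Reverse of encode: a backslash is always followed by exactly 4 hex digits.
    static String decode(String encoded) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < encoded.length()) {
            char c = encoded.charAt(i);
            if (c == '\\') {
                out.append((char) Integer.parseInt(encoded.substring(i + 1, i + 5), 16));
                i += 5;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Same two-character Chinese string as in the example above.
        String encoded = encode("\u4e2d\u56fd");
        System.out.println(encoded);                                   // prints \4e2d\56fd
        System.out.println(decode(encoded).equals("\u4e2d\u56fd"));    // prints true
    }
}
```

The backslash is kept from the original example, but it is not a URL-safe character, so a real slug would need a different escape marker or an additional percent-encoding step.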

