
Check if a Java string contains a Unicode character

I'm trying to check whether a string contains a specific Unicode code point from the Segoe MDL2 Assets font.

An example of a Unicode value that I want to check for is

\uF14B

Here's where I'm getting my values from:

https://docs.microsoft.com/en-us/windows/uwp/design/style/segoe-ui-symbol-font

How exactly can I check a string to see if it contains one of these values?

I have tried

    if (buttons[i].getText().contains("\uF14B")) {
        buttons[i].setFont(new Font("Segoe MDL2 Assets", Font.PLAIN, 15));
    }

While this does work, I think it's pretty inefficient to have to copy and paste every value that I plan to use into an if statement.

Is there an easier way to do this?

Edit:

I ended up placing a ~ after each special character in my array and parsing it like this. Are there any issues with doing this?

/** Creating the names of the buttons. */
String [] buttonNames = {

        "Lsh", "Rsh", "Or", "Xor", "Not","And",
        "\uE752~", "Mod", "CE", "C", "\uF149~", "\uE94A~",
        "A", "B", "\uF14D~", "\uF14E~", "\uE94F~", "\uE947~",
        "C", "D", "\uF14A~", "\uF14B~", "\uF14C~", "\uE949~",
        "E", "F", "\uF14A~", "\uF14B~", "\uF14C~", "\uE948~",
        "(", ")", "\uE94D~", "0", ".", "\uE94E~" 
        };

/** more code here */

if (buttons[i].getText().contains("~")) {
    buttons[i].setFont(new Font("Segoe MDL2 Assets", Font.PLAIN, 15));
    buttons[i].setText(buttons[i].getText().substring(0, buttons[i].getText().lastIndexOf('~')));
}

You can invert the font selection logic: instead of detecting the special characters, ask the regular font whether it can display the text, and switch to Segoe MDL2 Assets only when it cannot.

The Font class has goodies like canDisplay and canDisplayUpTo. From the Javadoc:

public int canDisplayUpTo(String str)

Indicates whether or not this Font can display a specified String. For strings with Unicode encoding, it is important to know if a particular font can display the string. This method returns an offset into the String str which is the first character this Font cannot display without using the missing glyph code. If the Font can display all characters, -1 is returned.
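A minimal sketch of that inverted check, assuming the regular UI font has no glyphs for the Private Use Area code points that Segoe MDL2 Assets defines (worth verifying on your platform, since logical fonts can pull in fallback fonts). The class name FontChooser and the method chooseFont are hypothetical names used only for illustration.

```java
import java.awt.Font;

class FontChooser {
    /**
     * Returns preferred if it can render every character of text,
     * otherwise returns symbolFont.
     */
    static Font chooseFont(String text, Font preferred, Font symbolFont) {
        // canDisplayUpTo returns -1 when the whole string is displayable,
        // or the index of the first character the font cannot display.
        return preferred.canDisplayUpTo(text) == -1 ? preferred : symbolFont;
    }
}
```

With something like this in place, the loop from the question could call buttons[i].setFont(FontChooser.chooseFont(buttons[i].getText(), regularFont, symbolFont)), where regularFont and symbolFont are Font instances you create once up front, and no marker character is needed in the array.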

The best / easiest way to scan text to find certain characters is to use a regular expression character class.

A character class is written as [xxx] where xxx can be a set of single characters, e.g. a or \uXXXX, and/or ranges, e.g. a-z or \uXXXX-\uXXXX.

So, you can write a regex like this:

[\uE700-\uE72E\uE730\uE731\uE734\uE735\uE737-\uE756]

and so on (that was about 10% of the code point list on the linked page).

The above can also be done using exclusion, i.e.

[\uE700-\uE756&&[^\uE72F\uE732\uE733\uE736]]

where the [^xxx] means "not any of these characters".

You then compile it and use it to check strings:
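To check that the two ways of writing the class really select the same characters, here is a small sketch that compares them one character at a time. The class name CharClassEquivalence and the method agreeOn are hypothetical, introduced only for this illustration.

```java
import java.util.regex.Pattern;

class CharClassEquivalence {
    // Explicit list of ranges and single code points.
    static final Pattern LISTED =
            Pattern.compile("[\uE700-\uE72E\uE730\uE731\uE734\uE735\uE737-\uE756]");

    // One wide range minus the excluded code points, using class intersection (&&).
    static final Pattern EXCLUDED =
            Pattern.compile("[\uE700-\uE756&&[^\uE72F\uE732\uE733\uE736]]");

    /** Returns true if both patterns agree on every char from first to last inclusive. */
    static boolean agreeOn(char first, char last) {
        for (int c = first; c <= last; c++) {
            String s = String.valueOf((char) c);
            if (LISTED.matcher(s).matches() != EXCLUDED.matcher(s).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Covers the whole range plus a margin on each side.
        System.out.println(agreeOn('\uE6F0', '\uE760')); // prints "true"
    }
}
```

Which form is easier to maintain probably depends on whether the list of gaps or the list of ranges is shorter for the code points you care about.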

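String regex = "[\uE700-\uE72E\uE730\uE731\uE734\uE735\uE737-\uE756]";
Pattern p = Pattern.compile(regex);

if (p.matcher(buttons[i].getText()).find()) {
    // The label contains at least one of the listed symbol code points.
    buttons[i].setFont(new Font("Segoe MDL2 Assets", Font.PLAIN, 15));
}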
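For reference, a self-contained version of the same idea that can be run without any Swing code. As a simplification it matches the whole BMP Private Use Area (U+E000 to U+F8FF) rather than the exact list from the linked page. That is an assumption, though every code point in the question's array falls inside it. The names SymbolLabelScan and needsSymbolFont are hypothetical.

```java
import java.util.regex.Pattern;

class SymbolLabelScan {
    // Assumption: any character from U+E000 to U+F8FF marks a symbol label.
    // Compiled once and reused for every label.
    private static final Pattern SYMBOL = Pattern.compile("[\uE000-\uF8FF]");

    /** Returns true if the label contains at least one Private Use Area character. */
    static boolean needsSymbolFont(String label) {
        return SYMBOL.matcher(label).find();
    }

    public static void main(String[] args) {
        String[] labels = {"Lsh", "Mod", "\uE752", "\uF14B", "0"};
        for (int i = 0; i < labels.length; i++) {
            System.out.println("label " + i + " needs symbol font: " + needsSymbolFont(labels[i]));
        }
    }
}
```

With a check like this, the array can hold the bare code points and the ~ marker becomes unnecessary. If you later mix in other private-use characters that should not use this font, narrow the class to the exact ranges instead.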
