Why does this use, in Java, of regular expressions throw an "Unclosed character class" exception at runtime?
I have a list of keywords:
String[] keywords = {"xxxx", "yyyy", "zzzz"};
String[] another = {"aaa", "bbb", "ccc"};
I am trying to identify text that contains one of the keywords, followed by a space, followed by one of the "another" words.
If I use:
Pattern pattern = Pattern.compile(keywords+"\\s"+another);
This throws an exception at runtime:
Exception in thread "main" java.util.regex.PatternSyntaxException: Unclosed character class near index 57
[Ljava.lang.String;@3dd4ab05\s[Ljava.lang.String;@5527f4f9
^
How can I fix this?
That error is correctly telling you that the pattern you're trying to create is invalid. The gibberish-looking string starting with [Ljava is the string you passed to Pattern.compile().
Java arrays unfortunately do not have very informative .toString() output. Concatenating an array with a String calls that default toString(), so what you're doing here is essentially concatenating two arrays as Strings, which Pattern cannot hope to parse correctly.
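To see where the odd string comes from, here is a minimal sketch that reproduces the concatenation from the question and catches the resulting exception. The class and method names (ArrayConcatDemo, naiveRegex) are made up for illustration, and the hash after the @ will differ from run to run.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class ArrayConcatDemo {
    // Hypothetical helper: concatenating arrays with + uses the default
    // Object.toString(), which yields "[Ljava.lang.String;@<hash>".
    static String naiveRegex(String[] keywords, String[] another) {
        return keywords + "\\s" + another;
    }

    public static void main(String[] args) {
        String[] keywords = {"xxxx", "yyyy", "zzzz"};
        String[] another = {"aaa", "bbb", "ccc"};

        String regex = naiveRegex(keywords, another);
        System.out.println(regex); // something like [Ljava.lang.String;@...\s[Ljava.lang.String;@...

        try {
            Pattern.compile(regex);
        } catch (PatternSyntaxException e) {
            // Each "[" opens a character class that is never closed.
            System.out.println(e.getDescription());
        }
    }
}
```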
But even if you called Arrays.toString(), you'd still not get what you're looking for:
Pattern pattern=Pattern.compile(Arrays.toString(keywords)+"\\s"+
Arrays.toString(another));
System.out.println(pattern.pattern());
[xxxx, yyyy, zzzz]\s[aaa, bbb, ccc]
This is a technically valid, but essentially meaningless regular expression. Each bracketed part is a character class, so it will only match three-character Strings: one character from the set x, y, z, comma, or space, followed by one whitespace character, followed by one character from the set a, b, c, comma, or space.
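A small sketch can confirm that behaviour. The class and helper names below (CharacterClassDemo, naivePattern) are hypothetical; the pattern itself is the one built from Arrays.toString() above.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class CharacterClassDemo {
    // Hypothetical helper: builds the pattern from the Arrays.toString() output,
    // i.e. [xxxx, yyyy, zzzz]\s[aaa, bbb, ccc]
    static Pattern naivePattern(String[] keywords, String[] another) {
        return Pattern.compile(Arrays.toString(keywords) + "\\s" + Arrays.toString(another));
    }

    public static void main(String[] args) {
        String[] keywords = {"xxxx", "yyyy", "zzzz"};
        String[] another = {"aaa", "bbb", "ccc"};
        Pattern pattern = naivePattern(keywords, another);

        // One character from each class, separated by whitespace.
        System.out.println(pattern.matcher("x a").matches());      // true
        // The comma is part of the class too.
        System.out.println(pattern.matcher(", b").matches());      // true
        // The intended input is not matched as a whole.
        System.out.println(pattern.matcher("xxxx aaa").matches()); // false
    }
}
```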
I would suggest reading more about how regular expressions work. There are lots of resources online to help, and good starting points are the Java Regular Expressions lesson and the Pattern documentation. You won't get very far until you understand what regular expressions are trying to do.
As a starting point, however, a regular expression that matches one of several words, followed by a space, followed by one of several other words, might look like this:
(?:xxxx|yyyy|zzzz)\s(?:aaa|bbb|ccc)
This uses "non-capturing groups" and the logical OR operator | to specify multiple potential matches.
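As a rough illustration, the expression can be used from Java like this. The class and method names (AlternationDemo, containsPair) are invented for the example; note that the backslash in \s has to be doubled inside a Java string literal.

```java
import java.util.regex.Pattern;

public class AlternationDemo {
    // The regex (?:xxxx|yyyy|zzzz)\s(?:aaa|bbb|ccc) written as a Java string literal.
    static final Pattern PATTERN =
            Pattern.compile("(?:xxxx|yyyy|zzzz)\\s(?:aaa|bbb|ccc)");

    // Hypothetical helper: true if the text contains a keyword, one whitespace
    // character, and then one of the "another" words.
    static boolean containsPair(String text) {
        return PATTERN.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsPair("foo yyyy ccc bar")); // true
        System.out.println(containsPair("xxxx ddd"));         // false
        System.out.println(containsPair("xxxxaaa"));          // false
    }
}
```

find() is used here rather than matches() because the goal in the question is to spot the pair anywhere inside a longer text.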
[Ljava.lang.String;@3dd4ab05 is the result of calling toString() on a string array.
You need to build your pattern manually with the items that are in the relevant arrays.
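One possible way to do that is sketched below. The names KeywordPatternBuilder, alternation, and build are made up for the example. Wrapping each word in Pattern.quote is an extra precaution that is not part of the question: it only matters if a word contains regex metacharacters such as a dot.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class KeywordPatternBuilder {
    // Hypothetical helper: joins the words into a non-capturing alternation,
    // e.g. (?:\Qxxxx\E|\Qyyyy\E|\Qzzzz\E). Pattern.quote makes each word literal.
    static String alternation(String[] words) {
        return Arrays.stream(words)
                .map(Pattern::quote)
                .collect(Collectors.joining("|", "(?:", ")"));
    }

    // Hypothetical helper: keyword, one whitespace character, then an "another" word.
    static Pattern build(String[] keywords, String[] another) {
        return Pattern.compile(alternation(keywords) + "\\s" + alternation(another));
    }

    public static void main(String[] args) {
        String[] keywords = {"xxxx", "yyyy", "zzzz"};
        String[] another = {"aaa", "bbb", "ccc"};
        Pattern pattern = build(keywords, another);

        System.out.println(pattern.matcher("some zzzz bbb text").find()); // true
        System.out.println(pattern.matcher("zzzz ddd").find());           // false
    }
}
```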