Regex for commas and periods allowed
I tried searching for an answer to this question and also read the Regex Wiki, but I couldn't find exactly what I'm looking for.
I have a program that validates a document. (It was written by someone else.)
If certain lines or characters don't match a regex, an error is generated. I've noticed that a few false errors are always generated, and I want to correct this.
I believe I have narrowed the problem down. Here is an example.
This error is flagged by the program logic:
ERROR: File header immediate origin name is invalid: CITIBANK, N.A.
Here is the code that causes that error:
if(strLine.substring(63,86).matches("[A-Z,a-z,0-9, ]+")){
    // field is valid: nothing to do
}else{
    // report the invalid field in a dialog and in the output file
    JOptionPane.showMessageDialog(null, "ERROR: File header immediate origin name is invalid: "+strLine.substring(63,86));
    errorFound=true;
    fileHeaderErrorFound=true;
    bw.write("ERROR: File header immediate origin name is invalid: "+strLine.substring(63,86));
    bw.newLine();
}
I believe the error is raised at runtime because the text contains a period and a comma. I am unsure how to allow these in the regex.
I have tried using this:
if(strLine.substring(63,86).matches("[A-Z,a-z,0-9,,,. ]+")){
and it seemed to work. I just wanted to make sure that this is the correct way, because it doesn't look right.
Your analysis is right: the match failed because the text contains a dot, which isn't in the character class.
However, you can simplify the regex. There is no need to repeat the commas; they don't have any special meaning inside a class:
if(strLine.substring(63,86).matches("[A-Za-z0-9,. ]+"))
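To check the claim about commas, here is a small standalone sketch. The class name `OriginNameDemo` and the helper `isValid` are made up for illustration and are not part of the original program. It compares the class from the question with the simplified one:

```java
import java.util.regex.Pattern;

class OriginNameDemo {
    // Class from the question: the repeated commas are just literal commas.
    static final Pattern ORIGINAL = Pattern.compile("[A-Z,a-z,0-9, ]+");
    // Simplified class with the period added.
    static final Pattern SIMPLIFIED = Pattern.compile("[A-Za-z0-9,. ]+");

    // Hypothetical helper: true if the whole field consists of allowed characters.
    static boolean isValid(Pattern p, String field) {
        return p.matcher(field).matches();
    }

    public static void main(String[] args) {
        String name = "CITIBANK, N.A.";
        System.out.println(isValid(ORIGINAL, name));           // false: '.' is not in the class
        System.out.println(isValid(ORIGINAL, "CITIBANK, NA")); // true: ',' was already allowed
        System.out.println(isValid(SIMPLIFIED, name));         // true
    }
}
```

So the comma never caused the false error; only the period did.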
Are you sure that you'll never have to match non-ASCII letters or any other kind of punctuation, though?
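If non-ASCII letters do turn up, one possible variant (an assumption, not something the original program does) is to use Java's Unicode property classes `\p{L}` (any letter) and `\p{N}` (any number). The class name `UnicodeNameDemo` and the sample name below are hypothetical, and whether the file format actually permits such characters is not known from the question:

```java
import java.util.regex.Pattern;

class UnicodeNameDemo {
    // ASCII-only class, as in the simplified regex.
    static final Pattern ASCII_ONLY = Pattern.compile("[A-Za-z0-9,. ]+");
    // Assumed variant: any Unicode letter or number, plus comma, period and space.
    static final Pattern UNICODE = Pattern.compile("[\\p{L}\\p{N},. ]+");

    // Hypothetical helper: true if the whole field consists of allowed characters.
    static boolean isValid(Pattern p, String field) {
        return p.matcher(field).matches();
    }

    public static void main(String[] args) {
        // Made-up name containing an E-acute (U+00C9).
        String name = "CR\u00C9DIT, S.A.";
        System.out.println(isValid(ASCII_ONLY, name)); // false
        System.out.println(isValid(UNICODE, name));    // true
    }
}
```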
Letters and digits (a-zA-Z0-9) can mostly be replaced by \w, the "word character" class (written "\\w" in a Java string literal); note that \w also matches the underscore. The period and comma don't need escaping inside a character class and can be used as is. Since matches() has to cover the whole string, the class also needs the space and a + quantifier. Hence this regex might come in handy:
"[\\w,. ]+"
Hope this helps. :)
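A quick way to see how `\w` behaves with `String.matches` in Java is the sketch below. The class and method names are made up for illustration:

```java
class WordClassDemo {
    // \w version: the backslash is doubled in Java source; '+' lets the class repeat.
    static boolean matchesWord(String field) {
        return field.matches("[\\w,. ]+");
    }

    // Explicit version without \w, for comparison.
    static boolean matchesExplicit(String field) {
        return field.matches("[A-Za-z0-9,. ]+");
    }

    public static void main(String[] args) {
        System.out.println(matchesWord("CITIBANK, N.A."));     // true
        System.out.println(matchesExplicit("CITIBANK, N.A.")); // true
        // \w also accepts the underscore, the explicit class does not.
        System.out.println(matchesWord("CITI_BANK"));          // true
        System.out.println(matchesExplicit("CITI_BANK"));      // false
        // Without '+', the class matches exactly one character, so a whole name fails.
        System.out.println("CITIBANK, N.A.".matches("[\\w,.]")); // false
    }
}
```

If underscores should be rejected, the explicit `A-Za-z0-9` form is the safer choice.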