Regular Expression for string in Java
I am trying to write a regular expression for these kinds of strings:
05 IMA-POLICY-ID PIC X(15). 00020068
05 (AMENT)-GROUPCD PIC X(10).
I want to parse anything between 05 and the first tab. The line might start with tabs or spaces, followed by the digits. The initial number can be anything: 05, 10, 15.
So in the first line I need to parse IMA-POLICY-ID
and in the second line (AMENT)-GROUPCD
This is the code I have written, and it is not finding the pattern. Where am I going wrong?
Pattern p1 = Pattern.compile("^[0-9]+\\s\\S+\t$");
Matcher m1 = p1.matcher(line);
System.out.println("m1 =="+m1.group());
Pattern p1 = Pattern.compile("\\b(?:05|1[05])\\b[^\\t]*\\t");
will match anything from 05, 10 or 15 up to the nearest tab (\\t).
Explanation:
\b # Start of number/word
(?:05|1[05]) # Match 05, 10 or 15
\b # End of number/word
[^\t]* # Match any number of characters except tab
\t # Match a tab
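A minimal sketch of how this pattern might be applied, assuming the fields are tab-separated as in the question. The class name TabFieldDemo and the helper matchUpToTab are made up for illustration. Note that the whole match includes the leading number and the trailing tab, and that the pattern is not anchored to the start of the line.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TabFieldDemo {
    // Same pattern as above: 05, 10 or 15, then everything up to and including the first tab.
    private static final Pattern P = Pattern.compile("\\b(?:05|1[05])\\b[^\\t]*\\t");

    // Hypothetical helper: returns the whole match (number, field name, tab), or null if none.
    static String matchUpToTab(String line) {
        Matcher m = P.matcher(line);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String line = "05 IMA-POLICY-ID\tPIC X(15).\t00020068";
        // Brackets make the trailing tab visible in the output.
        System.out.println("[" + matchUpToTab(line) + "]");
    }
}
```

To get only the field name, the result would still need the number and the tab stripped, for example with a capturing group.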
Your pattern expects the line to end after IMA-POLICY-ID etc., because of the $ at the end.
If there is no whitespace in the string you want to match (I assume there isn't, given your use of \\S+), I'd change the pattern to ^\\d+\\s+(\\S+), which should be sufficient to match any number at the start of a line, followed by whitespace and then the group of non-whitespace characters you want to capture (note that a tab is whitespace as well).
If you need to match up to the first tab or the end of the input and include other whitespace, replace (\\S+) with ([^\\t]+).
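The difference between the two capture groups can be seen with a field name that contains a space. This is only a sketch: the class GroupCaptureDemo, the helper firstGroup and the sample line "05 IMA POLICY..." are invented for illustration and do not come from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupCaptureDemo {
    // Hypothetical helper: returns capture group 1 of the first match, or null if none.
    static String firstGroup(String regex, String line) {
        Matcher m = Pattern.compile(regex).matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Invented sample: the field name contains a space and ends at a tab.
        String line = "05 IMA POLICY\tPIC X(15).";
        // (\S+) stops at the first whitespace character.
        System.out.println(firstGroup("^\\d+\\s+(\\S+)", line));
        // ([^\t]+) keeps going until the first tab.
        System.out.println(firstGroup("^\\d+\\s+([^\\t]+)", line));
    }
}
```

Both patterns are anchored with ^ directly before the digits, so as written they assume the line does not begin with whitespace.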
^\d+\s+([^\s]+)
This will match your requirement.
Demo here: http://regex101.com/r/rQ7fT3
I can see two things that might prevent your Pattern from working.
First, your input Strings contain multiple tab-separated values, therefore the $ "end-of-input" anchor at the end of your Pattern will fail to match the String.
Second, you want to find what is between 05 (etc.) and the 1st tab. Therefore you need to wrap your desired expression in parentheses (e.g. (\\S+)) and refer to it by its group number (in this case, it would be group 1).
Here's an example:
String input = "05 IMA-POLICY-ID\tPIC X(15).\t00020068" +
"\r\n05 (AMENT)-GROUPCD\tPIC X(10).";
//                           | 0, 1, or 5 twice (refine here if needed)
//                           |       | 1 whitespace
//                           |       |  | your queried expression (here I use a
//                           |       |  | reluctant dot search)
//                           |       |  |    | tab
//                           |       |  |    | | anything after, reluctant
Pattern p = Pattern.compile("[015]{2}\\s(.+?)\t.+?");
Matcher m = p.matcher(input);
while (m.find()) {
System.out.println("Found: " + m.group(1));
}
Output
Found: IMA-POLICY-ID
Found: (AMENT)-GROUPCD
Your regex is almost correct. Just remove the \\t$ at the end of your regex and capture the \\S+ as a group.
Pattern p1 = Pattern.compile("^[0-9]+\\s(\\S+)");
Now print it as:
if (m.find( )) {
System.out.println(m.group(1));
}
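Put together, the fix might look like the sketch below. The class FindBeforeGroupDemo and its helper names are hypothetical. It also illustrates why find() matters: Matcher.group throws an IllegalStateException when no match has been attempted yet, which is what happens when group() is called directly after creating the matcher.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindBeforeGroupDemo {
    private static final Pattern P = Pattern.compile("^[0-9]+\\s(\\S+)");

    // Returns the captured field name, or null when the line does not match.
    static String fieldName(String line) {
        Matcher m = P.matcher(line);
        if (m.find()) {
            return m.group(1);
        }
        return null;
    }

    // Shows that asking for a group before any match attempt fails.
    static boolean groupWithoutFindThrows(String line) {
        Matcher m = P.matcher(line);
        try {
            m.group(1);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String line = "05 IMA-POLICY-ID\tPIC X(15).\t00020068";
        System.out.println(fieldName(line));
        System.out.println(groupWithoutFindThrows(line));
    }
}
```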
This is what I came up with, and it worked:
String re = "^\\s+\\d+\\s+([^\\s]+)";
Pattern p1 = Pattern.compile(re, Pattern.MULTILINE);
Matcher m1 = p1.matcher(line);
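For completeness, a runnable sketch of this final version applied to several lines at once. The class FieldNames, the method extract and the sample input with leading whitespace are assumptions made for illustration; only the regular expression and the MULTILINE flag come from the snippet above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FieldNames {
    // Leading whitespace, the level number, whitespace, then the field name in group 1.
    private static final Pattern P =
            Pattern.compile("^\\s+\\d+\\s+([^\\s]+)", Pattern.MULTILINE);

    // Collects group 1 from every matching line of the input.
    static List<String> extract(String text) {
        List<String> names = new ArrayList<>();
        Matcher m = P.matcher(text);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        // Assumed input: each line begins with a tab or spaces.
        String text = "\t05 IMA-POLICY-ID\tPIC X(15).\t00020068\n"
                + "    05 (AMENT)-GROUPCD\tPIC X(10).";
        for (String name : extract(text)) {
            System.out.println(name);
        }
    }
}
```

Because the pattern starts with ^\\s+, a line that begins directly with the number is not matched; ^\\s* would accept both forms if that is needed.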