Java - Regular Expressions matching one to another
I am trying to retrieve bits of data using regular expressions (RE). The problem is that I'm not very fluent with RE. Consider the code:
import java.util.regex.Pattern;
import java.util.regex.Matcher;

class HTTP {
    private static String getServer(String httpresp) {
        Pattern p = Pattern.compile("(\bServer)(.*[Server:-\r\n]"); // What RE syntax do I use here?
        Matcher m = p.matcher(httpresp);
        if (m.find()) {
            return m.group(2);
        }
        return null; // no Server header found
    }

    public static void main(String[] args) {
        String testdata = "HTTP/1.1 302 Found\r\nServer: Apache\r\n\r\n"; // Test data
        System.out.println(getServer(testdata));
    }
}
How would I get the text from "Server:" up to the next "\r\n", which would output "Apache"? I googled around and tried it myself, but have failed.
It's a one-liner:
private static String getServer(String httpresp) {
    // (?s) lets . match \r and \n, so the pattern can span the whole response
    return httpresp.replaceAll("(?s).*Server: (.*?)\r\n.*", "$1");
}
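A small sketch of why the dot's treatment of line breaks matters for this one-liner. By default, . in Java does not match \r or \n, so the leading and trailing .* cannot span a multi-line response unless DOTALL mode ((?s)) is switched on. The class and method names below are made up for illustration.

```java
public class OneLinerDemo {
    // Default mode: . stops at line terminators, so only part of the input is replaced.
    public static String withoutDotall(String httpresp) {
        return httpresp.replaceAll(".*Server: (.*?)\r\n.*", "$1");
    }

    // DOTALL mode: . also matches \r and \n, so the whole input is replaced by group 1.
    public static String withDotall(String httpresp) {
        return httpresp.replaceAll("(?s).*Server: (.*?)\r\n.*", "$1");
    }

    public static void main(String[] args) {
        String testdata = "HTTP/1.1 302 Found\r\nServer: Apache\r\n\r\n";
        // The status line survives here because the first .* cannot cross the line break.
        System.out.println("[" + withoutDotall(testdata) + "]");
        // Only the header value remains here.
        System.out.println("[" + withDotall(testdata) + "]");
    }
}
```

Note that replaceAll returns the input unchanged when nothing matches, so a caller that needs to distinguish "no Server header" from a real value may prefer the Matcher-based approach.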
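The trick here is two-part:
.*?, which is a reluctant match (it consumes as little as possible while still matching), so the capture stops at the first \r\n.
The pattern is written to cover the whole input, so replacing it with $1 leaves only the captured text. For . to cross the \r\n line breaks, DOTALL mode ((?s)) is needed.
You could use capturing groups or positive lookahead.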
Pattern.compile("(?:\\bServer:\\s*)(.*?)(?=[\r\n]+)");
Then print group 1.
Example:
String testdata = "HTTP/1.1 302 Found\r\nServer: Apache\r\n\r\n";
Matcher matcher = Pattern.compile("(?:\\bServer:\\s*)(.*?)(?=[\r\n]+)").matcher(testdata);
if (matcher.find())
{
System.out.println(matcher.group(1));
}
OR
Matcher matcher = Pattern.compile("(?:\\bServer\\b\\S*\\s+)(.*?)(?=[\r\n]+)").matcher(testdata);
if (matcher.find())
{
System.out.println(matcher.group(1));
}
Output:
Apache
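For reference, a minimal sketch of how the first pattern might be dropped back into the question's getServer method. The class name ServerHeader and the choice to return null when no header is present are assumptions for illustration, not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ServerHeader {
    // Compiled once and reused; same pattern as in the example above.
    private static final Pattern SERVER =
            Pattern.compile("(?:\\bServer:\\s*)(.*?)(?=[\r\n]+)");

    // Returns the Server header value, or null if there is none (an assumed convention).
    public static String getServer(String httpresp) {
        Matcher m = SERVER.matcher(httpresp);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String testdata = "HTTP/1.1 302 Found\r\nServer: Apache\r\n\r\n";
        System.out.println(getServer(testdata)); // Apache
    }
}
```

Keeping the compiled Pattern in a static field avoids recompiling it on every call, which is a common choice when the method runs once per response.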
Explanation:
(?:\\bServer:\\s*)
In regex, a non-capturing group is written as (?:...); it groups and matches but does not capture. \\b is a word boundary, which matches between a word character and a non-word character. Server: matches the literal string Server:, and the following zero or more whitespace characters are matched by \\s*.
(.*?)
In regex, (...) is a capturing group, which captures the characters matched by the pattern inside it. In our case, (.*?) captures all characters non-greedily up to the point where one or more line breaks are detected.
(?=[\r\n]+)
(?=...) is a positive lookahead, which asserts that the match must be followed by characters matching the pattern inside the lookahead, here one or more line-break characters. The lookahead itself consumes nothing.
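The points above can be checked directly on a Matcher. This is a small sketch (the GroupDemo class and its describe method are hypothetical names) showing that the non-capturing group and the lookahead do not count as groups, and that the lookahead leaves the line break unconsumed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    private static final Pattern PATTERN =
            Pattern.compile("(?:\\bServer:\\s*)(.*?)(?=[\r\n]+)");

    // Summarizes the first match: group count, whole match, capture,
    // and whether the character right after the match is still the \r.
    public static String describe(String input) {
        Matcher m = PATTERN.matcher(input);
        if (!m.find()) {
            return "no match";
        }
        boolean nextIsCR = input.charAt(m.end()) == '\r';
        return "groups=" + m.groupCount()
                + " whole=[" + m.group(0) + "]"
                + " captured=[" + m.group(1) + "]"
                + " nextIsCR=" + nextIsCR;
    }

    public static void main(String[] args) {
        String testdata = "HTTP/1.1 302 Found\r\nServer: Apache\r\n\r\n";
        // prints: groups=1 whole=[Server: Apache] captured=[Apache] nextIsCR=true
        System.out.println(describe(testdata));
    }
}
```

group(0) still includes the Server: prefix because a non-capturing group takes part in the overall match; only group(1) isolates the value.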