Parse a String Line using regular expressions in Java
public static String entryPattern = "^([\\d.]+) (\\S+) (.+?) \\[([\\w:/]+\\s[+\\-]\\d{4})\\] \"(.+?)\" (\\d{3}) (\\d+) \"([^\"]+)\" \"([^\"]+)\"";
public static void parseTwigLine(String line) {
    Pattern p = Pattern.compile(entryPattern);
    Matcher matcher = p.matcher(line);
    System.out.println(matcher.groupCount());
    if (!matcher.matches() || NUM_FIELDS != matcher.groupCount()) {
        System.err.println("Bad log entry (or problem with RE?):");
        System.err.println(line);
        return;
    }
    timeStamp = matcher.group(4);
    ipAddress = matcher.group(1);
    if (!matcher.group(3).equals("-")) {
        userName = matcher.group(3);
    }
    request = matcher.group(5);
    response = matcher.group(6);
    bytesSent = matcher.group(7);
    browser = matcher.group(9);
    if (!matcher.group(8).equals("-")) {
        url = matcher.group(8);
    }
    instanceName = url.split("/")[3];
    if (request.contains("?q")) {
        queryTerms = request.split("[?|&]")[1];
    } else if (url.contains("?q")) {
        queryTerms = url.split("[?|&]")[1].split("=")[1];
    }
    if (request.contains("&f")) {
        filters = request.split("&f=")[1];
    } else if (url.contains("&f")) {
        // Split the url here, not the request: this branch only runs when the url holds "&f".
        filters = url.split("&f=")[1];
    }
}
My regular expression does not match the line below. Any suggestions as to why? I always get the error
Bad log entry (or problem with RE?)
from my code above. Is anything wrong with my regex?
10.53.32.1 - - [14/Nov/2011:09:45:56 -0800] "GET /host-ui/themes/client/images/preview/left6_na.gif HTTP/1.1" 304 - "http://search.host.com/search-ui/?q=8960" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; MS-RTC LM 8; InfoPath.3; BOIE9;ENUS)"
The line below, however, does match:
10.53.32.1 - - [14/Nov/2011:09:45:56 -0800] "GET /host-ui/themes/client/images/btn_close_include.png HTTP/1.1" 200 1023 "http://search.host.com/search-ui/?q=8960" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; MS-RTC LM 8; InfoPath.3; BOIE9;ENUS)"
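The (\\d+) group for the bytes-sent field doesn't match a -, which is what the failing line has in that position (304 - instead of 200 1023). Replace it with something that does, such as (\\S+). Example: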
Original: "^([\\d.]+) (\\S+) (.+?) \\[([\\w:/]+\\s[+\\-]\\d{4})\\] \"(.+?)\" (\\d{3}) (\\d+) \"([^\"]+)\" \"([^\"]+)\""
Fixed: "^([\\d.]+) (\\S+) (.+?) \\[([\\w:/]+\\s[+\\-]\\d{4})\\] \"(.+?)\" (\\d{3}) (\\S+) \"([^\"]+)\" \"([^\"]+)\""
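To check the fix in isolation, here is a minimal sketch that runs both patterns against the two log lines from the question. The class name LogRegexDemo, the bytesSent helper, the choice of -1 to stand for a - field, and the shortened user-agent strings are all assumptions made for illustration; they are not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogRegexDemo {
    // Pattern from the question: group 7 is (\\d+), which cannot match "-".
    static final Pattern ORIGINAL = Pattern.compile(
        "^([\\d.]+) (\\S+) (.+?) \\[([\\w:/]+\\s[+\\-]\\d{4})\\] \"(.+?)\" (\\d{3}) (\\d+) \"([^\"]+)\" \"([^\"]+)\"");

    // Pattern from the answer: group 7 is (\\S+), which matches digits or "-".
    static final Pattern FIXED = Pattern.compile(
        "^([\\d.]+) (\\S+) (.+?) \\[([\\w:/]+\\s[+\\-]\\d{4})\\] \"(.+?)\" (\\d{3}) (\\S+) \"([^\"]+)\" \"([^\"]+)\"");

    // The two log lines from the question, with the user agent shortened (assumption).
    static final String LINE_304 =
        "10.53.32.1 - - [14/Nov/2011:09:45:56 -0800] "
        + "\"GET /host-ui/themes/client/images/preview/left6_na.gif HTTP/1.1\" 304 - "
        + "\"http://search.host.com/search-ui/?q=8960\" \"Mozilla/4.0 (compatible; MSIE 7.0)\"";
    static final String LINE_200 =
        "10.53.32.1 - - [14/Nov/2011:09:45:56 -0800] "
        + "\"GET /host-ui/themes/client/images/btn_close_include.png HTTP/1.1\" 200 1023 "
        + "\"http://search.host.com/search-ui/?q=8960\" \"Mozilla/4.0 (compatible; MSIE 7.0)\"";

    // Hypothetical helper: returns the bytes-sent field, or -1 when the log wrote "-".
    static long bytesSent(String line) {
        Matcher m = FIXED.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Bad log entry: " + line);
        }
        String raw = m.group(7);
        return raw.equals("-") ? -1L : Long.parseLong(raw);
    }

    public static void main(String[] args) {
        System.out.println("original matches 304 line: " + ORIGINAL.matcher(LINE_304).matches());
        System.out.println("fixed matches 304 line:    " + FIXED.matcher(LINE_304).matches());
        System.out.println("bytes sent (304 line):     " + bytesSent(LINE_304));
        System.out.println("bytes sent (200 line):     " + bytesSent(LINE_200));
    }
}
```

Because (\\S+) accepts any run of non-whitespace characters, the code that consumes group 7 has to handle the - case itself, as the helper does here. A stricter alternative would be (\\d+|-), which accepts only digits or a single dash.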