Trying to parse a log file with a regex

Question

I am trying to parse a log file with a regex. I understand the first step of pulling out the IP address, but I am stuck on how to move on to the rest of the log entry. To parse the rest, do I just tack on more regex to pull out the date and so on? I would like the 2nd element to be the second IP, 72.37.100.86. Then I would like to exclude the "- - -" and have the date be the 4th element, "GET / HTTP/1.1" the 8th index, and the status code 200 the 9th index. Any help in understanding what I need to do next would be much appreciated.

package com.text.nginx_log_parser;

import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegExTester {


// Actual Entry : 10.10.100.151 - 72.37.100.86, 192.36.20.508 - - - [04/Jul/2016:12:50:06 +0000]  https https https "GET / HTTP/1.1" 200 20027 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.107 Safari/537.36"
public static String logEntry = "10.10.100.151 - 72.37.100.86, 192.36.20.508 - - - [04/Jul/2016:12:50:06 +0000]  https https https \"GET / HTTP/1.1\" 200 20027 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.107 Safari/537.36\"\r\n";

//public static String regex = "(\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3})";
public static void main (String [] args){

    String regex = "(\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3})\\s*-*\\s*-*\\s*-*";
    regexChecker(regex, logEntry);
    regex = "\\[*\\]\\s.";
    regexChecker(regex, logEntry);
}

public static void regexChecker(String regex, String str){

    Pattern pattern = Pattern.compile(regex);

    // Match against the string passed in, not the static logEntry field
    Matcher matcher = pattern.matcher(str);
    //String firstIP = matcher.group(0);
    //String secondIP = matcher.group();
    //String timestamp = 
    while(matcher.find()){
        System.out.println( matcher.group(0));
    }
  }
}

Answer 1

With the following regex:

(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})[-\s]+(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}).+?\[(.+?)\].*?\"(.+?)\"\s(\d{3}).*$

the fields you are looking for are in capture groups 1 through 5, as per this entry on regex101.com.
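To make the capture groups concrete, here is a minimal Java sketch that applies the regex above to the sample entry from the question. The class name `LogLineParser`, the `parse` helper, and the field labels are hypothetical names chosen for illustration; only the regex and the log line come from the post. The pattern is fitted to this one sample, so other log layouts may need adjustments.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name; not part of the original question or answer.
class LogLineParser {

    // Sample entry copied from the question.
    static final String SAMPLE =
        "10.10.100.151 - 72.37.100.86, 192.36.20.508 - - - [04/Jul/2016:12:50:06 +0000]  "
        + "https https https \"GET / HTTP/1.1\" 200 20027 \"-\" \"Mozilla/5.0 (Macintosh; "
        + "Intel Mac OS X 10_10_4) AppleWebKit/537.36 (KHTML, like Gecko) "
        + "Chrome/44.0.2403.107 Safari/537.36\"\r\n";

    // The regex from the answer, with backslashes doubled for a Java string literal.
    static final Pattern LOG_PATTERN = Pattern.compile(
        "(\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3})[-\\s]+"   // group 1: first IP
        + "(\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}).+?"     // group 2: second IP
        + "\\[(.+?)\\].*?"                                     // group 3: timestamp
        + "\"(.+?)\"\\s(\\d{3}).*$");                          // groups 4 and 5: request, status

    // Returns the five captured fields, or null if the line does not match.
    static String[] parse(String line) {
        Matcher m = LOG_PATTERN.matcher(line.trim());
        if (!m.find()) {
            return null;
        }
        String[] fields = new String[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++) {
            fields[i - 1] = m.group(i); // groups are 1-based; group 0 is the whole match
        }
        return fields;
    }

    public static void main(String[] args) {
        // Labels are illustrative assumptions about what each group holds.
        String[] labels = {"first IP", "second IP", "timestamp", "request", "status"};
        String[] fields = parse(SAMPLE);
        if (fields == null) {
            System.out.println("No match");
            return;
        }
        for (int i = 0; i < fields.length; i++) {
            System.out.println(labels[i] + " = " + fields[i]);
        }
    }
}
```

Because all five values come out of a single match, there is no need to run separate regexes for the IPs and the date as in the question's `main`; each field is read with `matcher.group(n)` after one successful `find()`.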

Solution 1
1 2017-06-05 13:13:27
