
Regex to parse CustomLog format (PHP)

I am trying to parse Apache log entries written with this CustomLog format:

LogFormat "%v %{X-Forwarded-For}i %h %l %u %t \"%r\" %>s %b" MyCustomLog

This is what an entry looks like. Note that a comma delimits the IPs passed in the X-Forwarded-For header.

my.server.com 24.24.24.3, 1.2.3.4 1.2.3.5 - - [18/May/2016:02:57:25 -0400] "GET /veer/eye?params=1&are=2&right=3&here=4 HTTP/1.1" 200 146351

I want to capture the following fields:

  • X-Forwarded-For IPs (comma delimited)
  • remote hostname
  • remote logname (may be -)
  • remote user (may be -)
  • timestamp in the [ ] block
  • the request URL (in the quotes)
  • the response size (the last value)

I am a bit rusty with regex, at least when it comes to negative lookaheads, which is what I think I need to use?

Help is appreciated!

This is a more complete pattern that should work for you. It needs no lookaheads: every field gets its own capture group, and each group is named. It matches both the example in your question and the one in the comments, with one caveat: the pattern is unanchored and the hostname class [\w\.]+ excludes hyphens, so for the hyphenated host in the second sample the match starts at test.test.com rather than qa-test.test.com.

Demo: https://3v4l.org/jMKFL

<?php
$pattern = '/(?P<hostname>[\w\.]+) '
         . '(?P<forwardedFor>(?:[\d\.]+, )*(?:[\d\.]+)|-) '
         . '(?P<remoteHostname>[\d\.]+) '
         . '(?P<remoteLogname>[^\s]+) '
         . '(?P<remoteUsername>[^\s]+) '
         . '\['
            . '(?P<requestDate>[^\]]+)'
         . '\] '
         . '"'
            . '(?P<method>\w+) '
            . '(?P<uri>[^\s]+) '
            . '(?P<httpVersion>[^\"]+)'
         . '" '
         . '(?P<responseStatus>\d+) '
         . '(?P<responseSize>\d+)/';

$test = 'my.server.com 24.24.24.3, 1.2.3.4 1.2.3.5 - - [18/May/2016:02:57:25 -0400] "GET /veer/eye?params=1&are=2&right=3&here=4 HTTP/1.1" 200 146351';
$test2 = 'qa-test.test.com - 80.82.65.120 - - [18/May/2016:00:30:20 -0400] "GET // HTTP/1.1" 404 198';

preg_match($pattern, $test, $matches);
print_r($matches);

preg_match($pattern, $test2, $matches);
print_r($matches);

Outputs:

Array
(
    [0] => my.server.com 24.24.24.3, 1.2.3.4 1.2.3.5 - - [18/May/2016:02:57:25 -0400] "GET /veer/eye?params=1&are=2&right=3&here=4 HTTP/1.1" 200 146351
    [hostname] => my.server.com
    [1] => my.server.com
    [forwardedFor] => 24.24.24.3, 1.2.3.4
    [2] => 24.24.24.3, 1.2.3.4
    [remoteHostname] => 1.2.3.5
    [3] => 1.2.3.5
    [remoteLogname] => -
    [4] => -
    [remoteUsername] => -
    [5] => -
    [requestDate] => 18/May/2016:02:57:25 -0400
    [6] => 18/May/2016:02:57:25 -0400
    [method] => GET
    [7] => GET
    [uri] => /veer/eye?params=1&are=2&right=3&here=4
    [8] => /veer/eye?params=1&are=2&right=3&here=4
    [httpVersion] => HTTP/1.1
    [9] => HTTP/1.1
    [responseStatus] => 200
    [10] => 200
    [responseSize] => 146351
    [11] => 146351
)
Array
(
    [0] => test.test.com - 80.82.65.120 - - [18/May/2016:00:30:20 -0400] "GET // HTTP/1.1" 404 198
    [hostname] => test.test.com
    [1] => test.test.com
    [forwardedFor] => -
    [2] => -
    [remoteHostname] => 80.82.65.120
    [3] => 80.82.65.120
    [remoteLogname] => -
    [4] => -
    [remoteUsername] => -
    [5] => -
    [requestDate] => 18/May/2016:00:30:20 -0400
    [6] => 18/May/2016:00:30:20 -0400
    [method] => GET
    [7] => GET
    [uri] => //
    [8] => //
    [httpVersion] => HTTP/1.1
    [9] => HTTP/1.1
    [responseStatus] => 404
    [10] => 404
    [responseSize] => 198
    [11] => 198
)
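
The second array above shows the match beginning at test.test.com, because the hostname group cannot consume the hyphen in qa-test. The following is a minimal sketch of one way to tighten that, not a definitive implementation. It assumes the same LogFormat as in the question, anchors the pattern with ^ and $, allows hyphens in the virtual host, accepts a hostname as well as an IP for %h, and accepts "-" for %b (which Apache logs when no body bytes are sent). The function name parseLogLine and the choice to return the X-Forwarded-For chain as an array are illustrative assumptions, not part of the original answer.

```php
<?php
// Hypothetical variant of the pattern above. Group names mirror the original;
// the anchors, the wider hostname class and the "-" size are assumptions.
$pattern = '/^(?P<hostname>[\w.-]+) '
         . '(?P<forwardedFor>-|[\d.]+(?:, [\d.]+)*) '
         . '(?P<remoteHostname>\S+) '
         . '(?P<remoteLogname>\S+) '
         . '(?P<remoteUsername>\S+) '
         . '\[(?P<requestDate>[^\]]+)\] '
         . '"(?P<method>\w+) (?P<uri>\S+) (?P<httpVersion>[^"]+)" '
         . '(?P<responseStatus>\d{3}) '
         . '(?P<responseSize>\d+|-)$/';

/**
 * Parse one log line into named fields, or return null if it does not fit.
 * (Illustrative helper; the name and return shape are assumptions.)
 */
function parseLogLine(string $line, string $pattern): ?array
{
    if (preg_match($pattern, $line, $m) !== 1) {
        return null;
    }

    // preg_match returns each named group twice (by name and by index);
    // keep only the string keys.
    $row = array_filter($m, 'is_string', ARRAY_FILTER_USE_KEY);

    // Split the comma-delimited X-Forwarded-For chain; "-" means no header.
    $row['forwardedFor'] = $row['forwardedFor'] === '-'
        ? []
        : explode(', ', $row['forwardedFor']);

    // %t has the form 18/May/2016:02:57:25 -0400.
    $date = DateTimeImmutable::createFromFormat('d/M/Y:H:i:s O', $row['requestDate']);
    $row['requestDate'] = $date ?: null;

    // Treat a "-" size as zero bytes.
    $row['responseSize'] = $row['responseSize'] === '-' ? 0 : (int) $row['responseSize'];

    return $row;
}

$row = parseLogLine(
    'qa-test.test.com - 80.82.65.120 - - [18/May/2016:00:30:20 -0400] "GET // HTTP/1.1" 404 198',
    $pattern
);
echo $row['hostname'], PHP_EOL; // prints "qa-test.test.com"
```

With the anchors in place, a line that deviates from the format yields null rather than a partial match, which makes malformed entries easier to spot. IPv6 addresses in X-Forwarded-For are not covered by the [\d.]+ class and would need a wider character class.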
