Logstash Trouble parsing Json

TL;DR: Logstash rejects my JSON logs, complaining that the JSON is not valid because of some escape characters. Unfortunately, I can't figure out what the issue is.

Full version:

My Logstash (+ Logstash-Forwarder) setup picks up Apache logs written in a custom log format, which is configured like this:

LogFormat "{ \
    \"@timestamp\": \"%{%Y-%m-%dT%H:%M:%S%z}t\", \
    \"@version\": \"1\", \
    \"vhost\":\"%V\", \
    \"tags\":[\"apache-json\"], \
    \"message\": \"%h %l %u %t \\\"%r\\\" %>s %b\", \
    \"clientip\": \"%a\", \
    \"duration\": %D, \
    \"status\": %>s, \
    \"request\": \"%U%q\", \
    \"urlpath\": \"%U\", \
    \"urlquery\": \"%q\", \
    \"bytes\": %B, \
    \"method\": \"%m\", \
    \"referer\": \"%{Referer}i\", \
    \"useragent\": \"%{User-agent}i\" \
}" ls_apache_json

The related Logstash input/filter configuration is quite straightforward:

input {
    lumberjack {
        port => 5000
        type => "logs"
    }
}
filter {
    if [type] =~ /-json$/ {
        json {
            source => "message"
        }
    }
}

And still it seems that some logs can't be parsed as JSON - I get a lot of these errors:

{
   :timestamp=>"2015-08-27T12:47:05.165000+0200",
   :message=>"Trouble parsing json",
   :source=>"message",
   :raw=>"{ \t\"@timestamp\": \"2015-08-27T12:47:02+0200\", \t\"@version\": \"1\", \t\"vhost\":\"www.example.org\", \t\"tags\":[\"apache-json\"], \"clientip\": \"127.0.0.1\", \t\"duration\": 1280, \t\"status\": 200, \t\"request\": \"/uploads/_processed_/csm_D\\xc3\\xa4mpfungswanne_3_01_0c75c517e4.jpg\", \t\"urlpath\": \"/uploads/_processed_/csm_D\\xc3\\xa4mpfungswanne_3_01_0c75c517e4.jpg\", \t\"urlquery\": \"\", \t\"bytes\": 2913, \t\"method\": \"GET\", \t\"referer\": \"http://www.example.org/file.html\", \t\"useragent\": \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36\"    }",
   :exception=>#<LogStash::Json::ParserError: Unrecognized character escape 'x' (code 120) at [Source: [B@29a15d70; line: 1, column: 231]>
   :level=>:warn
}

I double checked the raw messages with a simple Ruby snippet and this is able to parse the JSON:

# irb
$ require 'json'
$ s= "{ \t\"@timestamp\": \"2015-08-27T12:47:02+0200\", \t\"@version\": \"1\", \t\"vhost\":\"www.example.org\", \t\"tags\":[\"apache-json\"], \"clientip\": \"127.0.0.1\", \t\"duration\": 1280, \t\"status\": 200, \t\"request\": \"/uploads/_processed_/csm_D\\xc3\\xa4mpfungswanne_3_01_0c75c517e4.jpg\", \t\"urlpath\": \"/uploads/_processed_/csm_D\\xc3\\xa4mpfungswanne_3_01_0c75c517e4.jpg\", \t\"urlquery\": \"\", \t\"bytes\": 2913, \t\"method\": \"GET\", \t\"referer\": \"http://www.example.org/file.html\", \t\"useragent\": \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36\"    }"
$ JSON.parse(s)
=> {"@timestamp"=>"2015-08-27T12:47:02+0200", "@version"=>"1", "vhost"=>"www.example.org", "tags"=>["apache-json"], "clientip"=>"127.0.0.1", "duration"=>1280, "status"=>200, "request"=>"/uploads/_processed_/csm_Dxc3xa4mpfungswanne_3_01_0c75c517e4.jpg", "urlpath"=>"/uploads/_processed_/csm_Dxc3xa4mpfungswanne_3_01_0c75c517e4.jpg", "urlquery"=>"", "bytes"=>2913, "method"=>"GET", "referer"=>"http://www.example.org/file.html", "useragent"=>"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36"}
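Note that the parsed output above contains `Dxc3xa4`: the Ruby parser appears to have dropped the backslashes silently rather than decoding the bytes. JSON itself only defines the escapes `\"`, `\\`, `\/`, `\b`, `\f`, `\n`, `\r`, `\t` and `\uXXXX`, so `\x` is not one of them, which matches the "Unrecognized character escape 'x'" error. A minimal sketch for listing such escapes in a raw line follows; the helper name `invalid_json_escapes` is hypothetical and is not part of Logstash or the json gem.

```ruby
# Hypothetical helper: list backslash escapes that JSON does not define.
# JSON permits only \" \\ \/ \b \f \n \r \t and \uXXXX inside strings.
def invalid_json_escapes(text)
  # Consume escapes pairwise so that "\\" is treated as one unit.
  escapes = text.scan(/\\(u\h{4}|.)/).flatten
  invalid = escapes.reject { |esc| esc =~ /\A(?:["\\\/bfnrt]|u\h{4})\z/ }
  invalid.map { |esc| "\\#{esc}" }
end

# Single quotes keep the backslashes literal, as they appear in the log file.
sample = '{"request": "/csm_D\\xc3\\xa4mpf.jpg", "note": "a \\"quoted\\" word"}'
p invalid_json_escapes(sample)
```

An empty result would suggest the line uses only escapes a strict JSON parser accepts; any `\x` entry points at the byte escapes written by the web server.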

I'm running Logstash 1.5.2. I assume this is somehow codec-related, but my attempts to set the codec parameter for the json parser didn't stop the problem. Is there anything Apache needs in addition to make sure it uses the right codec? I can't seem to find any configuration option :(

Any help is welcome.

# Replace the \x22 and \x5C escape sequences in request_body with literal " and \ characters
ruby {
 code => 'str=event["request_body"];   str=str.gsub("\\x22","\"").gsub("\\x5C", "\\"); event["request_body"]=str;'
}
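The snippet above handles two specific escapes in one field. A more general sketch, assuming the log writer emits `\xHH` for arbitrary bytes (as in the `\xc3\xa4` example from the question), rewrites every such escape before the line is parsed as JSON. The helper name `fix_hex_escapes` is hypothetical. Bytes of 0x80 and above are decoded to raw bytes and reinterpreted as UTF-8, while lower values become `\u00HH` so that a decoded quote or backslash cannot break the surrounding JSON.

```ruby
require 'json'

# Hypothetical helper: turn \xHH escapes into something a strict JSON parser accepts.
def fix_hex_escapes(raw)
  # Work on a binary copy and consume escapes pairwise so "\\x41" is left alone.
  fixed = raw.b.gsub(/\\(x\h{2}|.)/n) do
    esc = Regexp.last_match(1)
    if esc.length == 3 && esc.start_with?('x')
      code = esc[1, 2].hex
      # High bytes become raw bytes; low values stay escaped as \u00HH.
      code >= 0x80 ? code.chr : format('\\u%04x', code)
    else
      "\\#{esc}" # any other escape is kept unchanged
    end
  end
  # Reinterpret as UTF-8 and replace byte sequences that are not valid UTF-8.
  fixed.force_encoding(Encoding::UTF_8).scrub('?')
end

line = '{"request": "/uploads/csm_D\\xc3\\xa4mpfungswanne.jpg", "status": 200}'
event = JSON.parse(fix_hex_escapes(line))
puts event['request']
```

In a Logstash 1.5-style setup this logic could presumably run in a ruby filter on the message field before the json filter; that wiring is an assumption and is not shown in the original answer.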
