Parsing XML files to LogStash

I have the following logstash conf file:

input {  
file 
{
    path => "C:\Dashboard\Elmah\*.xml"
    start_position => "beginning"
    type => "error"
    codec => multiline 
    {
        pattern => "^<\?error .*\>"
        negate => true
        what => "previous"
    }
    sincedb_path => "C:\Dashboard\Elmah"
  }
}

filter 
{
    xml 
    {
        source => "error"
        xpath => 
        [
            "/error/@errorId", "ErrorId",
            "/error/@type", "Type",
            "/error/@message", "Message",
            "/error/@time", "Time",
            "/error/@user", "User"
        ]
        store_xml => true
    }
}

output 
{
    elasticsearch 
    { 
        action => "index"
        host => "localhost"
        index => "stock"
        workers => 1
    }
    stdout 
    {
        codec => rubydebug
    }
}

When I run bin/logstash -f agent.conf I do not get an error but no data gets inserted into Elasticsearch. An example of the file is: https://www.dropbox.com/s/6oni2zhorsdtz6p/error-2015-06-26203423Z-3026bd43-07d6-44d6-a6cf-6d27b28a607e.xml?dl=0

How do I get Logstash to read in a collection of external xml files?

LogStash Debug Output:

Please see here: https://www.dropbox.com/s/g7g1154uvf9fr1f/outputlog2.txt?dl=0

I'm not sure you can use the file input here. I've only seen it used to monitor files for changes, not to monitor for new files. Unless your XML files are updated, I don't think it will do anything. Remember that Logstash typically watches for new log lines appended to a file.

Most people write tools like the following to process whole files in batch:

https://github.com/elastic/elasticsearch-river-wikipedia

https://github.com/andrewvc/wikiparse

https://github.com/elastic/stream2es

Those tools, especially the last one, seem much closer to your use case.

I have managed to process files containing one XML document per line using the following Logstash configuration. Hope this helps!

input {
    file {
        path => "/tmp/logstash/test.log"
        start_position => "beginning"
        sincedb_path => "/dev/null"
    }
}

filter {
    xml {
        source => "message"
        force_array => false
        xpath => [
            "/Event/@timestamp", "time",
            "/Event/user[1]/id[1]/text()", "user",
            "/Event/user[1]/ip[1]/text()[1]", "ip",
            "/Event/@eventType", "eventType",
            "/Event/transactionDuration/text()", "trxDuration",
        ]
        store_xml => true
    }
}

output {
    stdout {
        codec => line {
            format => "%{[time]} %{[user]} %{[eventType]} %{[trxDuration]}"
        }
    }
}
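For the multi-line Elmah files in the question, the same idea might be adapted roughly as follows. This is only a sketch under several assumptions, not a verified fix: that each file's root element begins a line with `<error` (the example file is only linked, not shown), that the file input wants forward slashes in its glob on Windows, that `sincedb_path` has to name a file rather than a directory (the `.sincedb` file name here is hypothetical), and that the installed multiline codec is recent enough to support `auto_flush_interval`. The `output` section would stay as in the question.

```
input {
    file {
        # Assumption: forward slashes, because path is treated as a glob
        path => "C:/Dashboard/Elmah/*.xml"
        start_position => "beginning"
        # Assumption: a file path, not a directory; the file name is hypothetical
        sincedb_path => "C:/Dashboard/Elmah/.sincedb"
        type => "error"
        codec => multiline {
            # Assumption: the root element starts a line with "<error"
            pattern => "^<error"
            negate => true
            what => "previous"
            # Emit the buffered document after 2 seconds without new lines;
            # otherwise the last (or only) document in a file is never flushed
            auto_flush_interval => 2
        }
    }
}

filter {
    xml {
        # The file input puts the joined lines into the "message" field
        source => "message"
        # Only the xpath results are kept, so no target is needed
        store_xml => false
        xpath => [
            "/error/@errorId", "ErrorId",
            "/error/@type", "Type",
            "/error/@message", "Message",
            "/error/@time", "Time",
            "/error/@user", "User"
        ]
    }
}
```

The main differences from the question's config are the `source` field (`message` instead of `error`) and the multiline pattern, which no longer requires a literal `<?` before `error`. If the files begin with an XML declaration or a byte order mark, the pattern would need adjusting.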
