
How to import a big JSON file into a Docker Swarm cluster with the ELK stack?

Basically I want to import JSON data into (Logstash -> Elasticsearch ->) Kibana, but I'm completely new to this and stuck on the different methods, which I don't fully understand and which give me errors or no output.

What I've got is a file test.json containing Wikipedia data in this format:

{
    "results": [
        {
            "curr": "Ohio_\"Heartbeat_Bill\"",
            "n": 43,
            "prev": "other-external",
            "type": "external"
        },
        {
            "curr": "Ohio_\"Heartbeat_Bill\"",
            "n": 1569,
            "prev": "other-search",
            "type": "external"
        },
        {
            "curr": "Ohio_\"Heartbeat_Bill\"",
            "n": 11,
            "prev": "other-internal",
            "type": "external"
        },
...

And so on. The file is only 1.3 MB, because I've deleted some of the largest entries.

I tried the curl command:

cat test.json | jq -c '.[] | {"index": {}}, .' | curl -XPOST localhost:9200/_bulk --data-binary @-

and

curl -s -XPOST localhost:9200/_bulk --data-binary @test.json

and

write "{ "index" : { } }" at the beginning of the document

I also tried:

curl -XPUT http://localhost:9200/wiki -d '
{
  "mappings" : {
    "_default_" : {
      "properties" : {
        "curr" : {"type": "string"},
        "n" : {"type": "integer"},
        "prev" : {"type": "string"},
        "type" : {"type": "string"}
      }
    }
  }
}
';

But I always get this error:

{"error":"Content-Type header [application/x-www-form-urlencoded] is not supported","status":406}

Or when I use:

curl localhost:9200/wiki -H "Content-type:application/json" -X POST -d @test.json

I get:

{"error":"Incorrect HTTP method for uri [/wiki] and method [POST], allowed: [GET, HEAD, DELETE, PUT]","status":405}

And when I replace "wiki" with "_bulk", which all the examples seem to have in common, I get:

{"error":{"root_cause":[{"type":"security_exception","reason":"missing authentication token for REST request [/_bulk]","header":{"WWW-Authenticate":"Basic realm=\\"security\\" charset=\\"UTF-8\\""}}],"type":"security_exception","reason":"missing authentication token for REST request [/_bulk]","header":{"WWW-Authenticate":"Basic realm=\\"security\\" charset=\\"UTF-8\\""}},"status":401

I have also copy-pasted, and adjusted as far as I understood it, the conf file in Kibana's Logstash pipeline management, like this:

input {
    file {
        codec => multiline {
            pattern => '^\{'
            negate => true
            what => "previous"
        }
        path => ["/home/user/docker-elastic/examples/pretty.json"]
        start_position => "beginning"
        sincedb_path => "/dev/null"
        exclude => "*.gz"
    }
}

filter {
    mutate {
        replace => [ "message", "%{message}}" ]
        gsub => [ "message", "\n", "" ]
    }
    if [message] =~ /^{.*}$/ {
        json { source => "message" }
    }
}

output {
    elasticsearch {
        protocol => "http"
        codec => json
        host => "localhost"
        index => "wiki_json"
        embedded => true
    }
    stdout { codec => rubydebug }
}

But when I click "create and deploy" nothing happens.
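Judging by the Logstash documentation, the protocol, host and embedded options above were removed from the elasticsearch output plugin back in Logstash 2.x, so a current Logstash would reject this config. A present-day output section would look roughly like this sketch (the user/password values are my assumption, based on the default X-Pack account mentioned further down):

output {
    elasticsearch {
        hosts    => ["localhost:9200"]   # replaces the old host/protocol options
        index    => "wiki_json"
        user     => "elastic"            # assumption: default X-Pack credentials
        password => "changeme"
    }
    stdout { codec => rubydebug }
}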

So I have tried some examples, but as I said, I don't fully understand them and therefore have trouble getting my data into Kibana. I've written Logstash and Elasticsearch because I would love to pass the data through those as well.

Can somebody please explain to me how I can pass this data directly, without manually altering the file? Many answers said that the data cannot be passed in the structure I have, but must be "one line, one input" only. But I cannot alter a whole file with nearly 40,000 entries by hand, and I would rather not write a Python script for it.

Maybe there is a tool or something? Or maybe I'm just too stupid to understand the syntax and am doing something wrong?

Any help is appreciated! Thank you in advance!

As @Ian Kemp answered in the comment section, the problem was that I used POST and not PUT. After that I got an error saying that authentication failed, so I googled it and found the final answer:

curl elastic:changeme@localhost:9200/wiki -H "Content-type: application/json" -X PUT -d @test.json

with the index line in the file. This is how I finally got the data into Elasticsearch. :) THANK YOU very much, Ian Kemp!
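For reference, "the index line" means the { "index" : { } } action lines: Elasticsearch's bulk format expects the file to alternate one action line and one document line per entry, roughly like this:

{ "index" : { } }
{ "curr": "Ohio_\"Heartbeat_Bill\"", "n": 43, "prev": "other-external", "type": "external" }
{ "index" : { } }
{ "curr": "Ohio_\"Heartbeat_Bill\"", "n": 1569, "prev": "other-search", "type": "external" }
...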
