
How to import a big JSON-file to a Docker-swarm cluster with ELK stack?

Basically I want to import JSON data into (Logstash -> Elasticsearch ->) Kibana, but I'm completely new to this and stuck on the different methods, which I do not fully understand; I get errors or no output.

What I've got is a file test.json containing Wikipedia data in this format:

{
"results": [
    {
        "curr": "Ohio_\"Heartbeat_Bill\"",
        "n": 43,
        "prev": "other-external",
        "type": "external"
    },
    {
        "curr": "Ohio_\"Heartbeat_Bill\"",
        "n": 1569,
        "prev": "other-search",
        "type": "external"
    },
    {
        "curr": "Ohio_\"Heartbeat_Bill\"",
        "n": 11,
        "prev": "other-internal",
        "type": "external"
    },
...

And so on. The file is 1.3 MB, because I've deleted some of the largest entries.

I tried the curl command:

cat test.json | jq -c '.[] | {"index": {}}, .' | curl -XPOST localhost:9200/_bulk --data-binary @-

and

curl -s -XPOST localhost:9200/_bulk --data-binary @test.json

and

writing { "index" : { } } at the beginning of the document
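For context on these attempts: the _bulk endpoint expects newline-delimited JSON, where every document is preceded by its own action line and the whole payload ends with a newline. For the data above, the payload would need to look like this:

```json
{ "index" : { } }
{ "curr": "Ohio_\"Heartbeat_Bill\"", "n": 43, "prev": "other-external", "type": "external" }
{ "index" : { } }
{ "curr": "Ohio_\"Heartbeat_Bill\"", "n": 1569, "prev": "other-search", "type": "external" }
```

A single pretty-printed JSON object with a `results` array does not match this shape, which is why the plain `--data-binary @test.json` attempts fail.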

I also tried:

curl -XPUT http://localhost:9200/wiki -d '
{
  "mappings" : {
    "_default_" : {
      "properties" : {
        "curr" : {"type": "string"},
        "n" : {"type": "integer"},
        "prev" : {"type": "string"},
        "type" : {"type": "string"}
      }
    }
  }
}
';

But I always get this error:

{"error":"Content-Type header [application/x-www-form-urlencoded] is not supported","status":406}

Or when I use:

curl localhost:9200/wiki -H "Content-Type: application/json" -X POST -d @test.json

I get:

{"error":"Incorrect HTTP method for uri [/wiki] and method [POST], allowed: [GET, HEAD, DELETE, PUT]","status":405}

And when I replace "wiki" with "_bulk", which all the examples seem to have in common, I get:

{"error":{"root_cause":[{"type":"security_exception","reason":"missing authentication token for REST request [/_bulk]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}}],"type":"security_exception","reason":"missing authentication token for REST request [/_bulk]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}},"status":401}

I have also copy-pasted, and adjusted as far as I understood it, the conf file in the Kibana Logstash pipeline, like this:

input 
{
    file 
    {
        codec => multiline
        {
            pattern=> '^\{'
            negate=> true
            what=> previous
        }
        path => ["/home/user/docker-elastic/examples/pretty.json"]
        start_position => "beginning"
        sincedb_path => "/dev/null"
        exclude => "*.gz"
    }
}

filter 
{
    mutate
    {
        replace => [ "message", "%{message}}" ]
        gsub => [ 'message','\n','']
    }
    if [message] =~ /^{.*}$/ 
    {
        json { source => message }
    }
}

output
{ 
  elasticsearch {
        protocol => "http"
        codec => json
        host => "localhost"
        index => "wiki_json"
        embedded => true
    }

    stdout { codec => rubydebug }
}
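As an editorial aside: several options in that output block (`protocol`, `embedded`, and a bare `host`) were removed from the elasticsearch output plugin around Logstash 2.x, which would make the pipeline fail to deploy silently. A minimal sketch of an equivalent output section for a current Logstash (reusing the host and index name from the question) would be:

```
output {
    elasticsearch {
        hosts => ["localhost:9200"]
        index => "wiki_json"
    }
    stdout { codec => rubydebug }
}
```

If X-Pack security is enabled (as the 401 above suggests), `user` and `password` settings would also be needed in the elasticsearch block.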

But when I click "create and deploy" nothing happens.

So I have tried some examples, but like I said, I don't fully understand them and therefore have trouble getting my data into Kibana. I've written "Logstash" and "ElasticSearch" above because I would love to pass the data through those, too.

Can somebody please explain to me how I can pass this data directly, without manually altering the file? Many answers said that the data cannot be passed in the structure I have, but must be "one line, one input" only. But I cannot alter a whole file with nearly 40000 entries by hand, and I would prefer not to have to write a Python script for it.

Maybe there is a tool or something? Or maybe I'm just too stupid to understand the syntax and am doing something wrong?
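For readers who do not mind a short script after all, a minimal sketch that rewrites the file into the action/document pairs the _bulk API expects. It assumes the `{"results": [...]}` structure shown above; the index name `wiki` and the file names are placeholders:

```python
import json

def to_bulk(src_path, dest_path, index_name="wiki"):
    """Rewrite {"results": [...]} as Elasticsearch bulk NDJSON:
    one action line, then the document itself, per entry."""
    with open(src_path) as src:
        results = json.load(src)["results"]
    with open(dest_path, "w") as dest:
        for doc in results:
            # Action line telling Elasticsearch to index into index_name
            dest.write(json.dumps({"index": {"_index": index_name}}) + "\n")
            # The document itself; _bulk also requires a trailing newline
            dest.write(json.dumps(doc) + "\n")
    return len(results)
```

The resulting file could then be sent in one request, e.g. `curl -H "Content-Type: application/x-ndjson" -XPOST localhost:9200/_bulk --data-binary @bulk.json` (adding `-u elastic:changeme` if security is enabled).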

Any help is appreciated! Thank you in advance!

Like @Ian Kemp answered in the comment section, the problem was that I used POST and not PUT. After that I got an error saying that authentication failed, so I googled it and found the final answer:

curl elastic:changeme@localhost:9200/wiki -H "Content-Type: application/json" -X PUT -d @test.json

with the index line in the file. This is the structure with which I finally got the data into Elasticsearch :) THANK YOU very much, Ian Kemp!


 