簡體   English   中英

Json通過API到Elasticsearch

[英]Json to Elasticsearch via API

我正在嘗試向Elasticsearch添加一個json文件,該文件大約有30.000行,並且格式不正確。 我正在嘗試通過Bulk API上傳它,但找不到有效格式化它的方法。 我正在使用Ubuntu 16.04LTS。

這是json的格式:

{
    "rt": "2018-11-20T12:57:32.292Z",
    "source_info": { "ip": "0.0.60.50" },
    "end": "2018-11-20T12:57:32.284Z",
    "severity": "low",
    "duid": "5b8d0a48ba59941314e8a97f",
    "dhost": "004678",
    "endpoint_type": "computer",
    "endpoint_id": "8e7e2806-eaee-9436-6ab5-078361576290",
    "suser": "Katerina",
    "group": "PERIPHERALS",
    "customer_id": "a263f4c8-942f-d4f4-5938-7c37013c03be",
    "type": "Event::Endpoint::Device::AlertedOnly",
    "id": "83d63d48-f040-2485-49b9-b4ff2ac4fad4",
    "name": "Peripheral allowed: Samsung Galaxy S7 edge"
}

我確實知道Bulk API的格式需要{"index":{"_id":*}}在文件中的每個json對象之前,如下所示:

{"index":{"_id":1}}

{
    "rt": "2018-11-20T12:57:32.292Z",
    "source_info": { "ip": "0.0.60.50" },
    "end": "2018-11-20T12:57:32.284Z",
    "severity": "low",
    "duid": "5b8d0a48ba59941314e8a97f",
    "dhost": "004678",
    "endpoint_type": "computer",
    "endpoint_id": "8e7e2806-eaee-9436-6ab5-078361576290",
    "suser": "Katerina",
    "group": "PERIPHERALS",
    "customer_id": "a263f4c8-942f-d4f4-5938-7c37013c03be",
    "type": "Event::Endpoint::Device::AlertedOnly",
    "id": "83d63d48-f040-2485-49b9-b4ff2ac4fad4",
    "name": "Peripheral allowed: Samsung Galaxy S7 edge"
}

如果我手動插入索引ID,然后使用此表達式curl -s -H“ Content-Type: application/x-ndjson" -XPOST localhost:92100/ivc/default/bulk?pretty --data-binary @results.json它將沒有錯誤地上傳。

我的問題是,如何將索引id {"index":{"_id":*}}到json的每一行,以使其准備上載? 顯然,索引ID必須在每行上添加+1,是否可以通過CLI進行?

抱歉,如果該帖子看起來不正確,我在Stack Overflow中閱讀了數百萬篇帖子,但這是我的第一篇! #Desperate

提前非常感謝您!

您的問題是Elasticsearch希望文檔在ONE行上是有效的json,如下所示:

{"index":{"_id":1}}
{"rt":"2018-11-20T12:57:32.292Z","source_info":{"ip":"0.0.60.50"},"end":"2018-11-20T12:57:32.284Z","severity":"low","duid":"5b8d0a48ba59941314e8a97f","dhost":"004678","endpoint_type":"computer","endpoint_id":"8e7e2806-eaee-9436-6ab5-078361576290","suser":"Katerina","group":"PERIPHERALS","customer_id":"a263f4c8-942f-d4f4-5938-7c37013c03be","type":"Event::Endpoint::Device::AlertedOnly","id":"83d63d48-f040-2485-49b9-b4ff2ac4fad4","name":"Peripheral allowed: Samsung Galaxy S7 edge"}

您必須找到一種轉換輸入文件的方法,以便每行有一個文檔,然后采用Val的解決方案就可以了。

感謝您提供的所有答案,它們確實幫助我朝正確的方向前進。

我制作了一個bash腳本來自動化日志的下載,格式化和上載到Elasticsearch:

#!/bin/bash

echo "Downloading logs from Sophos Central. Please wait."

cd /home/user/ELK/Sophos-Central-SIEM-Integration/log

#This deletes the last batch of results
rm result.json
cd .. 

#This triggers the script to download a new batch of logs from Sophos

./siem.py
cd /home/user/ELK/Sophos-Central-SIEM-Integration/log

#Adds newline at the beginning of the logs file
sed -i '1 i\{"index":{}}' result.json

#Adds indexes
sed -i '3~2s/^/{"index":{}}/' result.json

#Adds json file to elasticsearch 
curl -s -H "Content-Type: application/x-ndjson" -XPOST localhost:9200/ivc/default/_bulk?pretty --data-binary @result.json

這就是我實現這一目標的方式。 可能會有更簡單的選擇,但是這個對我有用。 希望對其他人有用!

再次感謝大家! :d

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM