
Python: .json data from splunk

The problem: for my university project I uploaded a JSON file to Splunk, and now I want to use it in Python as a DataFrame object.

Code:

import urllib3
import requests
import json
import pandas as pd

urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)


server = 'localhost'
port = 8089
username = 'testuid'
password = 'testpw'
url='https://'+ server +':' + str(port)
param = {'shortname', 'permissionId'}

search='?search=search source%3D%22events.json%22%20host%3D%22DESKTOP-9QDQ0FT%22%20index%3D%22projektseminar%22%20sourcetype%3D%22_json%22%20%7Chead%2020%20%7Ctable%20shortname%20permissionId'

output_type =  '&output_mode=json'
search_url = url + '/servicesNS/nobody/search/search/jobs/export' + search + output_type

r = requests.get(search_url, auth=(username, password), verify=False)

Works well to this point. Now I want this specific "r" response object as a DataFrame with the two columns "shortname" and "permissionId". I have several problems with this. First of all, the JSON I get from the REST API has the columns "preview", "offset" and "results", whereas I want a DataFrame with the columns "shortname" and "permissionId". The problem is that I can't use things like json.load(r) or r.json() or similar; I always get an "Extra data" error. I'm a beginner with Splunk and Python, so maybe there is a better way to do this. Another idea I haven't tried yet is to use CSV output instead of JSON. It would be nice if you had some suggestions on how to solve this problem.

Thanks

The best way to accomplish this is to use the Splunk SDK for Python.

You can find the SDK here: https://github.com/splunk/splunk-sdk-python

import sys
import os
from time import sleep
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "splunk-sdk-python-1.6.13"))

import splunklib.client as client
import splunklib.results as results

import pandas as pd

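# Note: port, username, and password are specific to my instance; replace
# the placeholders with your own. The default management port is 8089.
MY_PORT = 8089

service = client.connect(host='localhost', port=MY_PORT,
                   username='MY_USER', password='MY_PASS')

search = """search index=_internal sourcetype="splunkd_access" |table *"""

job = service.jobs.create(search)

# Poll until the search job has finished; is_done() refreshes the job state.
while not job.is_done():
    sleep(2)

# job.results() returns at most 100 rows by default; count=0 asks for all.
reader = results.ResultsReader(job.results(count=0))

# The reader yields dicts for result rows and Message objects for
# diagnostics; keep only the rows.
rows = [row for row in reader if isinstance(row, dict)]

df = pd.DataFrame(rows)
print(df)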
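
If you would rather stay with the plain requests call from the question, the "Extra data" error is consistent with the export endpoint sending one JSON object per line rather than a single JSON document, which r.json() cannot parse in one go. Below is a minimal sketch, assuming each line is a standalone object with the row's fields under a "result" key. The function names and sample strings are hypothetical, so check r.text on your own instance before relying on the key name. The second helper covers the CSV idea from the question.

```python
import io
import json

import pandas as pd


def export_json_to_df(text, columns=None):
    """Build a DataFrame from newline-delimited JSON export output."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between objects
        record = json.loads(line)  # each line is parsed as its own document
        # Assumption: the row's fields sit under a "result" key.
        if isinstance(record.get("result"), dict):
            rows.append(record["result"])
    return pd.DataFrame(rows, columns=columns)


def export_csv_to_df(text):
    """Build a DataFrame from export output requested with output_mode=csv."""
    return pd.read_csv(io.StringIO(text))


# Hypothetical samples standing in for r.text from the question's request.
sample_json = (
    '{"preview": false, "offset": 0, '
    '"result": {"shortname": "alice", "permissionId": "7"}}\n'
    '{"preview": false, "offset": 1, '
    '"result": {"shortname": "bob", "permissionId": "9"}}\n'
)
df_json = export_json_to_df(sample_json, columns=["shortname", "permissionId"])

sample_csv = "shortname,permissionId\nalice,7\nbob,9\n"
df_csv = export_csv_to_df(sample_csv)
```

With the question's code, this would be called as export_json_to_df(r.text, columns=['shortname', 'permissionId']). For the CSV route, change output_mode to csv in the URL and pass r.text to export_csv_to_df.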
