Python: .json data from Splunk

The problem: for my university project I uploaded a JSON file to Splunk, and now I want to use it in Python as a DataFrame object.

Code:

import urllib3
import requests
import json
import pandas as pd

urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)


server = 'localhost'
port = 8089
username = 'testuid'
password = 'testpw'
url='https://'+ server +':' + str(port)
param = {'shortname', 'permissionId'}

search='?search=search source%3D%22events.json%22%20host%3D%22DESKTOP-9QDQ0FT%22%20index%3D%22projektseminar%22%20sourcetype%3D%22_json%22%20%7Chead%2020%20%7Ctable%20shortname%20permissionId'

output_type =  '&output_mode=json'
search_url = url + '/servicesNS/nobody/search/search/jobs/export' + search + output_type

r = requests.get(search_url, auth=(username, password), verify=False)

This works well up to this point. Now I want this specific response object "r" as a DataFrame with the two columns "shortname" and "permissionId". I have several problems with this. First, the JSON I get from the REST API has the columns "preview", "offset" and "results", but I want a DataFrame with the columns "shortname" and "permissionId". The problem is that I can't use things like json.load(r) or r.json() or similar; I always get an "Extra data" error. I'm a beginner with Splunk and Python, so maybe there is a better way to do this. Another idea I haven't tried yet is to use CSV output instead of JSON. It would be nice if you had some suggestions on how to solve this problem.

thx

The best way to accomplish this is to use the Splunk SDK for Python.

You can find the SDK here: https://github.com/splunk/splunk-sdk-python

import sys
import os
from time import sleep
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "splunk-sdk-python-1.6.13"))

import splunklib.client as client
import splunklib.results as results

import pandas as pd

# Note my port, username, and password are specific to my instance. The default port is 8089

service = client.connect(host='localhost', port=MY_PORT,
                   username='MY_USER', password='MY_PASS')

search = """search index=_internal sourcetype="splunkd_access" |table *"""

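job = service.jobs.create(search)

# Poll until the search job has finished; is_done() refreshes the job state on each call.
while not job.is_done():
    sleep(2)

# count=0 requests all rows; by default only the first 100 are returned.
reader = results.ResultsReader(job.results(count=0))

# The reader yields dicts for result rows and Message objects for diagnostics; keep only the rows.
rows = [dict(row) for row in reader if isinstance(row, dict)]

df = pd.DataFrame(rows)
print(df)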
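
If you would rather keep the plain requests call from the question, the "Extra data" error most likely comes from the export endpoint streaming several JSON objects one after another (one per line) instead of a single JSON document, which json.loads and r.json() reject. The following is only a minimal sketch under that assumption. The helper name parse_export_rows and the sample payload are made up for illustration, and the "result"/"results" keys are an assumption based on the columns described in the question, so check them against your actual response.

```python
import json


def parse_export_rows(text):
    """Collect result rows from a body that holds one JSON object per line."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between objects
        obj = json.loads(line)  # each line is parsed as its own JSON document
        if not isinstance(obj, dict):
            continue  # ignore anything that is not a JSON object
        if isinstance(obj.get("result"), dict):
            rows.append(obj["result"])   # assumed shape: one row per object
        elif isinstance(obj.get("results"), list):
            rows.extend(obj["results"])  # assumed shape: a batch of rows per object
    return rows


# Hypothetical sample body, standing in for r.text from the question.
SAMPLE = (
    '{"preview": false, "offset": 0, "result": {"shortname": "alice", "permissionId": "1"}}\n'
    '{"preview": false, "offset": 1, "result": {"shortname": "bob", "permissionId": "2"}}\n'
)

if __name__ == "__main__":
    import pandas as pd

    rows = parse_export_rows(SAMPLE)  # with the question's code: parse_export_rows(r.text)
    df = pd.DataFrame(rows, columns=["shortname", "permissionId"])
    print(df)
```

Passing columns= to pd.DataFrame keeps only the two fields you asked for, even if the rows carry extra keys.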
