Python: JSON data from Splunk

Question

For a university project I uploaded a JSON file to Splunk, and now I want to load that data into Python as a pandas DataFrame.

Code:

import urllib3
import requests
import json
import pandas as pd

urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)


server = 'localhost'
port = 8089
username = 'testuid'
password = 'testpw'
url='https://'+ server +':' + str(port)
param = {'shortname', 'permissionId'}

search='?search=search source%3D%22events.json%22%20host%3D%22DESKTOP-9QDQ0FT%22%20index%3D%22projektseminar%22%20sourcetype%3D%22_json%22%20%7Chead%2020%20%7Ctable%20shortname%20permissionId'

output_type =  '&output_mode=json'
search_url = url + '/servicesNS/nobody/search/search/jobs/export' + search + output_type

r = requests.get(search_url, auth=(username, password), verify=False)

This works up to this point. Now I want to turn the response object "r" into a DataFrame with the two columns "shortname" and "permissionId". I have several problems with this. First, the JSON I get from the REST API has the columns "preview", "offset" and "results" rather than the two I want. Second, I can't use json.load(r), r.json() or similar; they always fail with an "Extra data" error. I'm a beginner with both Splunk and Python, so maybe there is a better way to do this. Another idea I haven't tried yet is to request CSV output instead of JSON. Any suggestions on how to solve this would be appreciated.

Thanks

Answer 1

The best way to accomplish this is to use the Splunk SDK for Python.

You can find the SDK here: https://github.com/splunk/splunk-sdk-python

import sys
import os
from time import sleep
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "splunk-sdk-python-1.6.13"))

import splunklib.client as client
import splunklib.results as results

import pandas as pd

# Note my port, username, and password are specific to my instance. The default port is 8089

service = client.connect(host='localhost', port=MY_PORT,
                   username='MY_USER', password='MY_PASS')

search = """search index=_internal sourcetype="splunkd_access" |table *"""

job = service.jobs.create(search)

# Poll until the search job has finished. is_done() refreshes the job
# state on every call, so sleeping between calls avoids a busy-wait.
while not job.is_done():
    sleep(2)


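# By default only the first 100 rows are returned; count=0 asks for all available rows.
reader = results.ResultsReader(job.results(count=0))

# The reader yields dicts for result rows and Message objects for diagnostics; keep only the rows.
rows = [row for row in reader if isinstance(row, dict)]

df = pd.DataFrame(rows)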
print(df)
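
If you would rather keep the plain requests call from the question, the "Extra data" error most likely comes from the export endpoint streaming one JSON object per line instead of a single JSON document, so the whole body cannot be parsed in one go. The following is a minimal sketch under that assumption. It also assumes that each object carries the event fields under a "result" key; the helper name export_json_to_dataframe and the sample text are made up for illustration and should be checked against what your instance actually returns.

```python
import json

import pandas as pd


def export_json_to_dataframe(text, columns=None):
    """Build a DataFrame from line-delimited JSON (one object per line)."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between objects
        record = json.loads(line)  # parse each line on its own
        # Assumption: the event fields sit under the "result" key.
        if "result" in record:
            rows.append(record["result"])
    return pd.DataFrame(rows, columns=columns)


# Hypothetical sample of a line-delimited export response.
sample = (
    '{"preview": false, "offset": 0, "result": {"shortname": "a", "permissionId": "1"}}\n'
    '{"preview": false, "offset": 1, "result": {"shortname": "b", "permissionId": "2"}}\n'
)

df = export_json_to_dataframe(sample, columns=["shortname", "permissionId"])
print(df)
```

With the code from the question this would be called as export_json_to_dataframe(r.text, columns=["shortname", "permissionId"]). Parsing line by line keeps the "preview" and "offset" bookkeeping out of the DataFrame and leaves only the fields selected in the search.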
