I have a large JSON file that contains multiple URLs. The format is like this:
"url": "https://api/v1/test/fhfh"
I want to create a CSV file from it that includes only the URLs that start with https://api
What is the most efficient way to do this?
You can try it this way.
json.json
Suppose the large JSON file contains multiple URLs, like this:
[
{"url": "https://api/v1/test/fhfh1"},
{"url": "https://api/v1/test/fhfh2"},
{"url": "api/v1/test/fhfh"}
]
code
import json
import pandas as pd

with open('json.json', 'r') as f:  # read the JSON file
    data = json.load(f)

# keep only the entries whose url starts with https://api
case_list = [item for item in data if item["url"].startswith("https://api")]

with open('case_list.json', 'w') as outfile:  # write the filtered JSON
    json.dump(case_list, outfile, indent=2, ensure_ascii=False)

df = pd.read_json("case_list.json")
df.to_csv("case_list.csv", index=False)  # convert the filtered JSON to a CSV file
output
print(df)
                         url
0  https://api/v1/test/fhfh1
1  https://api/v1/test/fhfh2
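If pandas is not needed for anything else, the same filtering can probably be done with the standard library alone, skipping the intermediate case_list.json file. The following is a minimal sketch, assuming the file is a JSON array of objects that each have a "url" key, as in the sample above. The function names filter_urls and json_to_csv are made up for illustration and are not part of any library.

```python
import csv
import json


def filter_urls(records, prefix="https://api"):
    """Return the "url" values that start with the given prefix."""
    return [
        rec["url"]
        for rec in records
        if isinstance(rec.get("url"), str) and rec["url"].startswith(prefix)
    ]


def json_to_csv(json_path, csv_path, prefix="https://api"):
    """Read a JSON array of {"url": ...} objects and write matching URLs to a CSV file."""
    with open(json_path, "r", encoding="utf-8") as f:
        records = json.load(f)  # assumption: the whole file fits in memory
    urls = filter_urls(records, prefix)
    with open(csv_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["url"])             # header row
        writer.writerows([u] for u in urls)  # one URL per row
    return urls
```

Calling json_to_csv("json.json", "case_list.csv") would write a single "url" column. Using startswith rather than a substring check means a URL that merely contains https://api somewhere in the middle is not matched.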