While making persistent API calls, I am looping over a large list in order to reorganize my data and save it to a file, like so:
import json
from collections import defaultdict

for item in music:
    # initialize data container
    data = defaultdict(list)
    genre = item[0]
    artist = item[1]
    track = item[2]
    # in actual code, api calls happen here, processing genre, artist and track
    data['genre'] = genre
    data['artist'] = artist
    data['track'] = track
    # use 'a' - append mode
    with open('data.json', mode='a') as f:
        f.write(json.dumps([data], indent=4))
NOTE: Since I have a window of one hour to make API calls (after which the token expires), I must save data to disk on the fly, inside the for loop.
The method above appends data to the data.json file, but my dumped lists are not comma-separated, and the file ends up populated like so:
[
    {
        "genre": "Alternative",
        "artist": "Radiohead",
        "album": "Ok computer"
    }
]
[
    {
        "genre": "Eletronic",
        "artist": "Kraftwerk",
        "album": "Computer World"
    }
]
So, how can I dump my data so that the file ends up as a single list of entries separated by commas?
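One approach is to read the JSON file before writing to it, append the new item to the loaded list, and write the whole list back.
Ex:
import json
import os
from collections import defaultdict

for item in music:
    # initialize data container
    data = defaultdict(list)
    genre = item[0]
    artist = item[1]
    track = item[2]
    data['genre'] = genre
    data['artist'] = artist
    data['track'] = track
    # Read the existing JSON list, or start a new one if the file is missing or empty
    if os.path.exists('data.json') and os.path.getsize('data.json') > 0:
        with open('data.json', mode='r') as f:
            fileData = json.load(f)
    else:
        fileData = []
    fileData.append(data)
    # Rewrite the whole file so it stays a single valid JSON list
    with open('data.json', mode='w') as f:
        f.write(json.dumps(fileData, indent=4))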
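The read-then-rewrite approach above reloads the whole file on every iteration, which gets slower as the file grows. A possible alternative, sketched here as an assumption rather than as part of the original answer, is to append one JSON object per line (the JSON Lines layout) and only build the combined list when reading. The helper names append_record and load_records and the .jsonl file name are hypothetical.

```python
import json


def append_record(path, record):
    """Append one record to the file as a single JSON line."""
    with open(path, mode='a', encoding='utf-8') as f:
        f.write(json.dumps(record) + '\n')


def load_records(path):
    """Read every non-empty line back into one Python list of dicts."""
    with open(path, mode='r', encoding='utf-8') as f:
        return [json.loads(line) for line in f if line.strip()]
```

Inside the loop you would call append_record('data.jsonl', data) after each API call. Once the loop is done, json.dump(load_records('data.jsonl'), f, indent=4) on a file opened in 'w' mode should produce a regular comma-separated JSON list.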
Something like this would work: collect every item in one list and dump that list once, after the loop.
import json

music = [['Alternative', 'Radiohead', 'Ok computer'], ['Eletronic', 'Kraftwerk', 'Computer World']]
output = list()
for item in music:
    data = dict()
    genre = item[0]
    artist = item[1]
    track = item[2]
    data['genre'] = genre
    data['artist'] = artist
    data['track'] = track
    output.append(data)
# use 'w' mode: appending a second list on a later run would make the file invalid JSON again
with open('data.json', mode='w') as f:
    f.write(json.dumps(output, indent=4))
My data.json contains:
[
    {
        "genre": "Alternative",
        "track": "Ok computer",
        "artist": "Radiohead"
    },
    {
        "genre": "Eletronic",
        "track": "Computer World",
        "artist": "Kraftwerk"
    }
]
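Since the question requires saving inside the loop, the accumulate-then-dump idea above could be adapted to rewrite the file after every item. The following is a minimal sketch under that assumption; save_snapshot and collect are hypothetical names, and writing to a temporary file before swapping it in is one way to avoid leaving a half-written data.json if the script is interrupted.

```python
import json
import os
import tempfile


def save_snapshot(path, records):
    """Write the full list to a temp file, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    with os.fdopen(fd, mode='w', encoding='utf-8') as f:
        json.dump(records, f, indent=4)
    # Replace the old file with the freshly written one
    os.replace(tmp_path, path)


def collect(music, path):
    """Build the list item by item, saving a snapshot inside the loop."""
    output = []
    for genre, artist, track in music:
        # in actual code, api calls would happen here
        output.append({'genre': genre, 'artist': artist, 'track': track})
        save_snapshot(path, output)
    return output
```

Calling collect(music, 'data.json') leaves a single valid JSON list on disk after each iteration, at the cost of rewriting the whole file every time.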
For large datasets, pandas (for serializing) and pickle (for saving) work together like a charm.
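import pandas as pd
from collections import defaultdict

df = pd.DataFrame()
for item in music:
    # initialize data container
    data = defaultdict(list)
    genre = item[0]
    artist = item[1]
    track = item[2]
    # in actual code, api calls happen here, processing genre, artist and track
    data['genre'] = genre
    data['artist'] = artist
    data['track'] = track
    # DataFrame.append was removed in pandas 2.0; pd.concat adds the row instead
    df = pd.concat([df, pd.DataFrame([dict(data)])], ignore_index=True)
    # save inside the loop so progress is already on disk if the token expires
    df.to_pickle('data.pkl')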