Unable to export particular data from a .json file from a website

Question

I'm using the following to parse data from a website:

    import requests
    import pandas as pd

    resp = requests.get("https://thisiscriminal.com/wp-json/criminal/v1/episodes?posts=1000000&page=1").json()
    df = pd.DataFrame(resp['posts'], columns=['episodeNumber','slug','image','excerpt','audioSource'])
    # index=False (the boolean) omits the row index; the string 'false' is truthy, so the index would be written
    df.to_csv("output9.csv", encoding='utf-8', index=False)

    data = pd.read_csv("output9.csv")

As you can see, I've had to pull the entire 'excerpt' column, which pulls all three excerpts instead of just one. How would I go about pulling just, say, the 'short' one? What is the heading called instead of 'column'? Also, the 'title' doesn't seem to be under any sort of header. How would I pull this too?

A quick visual of the .json is here if it helps: https://www.dropbox.com/s/v9l81ber6i4nbgw/11111111.jpg?dl=0

Any help would be greatly appreciated.

Answer 1

One workaround is to normalize the resp['posts'] JSON and not specify the columns, so that every nested field gets its own column. The code below generates that flattened dataframe:

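    import requests
    import pandas as pd

    resp = requests.get("https://thisiscriminal.com/wp-json/criminal/v1/episodes?posts=1000000&page=1").json()
    # print(resp['posts'][0])  # uncomment to inspect the structure of one post
    # pd.json_normalize (pandas 1.0+) flattens nested keys into dotted column names;
    # on older versions, import it with: from pandas.io.json import json_normalize
    df = pd.json_normalize(resp['posts'])
    # index=False (the boolean) omits the row index; the string 'false' is truthy
    df.to_csv("output2_9.csv", encoding='utf-8', index=False)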

Once you have this dataframe, you can select whichever columns you want. It contains every field of the JSON, with these column names: audioSource, content, date, episodeNumber, excerpt.full, excerpt.long, excerpt.short, id, image.full, image.large, image.medium, image.thumb, musicCredits, next, next.slug, next.title, permalink, prev, prev.slug, prev.title, slug, title

The title column is also present in this dataframe.
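To illustrate the flattening without hitting the website, here is a minimal sketch on a small made-up sample. The `posts` list and its values are hypothetical and only mimic the shape of resp['posts'] as described above; it assumes pandas 1.0 or later, where `pd.json_normalize` is available at the top level.

```python
import pandas as pd

# Hypothetical sample mimicking the shape of resp['posts']; the values are made up.
posts = [
    {
        "title": "Episode A",
        "episodeNumber": 1,
        "excerpt": {"short": "s1", "long": "l1", "full": "f1"},
    },
    {
        "title": "Episode B",
        "episodeNumber": 2,
        "excerpt": {"short": "s2", "long": "l2", "full": "f2"},
    },
]

# Nested dict keys become dotted column names such as 'excerpt.short'.
df = pd.json_normalize(posts)

# Keep only the columns of interest.
subset = df[["episodeNumber", "title", "excerpt.short"]]
print(subset)
```

Selecting with a list of column names returns a new dataframe, so `subset.to_csv(...)` would then write only those three columns.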

Answer 2

I took the 'excerpt' series, called apply(pd.Series) on it to expand each entry into separate columns, and kept only the resulting 'short' series. You might have to handle the extra double quotes. Consider the following code:

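    import requests
    import pandas as pd

    resp = requests.get("https://thisiscriminal.com/wp-json/criminal/v1/episodes?posts=1000000&page=1").json()
    df = pd.DataFrame(resp['posts'], columns=['episodeNumber','slug','image','excerpt','audioSource'])
    # Expand each excerpt dict into columns, then keep only the 'short' one.
    # The commented-out replace() is an optional way to normalize the double quotes.
    df['excerpt'] = df['excerpt'].apply(pd.Series)['short']#.replace({'"': '\'','""': '\'','"""': '\'' }, regex=True)
    # index=False (the boolean) omits the row index; the string 'false' is truthy
    df.to_csv("output9.csv", encoding='utf-8', index=False)
    data = pd.read_csv("output9.csv")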
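As a rough sketch of what apply(pd.Series) does here, the example below runs the same idea on a small made-up sample (the `posts` list and the `excerpt_short` column name are hypothetical, not taken from the site). It also shows a per-row key lookup with `map` as a possible alternative, and assumes 'title' is a top-level key, as the flattened column list in Answer 1 suggests.

```python
import pandas as pd

# Hypothetical sample mimicking the shape of resp['posts']; the values are made up.
posts = [
    {"title": "Episode A", "episodeNumber": 1,
     "excerpt": {"short": "s1", "long": "l1", "full": "f1"}},
    {"title": "Episode B", "episodeNumber": 2,
     "excerpt": {"short": "s2", "long": "l2", "full": "f2"}},
]

# 'title' is a top-level key, so it can simply be listed in columns.
df = pd.DataFrame(posts, columns=["episodeNumber", "title", "excerpt"])

# apply(pd.Series) turns each dict into a row with one column per key.
expanded = df["excerpt"].apply(pd.Series)
print(expanded.columns.tolist())

# Alternative sketch: look up the single key per row without building
# the intermediate dataframe.
df["excerpt_short"] = df["excerpt"].map(lambda d: d["short"])

# Drop the original dict column before exporting.
result = df.drop(columns=["excerpt"])
print(result)
```

The `map` variant skips creating columns for 'long' and 'full', which may matter when there are many rows.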

solution1
2 2019-06-17 04:18:55

solution2
1 ACCEPTED 2019-06-17 04:51:38
