Extract specific value from JSON column in pandas Dataframe

First, I had JSON data that I converted into a pandas DataFrame:

ad_data = {
   "data":[
      {
         "impressions":"11111",
         "spend":"123",
         "conversions":[
            {
               "action_type":"start_trial_total",
               "value":"6"
            },
            {
               "action_type":"subscribe_mobile_app",
               "value":"3"
            }
         ],
         "outbound_clicks_ctr":[
            {
               "action_type":"outbound_click",
               "value":"1.869306"
            }
         ],
         "date_start":"2020-01-23",
         "date_stop":"2020-01-23"
      },
      {
         "impressions":"22222",
         "spend":"321",
         "conversions":[
            {
               "action_type":"start_trial_total",
               "value":"6"
            }
         ],
         "outbound_clicks_ctr":[
            {
               "action_type":"outbound_click",
               "value":"2.328902"
            }
         ],
         "date_start":"2020-01-24",
         "date_stop":"2020-01-24"
      }
   ]
}

import pandas as pd

df = pd.DataFrame(ad_data['data'])

So I get this DataFrame:

impressions spend conversions outbound_clicks_ctr date_start date_stop
11111 123 [{'action_type': 'start_trial_total', 'value': '6'}, {'action_type': 'subscribe_mobile_app', 'value': '3'}] [{'action_type': 'outbound_click', 'value': '1... 2020-01-23 2020-01-23
22222 321 [{'action_type': 'start_trial_total', 'value': '6'}] [{'action_type': 'outbound_click', 'value': '2... 2020-01-24 2020-01-24
... ... ... ... ... ...

Now I want to extract the value from the conversions column only where subscribe_mobile_app exists, insert 0 otherwise, and get a table like this:

impressions spend conversions outbound_clicks_ctr date_start date_stop
11111 123 3 [{'action_type': 'outbound_click', 'value': '1... 2020-01-23 2020-01-23
22222 321 0 [{'action_type': 'outbound_click', 'value': '2... 2020-01-24 2020-01-24
... ... ... ... ... ...

How can I get this result with pandas?

I also tried extracting the values in a loop before converting the JSON into a DataFrame, collecting them in a list to add as a new column, but that didn't work either:

subscribe = []
for i in ad_data['data']:
    for sub in i['conversions']:
        if sub['action_type'] == 'subscribe_mobile_app':
            subscribe.append(sub['value'])
        else:
            subscribe.append(None)

The result was something like this:

[None,3,None, None...]

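Try this:

import ast

result = []
for cell in df.conversions.values:
    # Parse only if the cell is a string (e.g. the frame was loaded from CSV)
    items = ast.literal_eval(cell) if isinstance(cell, str) else cell
    value = 0  # default when subscribe_mobile_app is absent
    for k in items:
        if k.get('action_type') == 'subscribe_mobile_app':
            value = k['value']
            break
    result.append(value)

df['conversions'] = result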
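
As a rough illustration of when the ast.literal_eval step matters: it is only needed if the cells hold the string form of the lists, for example after a round trip through CSV. The sketch below assumes such a frame; the two sample strings and the helper name parse_cell are made up for the example and are not part of the question's code.

```python
import ast

import pandas as pd

# Hypothetical frame whose conversions cells are strings, as after a CSV round trip
df = pd.DataFrame({
    "conversions": [
        "[{'action_type': 'start_trial_total', 'value': '6'}, "
        "{'action_type': 'subscribe_mobile_app', 'value': '3'}]",
        "[{'action_type': 'start_trial_total', 'value': '6'}]",
    ]
})

def parse_cell(cell):
    """Turn a string cell into a list of dicts; pass real lists through."""
    return ast.literal_eval(cell) if isinstance(cell, str) else cell

parsed = df["conversions"].map(parse_cell)

# For each row, take the first matching value, or 0 when there is none
values = [
    next(
        (d["value"] for d in items if d.get("action_type") == "subscribe_mobile_app"),
        0,
    )
    for items in parsed
]
print(values)  # ['3', 0]
```

If the frame was built directly from the parsed JSON, the cells are already lists and the parsing step is a no-op.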

The conversions column of your DataFrame contains lists of dictionaries. You can write a separate function that takes one of these lists, checks whether any dictionary in it has the action type subscribe_mobile_app, and returns the matching value, or 0 if there is none:

def subscribe_mobile_app_values(lst):
    val = 0  # default when no subscribe_mobile_app entry exists
    for i in lst:
        if i["action_type"] == "subscribe_mobile_app":
            val = i["value"]  # note: still a string, e.g. "3"
            break
    return val

Then just apply this function to the conversions column in your dataframe:

df['conversions'] = df['conversions'].apply(subscribe_mobile_app_values)
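
For reference, a minimal end-to-end sketch of this approach, assuming the conversions cells are real Python lists of dicts rather than their string form. The sample rows are trimmed from the question's data, and the helper name extract_action_value with its action_type and default parameters is illustrative only. The values are cast to int because they are strings in the question's data.

```python
import pandas as pd

# Sample rows trimmed from the question's data
ad_data = {
    "data": [
        {
            "impressions": "11111",
            "spend": "123",
            "conversions": [
                {"action_type": "start_trial_total", "value": "6"},
                {"action_type": "subscribe_mobile_app", "value": "3"},
            ],
        },
        {
            "impressions": "22222",
            "spend": "321",
            "conversions": [
                {"action_type": "start_trial_total", "value": "6"},
            ],
        },
    ]
}

def extract_action_value(actions, action_type, default=0):
    """Return the integer value of the first matching action, else default."""
    # A row with no conversions key at all would hold NaN instead of a list
    if not isinstance(actions, list):
        return default
    for action in actions:
        if action.get("action_type") == action_type:
            return int(action["value"])
    return default

df = pd.DataFrame(ad_data["data"])
# Extra keyword arguments to Series.apply are forwarded to the function
df["conversions"] = df["conversions"].apply(
    extract_action_value, action_type="subscribe_mobile_app"
)
print(df["conversions"].tolist())  # [3, 0]
```

Passing the action type as a parameter means the same helper could be reused for other columns of the same shape, such as outbound_clicks_ctr, although the int cast would need to become float there.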

