
Extract values from Dictionary inside a pandas dataframe (Python)

I am trying to extract values from a dictionary stored in a DataFrame column, but I have not been able to. None of the existing solutions I found match my requirement, so I am asking for help.

    instrument_token  last_price      change                                                                                          depth
0           17600770      180.75   20.500000  {'buy': [{'quantity': 1, 'price': 1, 'orders': 1},{'quantity': 0, 'price': 0.0, 'orders': 0}], 'sell': [{'quantity': 1, 'price': 1, 'orders': 1},{'quantity': 0, 'price': 0.0, 'orders': 0}]}
1           12615426        0.05  -50.000000  {'buy': [{'quantity': 2, 'price': 2, 'orders': 2},{'quantity': 0, 'price': 0.0, 'orders': 0}], 'sell': [{'quantity': 2, 'price': 2, 'orders': 2},{'quantity': 0, 'price': 0.0, 'orders': 0}]}
2           17543682        0.35  -89.062500  {'buy': [{'quantity': 3, 'price': 3, 'orders': 3},{'quantity': 0, 'price': 0.0, 'orders': 0}], 'sell': [{'quantity': 3, 'price': 3, 'orders': 3},{'quantity': 0, 'price': 0.0, 'orders': 0}]}
3           17565954        6.75  -10.000000  {'buy': [{'quantity': 4, 'price': 4, 'orders': 4},{'quantity': 0, 'price': 0.0, 'orders': 0}], 'sell': [{'quantity': 4, 'price': 4, 'orders': 4},{'quantity': 0, 'price': 0.0, 'orders': 0}]}
4           26077954        3.95  -14.130435  {'buy': [{'quantity': 5, 'price': 5, 'orders': 5},{'quantity': 0, 'price': 0.0, 'orders': 0}], 'sell': [{'quantity': 5, 'price': 5, 'orders': 5},{'quantity': 0, 'price': 0.0, 'orders': 0}]}
5           17599490      141.75   -2.241379  {'buy': [{'quantity': 6, 'price': 6, 'orders': 6},{'quantity': 0, 'price': 0.0, 'orders': 0}], 'sell': [{'quantity': 6, 'price': 6, 'orders': 6},{'quantity': 0, 'price': 0.0, 'orders': 0}]}
6           17566978       17.65   -1.671309  {'buy': [{'quantity': 7, 'price': 7, 'orders': 7},{'quantity': 0, 'price': 0.0, 'orders': 0}], 'sell': [{'quantity': 7, 'price': 7, 'orders': 7},{'quantity': 0, 'price': 0.0, 'orders': 0}]}
7          26075906       24.70  -16.554054  {'buy': [{'quantity': 8, 'price': 8, 'orders': 8},{'quantity': 0, 'price': 0.0, 'orders': 0}], 'sell': [{'quantity': 8, 'price': 8, 'orders': 8},{'quantity': 0, 'price': 0.0, 'orders': 0}]}

I would like to convert it to the following:

    instrument_token  last_price      change    buy_price    sell_price
0           17600770      180.75   20.500000       1              1
1           12615426        0.05  -50.000000       2              2
2           17543682        0.35  -89.062500       3              3
3           17565954        6.75  -10.000000       4              4
4           26077954        3.95  -14.130435       5              5  
5           17599490      141.75   -2.241379       6              6
6           17566978       17.65   -1.671309       7              7
...

I can access the individual elements with a for loop, but I am unable to convert the dictionary into the columns shown in the desired DataFrame above.

If you want only the price from the first element of each list, rather than a sum, do:

df["buy_price"]=df["depth"].str["buy"].str[0].str["price"]
df["sell_price"]=df["depth"].str["sell"].str[0].str["price"]
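To try the chained `.str` lookups end to end, here is a minimal sketch on two rows shaped like the data in the question. The `.str[...]` accessor is used only as an element-wise "get item" on the dicts and lists in each cell. The sample values and the final `drop` are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Two sample rows modelled on the question's data (values are illustrative).
df = pd.DataFrame({
    "instrument_token": [17600770, 12615426],
    "last_price": [180.75, 0.05],
    "depth": [
        {"buy": [{"quantity": 1, "price": 1, "orders": 1},
                 {"quantity": 0, "price": 0.0, "orders": 0}],
         "sell": [{"quantity": 1, "price": 1, "orders": 1},
                  {"quantity": 0, "price": 0.0, "orders": 0}]},
        {"buy": [{"quantity": 2, "price": 2, "orders": 2},
                 {"quantity": 0, "price": 0.0, "orders": 0}],
         "sell": [{"quantity": 2, "price": 2, "orders": 2},
                  {"quantity": 0, "price": 0.0, "orders": 0}]},
    ],
})

# dict lookup by side -> first list element -> dict lookup by "price"
for side in ("buy", "sell"):
    df[f"{side}_price"] = df["depth"].str[side].str[0].str["price"]

# Drop the nested column once the values have been pulled out.
df = df.drop(columns="depth")
print(df)
```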

In case you wish to get a sum of all nested elements:

df["buy_price"]=df["depth"].str["buy"].apply(lambda x: sum(el["price"] for el in x))
df["sell_price"]=df["depth"].str["sell"].apply(lambda x: sum(el["price"] for el in x))
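In the question's sample every second level has a price of 0.0, so the two approaches give the same numbers there. A small sketch with a made-up non-zero second level (an assumption for illustration only) shows where they differ:

```python
import pandas as pd

# One row whose second buy level has a non-zero price (hypothetical data).
depth = pd.Series([
    {"buy": [{"price": 1.0}, {"price": 0.5}],
     "sell": [{"price": 2.0}, {"price": 0.0}]},
])

# Price of the first level only.
first_buy = depth.str["buy"].str[0].str["price"]

# Sum of the prices across all levels.
total_buy = depth.str["buy"].apply(
    lambda levels: sum(level["price"] for level in levels)
)

print(first_buy.tolist(), total_buy.tolist())
```

Which one you need depends on whether the desired column is the top-of-book price or an aggregate over the whole list.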

I use ast here to turn the string into a Python data structure. If the column already holds actual dictionaries, as in your case, you can drop the ast.literal_eval part of the script.

Extract the values from the dictionary and join them back to the original dataframe. The assumption, based on your expected output, is that you are only interested in the first dict in each sublist for buy and sell respectively.

import ast
import pandas as pd

# Parse each 'depth' string and keep the first level's price for each side
res = [{f"{x}_price": ast.literal_eval(ent)[x][0]['price']
        for x in ("buy", "sell")}
       for ent in df.pop('depth')]

df.join(pd.DataFrame(res))

    instrument_token    last_price  change     buy_price    sell_price
0   17600770            180.75      20.500000       1          1
1   12615426            0.05       -50.000000       2          2
2   17543682            0.35       -89.062500       3          3
3   17565954            6.75       -10.000000       4          4
4   26077954            3.95       -14.130435       5          5
5   17599490            141.75     -2.241379        6          6
6   17566978            17.65      -1.671309        7          7
7   26075906            24.70      -16.554054       8          8

For actual dictionaries:

res = [{f"{x}_price" : ent[x][0]['price'] 
        for x in ("buy","sell")} 
        for ent in df.pop('depth') ]

#merge back to df
result = df.join(pd.DataFrame(res))
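If you are not sure whether the column holds strings or real dictionaries, the two variants above can be folded into one. This is only a sketch: `first_level_prices` is a hypothetical helper name, the sample rows are made up, and it assumes every list has at least one element.

```python
import ast
import pandas as pd

def first_level_prices(depth):
    """Hypothetical helper: first-level buy/sell price from one 'depth' cell."""
    # Parse only when the cell is still a string representation of a dict.
    if isinstance(depth, str):
        depth = ast.literal_eval(depth)
    return {f"{side}_price": depth[side][0]["price"] for side in ("buy", "sell")}

# One row stored as a string, one as a real dict (illustrative data).
df = pd.DataFrame({
    "instrument_token": [17600770, 12615426],
    "depth": [
        "{'buy': [{'price': 1}], 'sell': [{'price': 1}]}",
        {"buy": [{"price": 2}], "sell": [{"price": 2}]},
    ],
})

# Reuse df's index so the join lines up even if the index is not 0..n-1.
prices = pd.DataFrame(
    [first_level_prices(cell) for cell in df.pop("depth")],
    index=df.index,
)
result = df.join(prices)
print(result)
```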

Is this what you're looking for?

def get_prices(depth, tag):
    # Add up the price of every level on the given side ('buy' or 'sell'),
    # using the built-in sum instead of shadowing it with a local function
    return int(sum(item['price'] for item in depth[tag]))

df['buy_price'] = df['depth'].apply(lambda depth: get_prices(depth, 'buy'))
df['sell_price'] = df['depth'].apply(lambda depth: get_prices(depth, 'sell'))
df.drop(columns='depth', inplace=True)
print(df)

Output:

instrument_token  last_price     change  buy_price  sell_price
0          17600770      180.75  20.500000          1           1
1          12615426        0.05 -50.000000          2           2
2          17543682        0.35 -89.062500          3           3
3          17565954        6.75 -10.000000          4           4
4          26077954        3.95 -14.130435          5           5
5          17599490      141.75  -2.241379          6           6
6          17566978       17.65  -1.671309          7           7
7          26075906       24.70 -16.554054          8           8
