
Find column names and their respective values in a pandas DataFrame that match a condition and store the result in a dictionary

I have a pandas DataFrame (called df). For each row (i.e. for a given date), I want to find the columns whose values are less than 0.5. In the screenshot below, I have highlighted in yellow the values that are less than 0.5.

[Image: screenshot of df with the values below 0.5 highlighted in yellow]

The df dataframe is as follows:

data = {'Date': ['2020-12-22','2020-12-23','2020-12-24','2020-12-25'],
    'A': ['0.065','0.965','0.363','0.774'],
    'B': ['0.292','0.367','0.396','0.484'],
    'C': ['0.078','0.489','0.095','0.781'],
    'D': ['0.703','0.748','0.631','0.612']}

import pandas as pd

df = pd.DataFrame(data, columns=['Date', 'A', 'B', 'C', 'D'])

I would like to store the result in a nested dictionary similar to the one below:

[Image: screenshot of the desired nested dictionary]

Could someone help me with some sample code?

Try the following. I understand that the res=... line is not very readable, as is the case with most nested dict comprehensions, but it is more concise. If you need a more readable solution, the same thing can easily be done with a couple of loops and ifs.

d = df.to_dict(orient='records')

res = {i['Date']: {k: float(i[k]) for k in i if k != 'Date' and float(i[k]) < 0.5} for i in d}

>>> print(res)

{'2020-12-22': {'A': 0.065, 'B': 0.292, 'C': 0.078}, '2020-12-23': {'B': 0.367, 'C': 0.489}, '2020-12-24': {'A': 0.363, 'B': 0.396, 'C': 0.095}, '2020-12-25': {'B': 0.484}}

If you want to use loops to construct the result, you can do the following:

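d = df.to_dict(orient='records')

# start from an empty result so the loop does not depend on an earlier res
res = {}

for i in d:
    temp = {}
    for k in i:
        # skip the Date key; keep only values below the 0.5 threshold
        if k != 'Date' and float(i[k]) < 0.5:
            temp[k] = float(i[k])
    res[i['Date']] = temp

>>> print(res)
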
{'2020-12-22': {'A': 0.065, 'B': 0.292, 'C': 0.078}, '2020-12-23': {'B': 0.367, 'C': 0.489}, '2020-12-24': {'A': 0.363, 'B': 0.396, 'C': 0.095}, '2020-12-25': {'B': 0.484}}
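Both versions above call float(i[k]) because the sample data stores the numbers as strings. As a rough sketch (not part of the original answer), the conversion could instead be done once up front with pd.to_numeric. The names value_cols and numeric are made up for this illustration, and errors='coerce' is an assumption about how non-numeric cells should be treated: they become NaN and never pass the < 0.5 test.

```python
import pandas as pd

data = {'Date': ['2020-12-22', '2020-12-23', '2020-12-24', '2020-12-25'],
        'A': ['0.065', '0.965', '0.363', '0.774'],
        'B': ['0.292', '0.367', '0.396', '0.484'],
        'C': ['0.078', '0.489', '0.095', '0.781'],
        'D': ['0.703', '0.748', '0.631', '0.612']}
df = pd.DataFrame(data, columns=['Date', 'A', 'B', 'C', 'D'])

# every column except Date holds a value to test (hypothetical helper name)
value_cols = [c for c in df.columns if c != 'Date']

# convert the string columns to floats once; unparsable cells become NaN
numeric = df[value_cols].apply(pd.to_numeric, errors='coerce')

res = {}
for date, row in zip(df['Date'], numeric.to_dict(orient='records')):
    # NaN < 0.5 is False, so coerced cells are silently left out
    res[date] = {k: v for k, v in row.items() if v < 0.5}

print(res)
```

With the conversion done first, the loop body no longer needs to call float() twice per cell.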

If I understand you correctly, you could use to_dict to generate the results and filter them with a dictionary comprehension:

import pprint

# set Date as index
n_df = df.set_index('Date').astype(float)

# use to_dict('index')
res = {k: {ki: vi for ki, vi in d.items() if vi < 0.5} for k, d in n_df.to_dict('index').items()}

pprint.pprint(res)

Output

{'2020-12-22': {'A': 0.065, 'B': 0.292, 'C': 0.078},
 '2020-12-23': {'B': 0.367, 'C': 0.489},
 '2020-12-24': {'A': 0.363, 'B': 0.396, 'C': 0.095},
 '2020-12-25': {'B': 0.484}}
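If the threshold or the key column may change, the same idea can be wrapped in a small function. This is only a sketch under stated assumptions: values_below and its parameters threshold and key are hypothetical names, not from the answer above, and the key column is assumed to hold unique values. It filters each row with a boolean mask instead of filtering the to_dict output.

```python
import pandas as pd


def values_below(df, threshold=0.5, key='Date'):
    """Map each value of `key` to a {column: value} dict of cells below `threshold`.

    Hypothetical helper: assumes every column other than `key` can be cast to float.
    """
    numeric = df.set_index(key).astype(float)
    # row is a Series indexed by column name; row < threshold is a boolean mask
    return {k: row[row < threshold].to_dict() for k, row in numeric.iterrows()}


data = {'Date': ['2020-12-22', '2020-12-23', '2020-12-24', '2020-12-25'],
        'A': ['0.065', '0.965', '0.363', '0.774'],
        'B': ['0.292', '0.367', '0.396', '0.484'],
        'C': ['0.078', '0.489', '0.095', '0.781'],
        'D': ['0.703', '0.748', '0.631', '0.612']}
df = pd.DataFrame(data, columns=['Date', 'A', 'B', 'C', 'D'])

res = values_below(df)
print(res)
```

iterrows is convenient for a small frame like this one; for a large frame the to_dict('index') approach above is likely to be faster.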

