
Extracting values as a dictionary from dataframe based on list

I have a dataframe with unique values in each column:

import pandas as pd

df1 = pd.DataFrame([["Phys","Shane","NY"],["Chem","Mark","LA"],
                    ["Maths","Jack","Mum"],["Bio","Sam","CT"]],
                    columns = ["cls1","cls2","cls3"])
print(df1)

    cls1    cls2    cls3
0   Phys    Shane   NY
1   Chem    Mark    LA
2   Maths   Jack    Mum
3   Bio     Sam     CT

And a list l1:

l1=["Maths","Bio","Shane","Mark"]
print(l1)

['Maths', 'Bio', 'Shane', 'Mark']

Now I want to retrieve the columns of the dataframe that contain elements from the list, together with the matching elements for each column.

Expected output:

{'cls1' : ['Maths','Bio'], 'cls2': ['Shane','Mark']}

The code I have:

cls = []
for cols in df1.columns:
    mask = df1[cols].isin(l1)
    if mask.any():
        cls.append(cols)
print(cls)

The output of the above code:

['cls1', 'cls2']

I'm struggling to get the elements common to the dataframe and the list and convert them into a dictionary.

Any suggestions are welcome.

Thanks.

Use DataFrame.isin to build a boolean mask, index the dataframe with it so that non-matching values become NaN, and reshape with stack, which here drops the missing values:

df = df1[df1.isin(l1)].stack()
print(df)
0  cls2    Shane
1  cls2     Mark
2  cls1    Maths
3  cls1      Bio
dtype: object

Finally, build the dictionary with a dict comprehension, grouping by the column level of the index:

d = {k:v.tolist() for k,v in df.groupby(level=1)}
print(d)
{'cls1': ['Maths', 'Bio'], 'cls2': ['Shane', 'Mark']}
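
For reference, the two steps above can be put together as one self-contained sketch. It uses DataFrame.where instead of boolean indexing (the two should be equivalent here), and the explicit dropna() is an assumption on my part: it is a safeguard for pandas versions where stack() may keep missing values. The names masked and matches are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame(
    [["Phys", "Shane", "NY"], ["Chem", "Mark", "LA"],
     ["Maths", "Jack", "Mum"], ["Bio", "Sam", "CT"]],
    columns=["cls1", "cls2", "cls3"],
)
l1 = ["Maths", "Bio", "Shane", "Mark"]

# Keep cells whose value is in l1; every other cell becomes NaN.
masked = df1.where(df1.isin(l1))

# Reshape to a Series indexed by (row label, column name).
# dropna() is explicit in case stack() keeps missing values.
matches = masked.stack().dropna()

# Group by the column level (level=1) and collect the surviving values.
d = {col: vals.tolist() for col, vals in matches.groupby(level=1)}
print(d)
```

Because groupby sorts its keys by default, the columns should come out in alphabetical order rather than in order of first appearance.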

Another solution:

d = {}
for cols in df1.columns:
    mask = df1[cols].isin(l1)
    if mask.any():
        d[cols] = df1.loc[mask, cols].tolist()

print(d)
{'cls1': ['Maths', 'Bio'], 'cls2': ['Shane', 'Mark']}
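
If you need this in more than one place, the loop can be wrapped in a small helper. This is only a sketch: values_by_column is a hypothetical name, not a pandas function, and it assumes you want columns without any match left out of the result.

```python
import pandas as pd

def values_by_column(df, wanted):
    """Map each column name to the values from `wanted` found in that column.

    Hypothetical helper; columns with no match are omitted.
    """
    result = {}
    for col in df.columns:
        # Select the rows of this column whose value is in `wanted`.
        hits = df.loc[df[col].isin(wanted), col].tolist()
        if hits:
            result[col] = hits
    return result

df1 = pd.DataFrame(
    [["Phys", "Shane", "NY"], ["Chem", "Mark", "LA"],
     ["Maths", "Jack", "Mum"], ["Bio", "Sam", "CT"]],
    columns=["cls1", "cls2", "cls3"],
)
l1 = ["Maths", "Bio", "Shane", "Mark"]

print(values_by_column(df1, l1))
```

Values are returned in row order for each column, not in the order they appear in l1.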
