简体   繁体   中英

How to create new columns in Pandas dataframe with flags if the element in a list existing in another column?

I wonder how to create new columns in Pandas dataframe with flags if the element in a list existing in another column? Thank you so much.

list =['apple', 'banana', 'peach']

Input dataframe: 在此处输入图像描述

Output dataframe: 在此处输入图像描述

Try to explode fruit column into rows of fruit name then pivot your dataframe:

out = df.join(df['fruit'].str.split().explode().reset_index().assign(count=1)
                         .pivot_table('count', 'index', 'fruit', fill_value=0)
                         .add_prefix('flag_'))

Output:

>>> out
                fruit  flag_apple  flag_banana  flag_peach
0        apple banana           1            1           0
1         apple peach           1            0           1
2               peach           0            0           1
3              banana           0            1           0
4               apple           1            0           0
5  apple banana peach           1            1           1

Here's a quick implementation of what I think you're trying to do.

import pandas as pd

fruits = ['apple','banana','peach'] # list of fruit
df = pd.DataFrame(                  # build dataframe
    {'fruit':[
        'apple banana',
        'apple peach',
        'peach',
        'banana',
        'apple',
        'apple banana peach']})

for f in fruits:
    df[f'flag_{f}'] = df['fruit'].str.count(f)
print(df)

Resulting output:

                fruit  flag_apple  flag_banana  flag_peach
0        apple banana           1            1           0
1         apple peach           1            0           1
2               peach           0            0           1
3              banana           0            1           0
4               apple           1            0           0
5  apple banana peach           1            1           1

Here is my attempt:

import pandas as pd


fruits = ['apple','banana','peach']
d = {"fruit" : ["apple banana", "apple peach", "peach","banana", "apple","apple banana peach"]}

df = pd.DataFrame(d)
x=[]
for elem in d['fruit']:
    x.append(elem.split(" "))

for f in fruits:
    df[f'flag_{f}'] = list(map(lambda e: int(f in e), x))
print(df)

I break the strings up into lists first and then check for membership using a lambda to create the new flag columns.

Output:

                fruit  flag_apple  flag_banana  flag_peach
0        apple banana           1            1           0
1         apple peach           1            0           1
2               peach           0            0           1
3              banana           0            1           0
4               apple           1            0           0
5  apple banana peach           1            1           1

Use explode and unstack

(df.assign(f = df['fruit'].str.split())
   .explode('f')
   .assign(v=1)
   .set_index(['fruit','f'])
   .unstack(fill_value=0)
   .droplevel(level=0,axis=1)
   .rename(columns = lambda c : f'flag_{c}')
   .reset_index()
)

output

    fruit                 flag_apple    flag_banana    flag_peach
--  ------------------  ------------  -------------  ------------
 0  apple                          1              0             0
 1  apple banana                   1              1             0
 2  apple banana peach             1              1             1
 3  apple peach                    1              0             1
 4  banana                         0              1             0
 5  peach                          0              0             1

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM