
Pandas - Create multiple new columns if str.contains returns multiple values

I have some data like this:

0       Very user friendly interface and has 2FA support
1       The trading page is great though with allot o...
2                                         Widget support
3       But it’s really only for serious traders with...
4       The KYC and AML process is painful - it took ...
                             ...                        
937                                      Legit platform!
938     Horrible customer service won’t get back to m...
939                             App is fast and reliable
940               I wish it had a portfolio chart though
941    The app isn’t as user friendly as it need to b...
Name: reviews, Length: 942, dtype: object

and a list of features:

 ['support',
 'time',
 'follow',
 'submit',
 'ticket',
 'team',
 'swap',
 'account',
 'experi',
 'contact',
 'user',
 'platform',
 'screen',
 'servic',
 'custom',
 'restrict',
 'fast',
 'portfolio',
 'specialist']

I want to check whether each review contains one or more of the features and, if it does, put the matching words in a new column.

My code is this:

data["words"] = data[data["reviews"].str.contains('|'.join(features))]

This code is supposed to create a new column named "words", but because it sometimes returns multiple values I get this error:

ValueError: Columns must be same length as key

How can I fix it?

The issue is that you are not actually extracting any of the words: str.contains only reports whether a match exists, not which words matched. You need to pull the words you want out of each review and then join them into a new column.
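To see what the original expression produces, here is a minimal sketch. The three-row frame and the two-item feature list are stand-ins made up for illustration, and the two-column shape is an assumption about the asker's data.

```python
import pandas as pd

# Hypothetical stand-in data; the real frame has 942 reviews
data = pd.DataFrame({
    "Index": [0, 1, 2],
    "reviews": ["Widget support", "Legit platform!", "Nothing relevant here"],
})
features = ["support", "platform"]

# str.contains answers yes/no for each row; it does not return the matched words
mask = data["reviews"].str.contains("|".join(features))
print(mask.tolist())

# Indexing with the mask filters rows but keeps every column,
# so the result is a whole DataFrame rather than a single column of words
subset = data[mask]
print(subset.shape)
```

Assigning a multi-column frame like `subset` to the single key "words" is what the length complaint in the error message is about.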

import pandas as pd
from io import StringIO
import re

TESTDATA = StringIO("""Index,reviews,
0,       Very user friendly interface and has 2FA support,
1,       The trading page is great though with allot o...,
2,                                         Widget support,
3,       But it’s really only for serious traders with...,
4,       The KYC and AML process is painful - it took ...,
937,                                      Legit platform!,
938,     Horrible customer service won’t get back to m...,
939,                             App is fast and reliable,
940,               I wish it had a portfolio chart though,
941,    The app isn’t as user friendly as it need to b...
    """)

data = pd.read_csv(TESTDATA, sep=",").drop('Unnamed: 2', axis=1)
data
#>    Index                                            reviews
0      0         Very user friendly interface and has 2F...
1      1         The trading page is great though with a...
2      2                                           Widge...
3      3         But it’s really only for serious trader...
4      4         The KYC and AML process is painful - it...
5    937                                        Legit pl...
6    938       Horrible customer service won’t get back ...
7    939                               App is fast and r...
8    940                 I wish it had a portfolio chart...
9    941      The app isn’t as user friendly as it need ...

# features is the list from the question; find every feature in each review and join the matches
data['words'] = [", ".join(re.findall('|'.join(features), x)) for x in data.reviews]
data
#>    Index                                            reviews           words
0      0         Very user friendly interface and has 2F...   user, support
1      1         The trading page is great though with a...                
2      2                                           Widge...         support
3      3         But it’s really only for serious trader...                
4      4         The KYC and AML process is painful - it...                
5    937                                        Legit pl...        platform
6    938       Horrible customer service won’t get back ...  custom, servic
7    939                               App is fast and r...            fast
8    940                 I wish it had a portfolio chart...       portfolio
9    941      The app isn’t as user friendly as it need ...            user
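
The same extraction can probably be written with pandas string methods alone. This is a sketch rather than a definitive version: the shortened feature list and the sample reviews are taken from the question for illustration, the `re.escape` step is my addition (it only matters if a feature contains regex metacharacters), and `flags` is a hypothetical name for a frame with one boolean column per feature, in case separate columns are what the title is asking for.

```python
import re
import pandas as pd

features = ["support", "user", "platform", "servic", "custom", "fast", "portfolio"]
data = pd.DataFrame({"reviews": [
    "Very user friendly interface and has 2FA support",
    "Widget support",
    "Legit platform!",
    "App is fast and reliable",
    "I wish it had a portfolio chart though",
]})

# Escape each feature so it is matched literally, then build one alternation
pattern = "|".join(re.escape(f) for f in features)

# str.findall returns a list of matches per row; str.join collapses each list
data["words"] = data["reviews"].str.findall(pattern).str.join(", ")

# Optional: one boolean column per feature instead of a joined string
flags = pd.DataFrame(
    {f: data["reviews"].str.contains(f, regex=False) for f in features}
)

print(data)
print(flags)
```

Rows with no match end up with an empty string in "words", the same as in the list-comprehension version.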
