
Pandas - Create multiple new columns if str.contains returns multiple values

I have some data like this:

0       Very user friendly interface and has 2FA support
1       The trading page is great though with allot o...
2                                         Widget support
3       But it’s really only for serious traders with...
4       The KYC and AML process is painful - it took ...
                             ...                        
937                                      Legit platform!
938     Horrible customer service won’t get back to m...
939                             App is fast and reliable
940               I wish it had a portfolio chart though
941    The app isn’t as user friendly as it need to b...
Name: reviews, Length: 942, dtype: object

and a list of features:

 ['support',
 'time',
 'follow',
 'submit',
 'ticket',
 'team',
 'swap',
 'account',
 'experi',
 'contact',
 'user',
 'platform',
 'screen',
 'servic',
 'custom',
 'restrict',
 'fast',
 'portfolio',
 'specialist']

I want to check whether each review contains one or more of the features and, if so, add the matching words to a new column.

My code is this:

data["words"] = data[data["reviews"].str.contains('|'.join(features))]

This code is meant to create a new column named "words", but because the expression sometimes returns multiple values, I get this error:

ValueError: Columns must be same length as key

How can I fix it?

The issue is that you are not actually extracting any of the words. You need to pull the words you want out of the text and then join them into a new column.
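To make that concrete, here is a minimal sketch of what the original expression returns. The three sample reviews, the shortened `features` list and the extra `rating` column are assumptions for illustration only. `str.contains` gives one boolean per row, and indexing `data` with that mask gives a filtered DataFrame containing every column, not a column of words. Assigning a multi-column DataFrame to the single key `"words"` is the likely source of the ValueError.

```python
import pandas as pd

# Illustrative data only; the "rating" column is a made-up second column.
data = pd.DataFrame({
    "reviews": ["Very user friendly interface and has 2FA support",
                "Widget support",
                "Legit app"],
    "rating": [5, 4, 3],
})
features = ["support", "user", "platform"]  # shortened list for the example

# One True/False per row: does the review contain any feature?
mask = data["reviews"].str.contains("|".join(features))

# Boolean indexing keeps the matching rows and all columns.
filtered = data[mask]

print(mask.tolist())
print(type(filtered).__name__, filtered.shape)
```

Nothing in `mask` or `filtered` holds the matched words themselves, so they have to be extracted in a separate step.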

import pandas as pd
from io import StringIO
import re

TESTDATA = StringIO("""Index,reviews,
0,       Very user friendly interface and has 2FA support,
1,       The trading page is great though with allot o...,
2,                                         Widget support,
3,       But it’s really only for serious traders with...,
4,       The KYC and AML process is painful - it took ...,
937,                                      Legit platform!,
938,     Horrible customer service won’t get back to m...,
939,                             App is fast and reliable,
940,               I wish it had a portfolio chart though,
941,    The app isn’t as user friendly as it need to b...
    """)

data = pd.read_csv(TESTDATA, sep=",").drop('Unnamed: 2',   axis = 1)
data
#>    Index                                            reviews
0      0         Very user friendly interface and has 2F...
1      1         The trading page is great though with a...
2      2                                           Widge...
3      3         But it’s really only for serious trader...
4      4         The KYC and AML process is painful - it...
5    937                                        Legit pl...
6    938       Horrible customer service won’t get back ...
7    939                               App is fast and r...
8    940                 I wish it had a portfolio chart...
9    941      The app isn’t as user friendly as it need ...

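# `features` is the list from the question. findall returns every match in a review;
# the matches are then joined into one comma-separated string per row.
data['words'] = [", ".join(re.findall('|'.join(features), x)) for x in data.reviews]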
data
#>    Index                                            reviews           words
0      0         Very user friendly interface and has 2F...   user, support
1      1         The trading page is great though with a...                
2      2                                           Widge...         support
3      3         But it’s really only for serious trader...                
4      4         The KYC and AML process is painful - it...                
5    937                                        Legit pl...        platform
6    938       Horrible customer service won’t get back ...  custom, servic
7    939                               App is fast and r...            fast
8    940                 I wish it had a portfolio chart...       portfolio
9    941      The app isn’t as user friendly as it need ...            user
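The same extraction can also be sketched with pandas' own string methods, `Series.str.findall` and `Series.str.join`, instead of a Python-level loop. This is an alternative sketch, not the method used above. The sample reviews, the shortened `features` list and the `word_` column prefix are assumptions for illustration. The last step shows one possible way to spread the matches over separate columns, as the title asks.

```python
import re
import pandas as pd

# Illustrative data only.
data = pd.DataFrame({"reviews": [
    "Very user friendly interface and has 2FA support",
    "Widget support",
    "The KYC and AML process is painful",
]})
features = ["support", "user", "platform"]  # shortened list for the example

# Escape each feature so it is matched literally, then OR them together.
pattern = "|".join(re.escape(f) for f in features)

# A list of matches per row, in the order they appear in the text.
matches = data["reviews"].str.findall(pattern)

# A single column with the matches joined into one string (empty if none).
data["words"] = matches.str.join(", ")

# Optionally, one column per match; rows with fewer matches are padded
# with missing values. The "word_" prefix is a hypothetical naming choice.
word_cols = pd.DataFrame(matches.tolist(), index=data.index).add_prefix("word_")

print(data)
print(word_cols)
```

Because the features are stems, the pattern also matches inside longer words, which is why `custom` and `servic` appear for "customer service" in the output above.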


 