How do I combine three text columns in pandas using if & else
I'm using Python 3.7. I have a pandas data frame with three text columns: Name, Email and Section. The sample data looks like
Name Email Section
abc abc@gmail.com purchase
cde - drawing
lmn-pqr None -
Hyphens can appear between two words in any of the three columns. I would like to join the three columns with "_" as the separator and create a new column, Group, ignoring None or -. My combined outcome will look like
Name Email Section Group
abc abc@gmail.com purchase abc_abc@gmail.com_purchase
cde - drawing cde_drawing
lmn-pqr None - lmn-pqr
I'm not sure about the Python code. Can you please help me?
You can use str.cat, which drops null values:
df.mask(df.isin(['-', None])).apply(lambda r: r.str.cat(sep='_'), axis=1)
or, manually:
df['Group'] = df.apply(lambda r: '_'.join([x for x in r.replace('-', pd.NA).dropna()]),
axis=1)
output:
Name Email Section Group
0 abc abc@gmail.com purchase abc_abc@gmail.com_purchase
1 cde - drawing cde_drawing
2 lmn-pqr None - lmn-pqr
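For anyone who wants to run the str.cat approach end to end, here is a minimal sketch. It rebuilds the sample data by hand, and it assumes the "None" in the Email column may be either a real missing value or the literal string "None" (the question does not say which), so it treats both as placeholders. The names cols, placeholders and cleaned are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question; the Email of the last row is
# assumed to be a real missing value here.
df = pd.DataFrame({
    "Name": ["abc", "cde", "lmn-pqr"],
    "Email": ["abc@gmail.com", "-", None],
    "Section": ["purchase", "drawing", "-"],
})

cols = ["Name", "Email", "Section"]
# Assumption: a lone "-" and the literal string "None" both mean "no value".
placeholders = ["-", "None"]

# mask() turns every placeholder cell into NaN; cells that are already
# missing stay missing.
cleaned = df[cols].mask(df[cols].isin(placeholders))

# str.cat() without na_rep skips missing values, so only real text is joined.
df["Group"] = cleaned.apply(lambda r: r.str.cat(sep="_"), axis=1)

print(df)
```

Only cells that are exactly "-" are masked, so a hyphen inside a value such as lmn-pqr is left alone.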
You can try replacing - with None, then filtering it out when joining:
df['Group'] = df.replace({'-': None}).apply(lambda row: '_'.join(filter(None, row)), axis=1)
print(df)
Name Email Section Group
0 abc abc@gmail.com purchase abc_abc@gmail.com_purchase
1 cde - drawing cde_drawing
2 lmn-pqr None - lmn-pqr
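df['Group'] = df.apply(lambda x: '_'.join([x['Name'], x['Email'], x['Section']]), axis=1)
Here x is a Series holding one row, because axis=1 passes rows to the function. Note that this joins all three values as they are, so it does not skip - and it raises a TypeError if a value is a real None rather than a string.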
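Since the question asks about if & else, the same result can also be written as an explicit row function. This is only a sketch: join_parts and its skip parameter are hypothetical names, and it assumes that missing values, a lone "-", the string "None" and empty strings should all be ignored.

```python
import pandas as pd


def join_parts(row, sep="_", skip=("-", "None", "")):
    """Join the non-placeholder values of one row with `sep`."""
    parts = []
    for value in row:
        if pd.isna(value):
            # Real missing value (None or NaN): ignore it.
            continue
        text = str(value).strip()
        if text in skip:
            # Placeholder text such as "-" or "None": ignore it.
            continue
        else:
            parts.append(text)
    return sep.join(parts)


# Sample data rebuilt from the question.
df = pd.DataFrame({
    "Name": ["abc", "cde", "lmn-pqr"],
    "Email": ["abc@gmail.com", "-", None],
    "Section": ["purchase", "drawing", "-"],
})

df["Group"] = df[["Name", "Email", "Section"]].apply(join_parts, axis=1)
print(df)
```

A plain function like this is slower than vectorized code on large frames, but the skip rules are easy to read and adjust.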