
Python append to dataframe column based on condition

I want to count the number of pipe symbols (|) in each value of a data frame column and, if the count equals 5, append another pipe (|) to the existing value.

df2['smartexpenseid']

0      878497|253919815?HOTEL?141791520780|||305117||
1                                   362593||||35068||
2         |231931871509?CARRT?231940968972||||177849|
3       955304|248973233?HOTEL?154687992630||||93191|
4                                 27984||||5883|3242|
5    3579321|253872763?HOTEL?128891721799|92832814|||
6            127299|248541768?HOTEL?270593355555|||||
7         |231931871509?CARRT?231940968972||||177849|
8                                   831665||||80658||
9              |247132692?HOTEL?141790728905||||6249|

For example, in row 5 the (|) count is 5, so another (|) should be added to the existing value. The other rows already have a count of 6, so they should be left as they are. Can somebody help me with this?

I tried this:

if df2['smartexpenseid'].str.count('\|')==5:
    df2['smartexpenseid'].append('\|')

This throws an error saying "The truth value of a Series is ambiguous".

and also:

a = df2['smartexpenseid'].str.count('\|')
if 5 in a:
    a.index(5)

So you already have the vectorized str methods down. Now you need to conditionally append an extra '|' character: build a boolean mask of the rows whose count is 5 and assign to just those rows. See the pandas documentation section on masking for more info.

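# Boolean mask: True for rows with exactly five literal pipes (raw string avoids an invalid-escape warning)
m = df2['smartexpenseid'].str.count(r'\|') == 5
# Append one more pipe only to the masked rows
df2.loc[m, 'smartexpenseid'] = df2.loc[m, 'smartexpenseid'] + '|'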
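
To try the masking approach end to end, here is a minimal, self-contained sketch. The three-row frame is a hypothetical sample that reuses values from the question, not the asker's real data, and the names `mask` and `counts` are only illustrative.

```python
import pandas as pd

# Hypothetical sample: three values copied from the question's output.
df2 = pd.DataFrame({
    "smartexpenseid": [
        "362593||||35068||",                                 # 6 pipes
        "3579321|253872763?HOTEL?128891721799|92832814|||",  # 5 pipes
        "27984||||5883|3242|",                               # 6 pipes
    ]
})

# str.count takes a regex, so the pipe must be escaped; the comparison
# yields a boolean Series (one True/False per row), not a single bool.
mask = df2["smartexpenseid"].str.count(r"\|") == 5

# Assign only to the rows selected by the mask.
df2.loc[mask, "smartexpenseid"] = df2.loc[mask, "smartexpenseid"] + "|"

# Recount to check that every row now has six pipes.
counts = df2["smartexpenseid"].str.count(r"\|").tolist()
print(counts)
```

The original `if ... == 5:` attempt fails because the comparison returns a whole Series of booleans, which `if` cannot reduce to a single True or False. Indexing with the mask avoids that by applying the condition row by row.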
