Replace values with different conditions in multiple columns in Pandas
I have a dataframe like this, but much larger:
source next1 next2 next3
b1 {-} b2 -,b2,b3
b2,b3 - {b2,b3} {b2,b3,b4}
Now I need to replace a lot of characters here. Each next column should contain the values of the previous one. If the value is - or {-}, that stands for the previous column's value, and if it is neither of those, the previous values still need to be there.
Desired output:
source next1 next2 next3
b1 b1 b2 b1,b2,b3
b2,b3 b2,b3 b2,b3 b2,b3,b4
I have tried something like this:
for val in df['source'].values:
    if val == 'b1':
        # swap the placeholder for the source value (this touches every row of next1)
        df['next1'].replace('{-},', 'b1,', regex=True, inplace=True)
        df['next1'].replace('-,', 'b1,', regex=True, inplace=True)
etc. But I have so many rows and conditions that this takes too long and is not very precise; there are errors. It puts one value (from the replacement) into all rows.
I don't think there is a fast solution to your question, as string operations will always be slow-ish. Still, there is a better/faster one.
A straightforward solution would be
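for i in range(1, df.shape[1]):  # only the column order matters here
    # normalise '{-}' to '-' (str.replace returns a new Series; it has no inplace option)
    col = df.iloc[:, i].str.replace('{-}', '-', regex=False)
    prev = df.iloc[:, i - 1]  # previous column, already processed
    # row by row, replace '-' with the value from the previous column
    df.iloc[:, i] = [c.replace('-', p) for c, p in zip(col, prev)]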
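To try the loop above on the sample data, a self-contained sketch might look like the following. The helper name `fill_from_previous` is made up for illustration, and the frame is rebuilt by hand from the table in the question. It only substitutes the dash placeholder: other braces such as {b2,b3} are left as they are, and the dash in next3 of the first row picks up the value of next2 (b2), which is not what the desired output in the question shows (b1).

```python
import pandas as pd

# Sample frame rebuilt by hand from the question's table.
df = pd.DataFrame({
    "source": ["b1", "b2,b3"],
    "next1": ["{-}", "-"],
    "next2": ["b2", "{b2,b3}"],
    "next3": ["-,b2,b3", "{b2,b3,b4}"],
})


def fill_from_previous(frame):
    """Hypothetical helper: replace '-' / '{-}' with the previous column's value."""
    out = frame.copy()  # leave the caller's frame untouched
    for i in range(1, out.shape[1]):
        # Treat '{-}' and '-' as the same placeholder.
        col = out.iloc[:, i].str.replace("{-}", "-", regex=False)
        prev = out.iloc[:, i - 1]
        # Plain Python str.replace, pairing each cell with its left neighbour.
        out.iloc[:, i] = [c.replace("-", p) for c, p in zip(col, prev)]
    return out


result = fill_from_previous(df)
print(result)
```

Because the columns are processed left to right, a placeholder always sees the already-filled value of its left neighbour.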
That said, it is likely to be WAY faster to have all the columns as sets ({}) and operate on them as such.
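One possible reading of the set idea is sketched below; it is an illustration under assumptions, not a measured or definitive implementation. The names `to_set`, `fill_row` and `PLACEHOLDER` are hypothetical, and the rule used (a dash in a cell is swapped for the whole set from the column to its left) is an assumption about what the question wants. Sets are unordered, so the values are sorted alphabetically when written back as strings. Under this rule the first row's next3 comes out as b2,b3 rather than the b1,b2,b3 shown in the question.

```python
import pandas as pd

PLACEHOLDER = "-"  # assumed marker meaning "use the previous column"


def to_set(cell):
    """Parse a cell such as '{b2,b3}' or '-,b2' into a set of tokens."""
    tokens = (t.strip() for t in cell.strip("{}").split(","))
    return {t for t in tokens if t}


def fill_row(cells):
    """Walk one row left to right, swapping the placeholder for the previous set."""
    filled = [to_set(cells[0])]
    for cell in cells[1:]:
        cur = to_set(cell)
        if PLACEHOLDER in cur:
            # Drop the marker and merge in everything from the left neighbour.
            cur = (cur - {PLACEHOLDER}) | filled[-1]
        filled.append(cur)
    # Sets have no order, so sort before joining back into a string.
    return [",".join(sorted(s)) for s in filled]


# Sample frame rebuilt by hand from the question's table.
df = pd.DataFrame({
    "source": ["b1", "b2,b3"],
    "next1": ["{-}", "-"],
    "next2": ["b2", "{b2,b3}"],
    "next3": ["-,b2,b3", "{b2,b3,b4}"],
})

rows = [fill_row(row) for row in df.itertuples(index=False)]
result = pd.DataFrame(rows, columns=df.columns, index=df.index)
print(result)
```

Parsing into sets also strips the surrounding braces as a side effect, so {b2,b3} and b2,b3 end up as the same value.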