I have a dataframe something like this but much larger:
source  next1  next2    next3
b1      {-}    b2       -,b2,b3
b2,b3   -      {b2,b3}  {b2,b3,b4}
Now I need to replace a lot of characters here. Every next column should contain the values of the previous one. A value of - or {-} stands for the previous value, and where a cell holds other values as well, the previous value still needs to be included. Desired output:
source  next1  next2  next3
b1      b1     b2     b1,b2,b3
b2,b3   b2,b3  b2,b3  b2,b3,b4
I have tried something like this:
for val in df['source'].values:
    if val == 'b1':
        df['next1'].replace('{-},', 'b1,', regex=True, inplace=True)
        df['next1'].replace('-,', 'b1,', regex=True, inplace=True)
etc. But I have so many rows and conditions that this takes too long and is not precise; there are errors. It puts one value (from the replacement) into all rows.
I don't think there is a fast solution to your question, as string operations will always be slow-ish. Still, there is a better/faster one.
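A straightforward solution would be:
for i in range(1, df.shape[1]):  # here only order matters: each column reads from the one to its left
    col = df.iloc[:, i].str.replace('{-}', '-', regex=False)  # normalise {-} to -
    prev = df.iloc[:, i - 1]  # already processed on the previous iteration
    # substitute the previous column's value for every - placeholder, row by row
    df.iloc[:, i] = [c.replace('-', p) for c, p in zip(col, prev)]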
That said, it is likely to be WAY faster to hold all the columns as sets ({}) and operate on them as such.
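The set idea could look roughly like the sketch below. It is only an illustration under assumptions: the helper names to_set and fill_row are made up here, every cell is assumed to be a string, and - is assumed to stand for the column immediately to the left (the same reading as the loop above). It works on plain tuples so it runs without pandas.

```python
def to_set(cell):
    """Parse a cell such as '{b2,b3}' or '-,b2' into a set of tokens."""
    return {tok.strip() for tok in cell.strip("{}").split(",") if tok.strip()}


def fill_row(cells, placeholder="-"):
    """Expand the placeholder in each cell to the tokens of the cell on its left."""
    filled = []
    prev = set()  # nothing precedes the first column
    for cell in cells:
        cur = to_set(cell)
        if placeholder in cur:
            # assumption: the placeholder means "everything in the previous column"
            cur = (cur - {placeholder}) | prev
        filled.append(",".join(sorted(cur)))  # plain string sort, so 'b10' comes before 'b2'
        prev = cur
    return filled


# the two sample rows from the question
rows = [
    ("b1", "{-}", "b2", "-,b2,b3"),
    ("b2,b3", "-", "{b2,b3}", "{b2,b3,b4}"),
]
result = [fill_row(r) for r in rows]
```

With a DataFrame, the rows could come from df.itertuples(index=False, name=None) and the result could go back through pd.DataFrame(result, columns=df.columns). Note that under the "previous column" reading the first row's next3 comes out as b2,b3, not the b1,b2,b3 shown in the desired output; if - should instead always refer to source, set prev once from the first cell and do not update it inside the loop.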