How to replace elements within a pandas DataFrame column according to an ordered list?
Let's say I have this pandas DataFrame:
index a b
1 'pika' 'dog'
2 'halo' 'cat'
3 'polo' 'dog'
4 'boat' 'man'
5 'moan' 'tan'
6 'nope' 'dog'
and I have a list like this:
colors = ['black' , 'green', 'yellow']
How would I replace all the 'dog' values in column b with the elements of the colors list, in the same order?
Basically, I want it to look something like this:
index a b
1 'pika' 'black'
2 'halo' 'cat'
3 'polo' 'green'
4 'boat' 'man'
5 'moan' 'tan'
6 'nope' 'yellow'
Using pd.DataFrame.loc and Boolean indexing:
df.loc[df['b'].eq('dog'), 'b'] = colors
print(df)
index a b
0 1 pika black
1 2 halo cat
2 3 polo green
3 4 boat man
4 5 moan tan
5 6 nope yellow
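The assignment above relies on the number of matching rows being equal to len(colors); with a different count, pandas typically rejects the assignment with a ValueError. Below is a minimal sketch of a guarded version that checks this up front. The helper name replace_in_order is made up for illustration and is not part of pandas.

```python
import pandas as pd


def replace_in_order(df, column, target, replacements):
    """Hypothetical helper: swap each `target` in `column` for the next replacement."""
    mask = df[column].eq(target)
    n_matches = int(mask.sum())
    if n_matches != len(replacements):
        raise ValueError(
            f"{n_matches} rows match {target!r} but "
            f"{len(replacements)} replacements were given"
        )
    out = df.copy()  # leave the caller's frame untouched
    out.loc[mask, column] = list(replacements)
    return out


df = pd.DataFrame({'a': ['pika', 'halo', 'polo', 'boat', 'moan', 'nope'],
                   'b': ['dog', 'cat', 'dog', 'man', 'tan', 'dog']})
colors = ['black', 'green', 'yellow']

result = replace_in_order(df, 'b', 'dog', colors)
print(result)
```

Returning a copy is a design choice for the sketch; assigning in place as in the original one-liner works just as well when the original frame is no longer needed.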
Another way, using numpy.put:
import pandas as pd
import numpy as np
df = pd.DataFrame({'a': ['pika', 'halo', 'polo', 'boat', 'moan', 'nope'],
'b': ['dog', 'cat', 'dog', 'man', 'tan', 'dog']})
colors = ['black' , 'green', 'yellow']
df
a b
0 pika dog
1 halo cat
2 polo dog
3 boat man
4 moan tan
5 nope dog
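# np.put expects a NumPy array, so work on a copy of the column.
# It reuses the values when there are fewer of them than target positions;
# mode only controls out-of-range indices, so it is not needed here.
b = df['b'].to_numpy(copy=True)
np.put(b, np.flatnonzero(b == 'dog'), colors)
df['b'] = b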
df
a b
0 pika black
1 halo cat
2 polo green
3 boat man
4 moan tan
5 nope yellow
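When there are more target positions than replacement values, numpy.put repeats the values as needed. The sketch below illustrates this on a plain object array with five 'dog' entries and three colors; the sample data is invented for the demonstration.

```python
import numpy as np

# Object dtype so that longer replacement strings are not truncated.
values = np.array(['dog', 'cat', 'dog', 'man', 'dog', 'dog', 'dog'], dtype=object)
colors = ['black', 'green', 'yellow']

positions = np.flatnonzero(values == 'dog')  # indices of the entries to overwrite
np.put(values, positions, colors)            # colors is reused once it runs out
print(values.tolist())
```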
Use itertools.cycle, Series.apply, and a lambda:
In [100]: import itertools as it
In [101]: colors_gen = it.cycle(colors)
In [102]: df1['c'] = df1['b'].apply(lambda x: next(colors_gen) if x == 'dog' else x)
In [103]: df1
Out[103]:
a b c
0 pika dog black
1 halo cat cat
2 polo dog green
3 boat man man
4 moan tan tan
5 nope dog yellow
This also works for larger DataFrames, where there are more 'dog' rows than colors:
In [104]: df2 = pd.DataFrame({'a': ['pika', 'halo', 'polo', 'boat','moan','nope','etc','etc'], 'b':['dog','cat','dog','man','tan','dog','dog','dog']})
In [106]: df2['c'] = df2['b'].apply(lambda x: next(colors_gen) if x == 'dog' else x)
In [107]: df2
Out[107]:
a b c
0 pika dog black
1 halo cat cat
2 polo dog green
3 boat man man
4 moan tan tan
5 nope dog yellow
6 etc dog black
7 etc dog green
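One thing to watch: an itertools.cycle object remembers its position, so reusing the same generator on a second column continues from wherever the first call stopped. A minimal sketch that creates a fresh cycle on every call is shown below; the function name cycle_replace is hypothetical.

```python
import itertools

import pandas as pd


def cycle_replace(series, target, replacements):
    """Hypothetical helper: replace `target` values, cycling through `replacements`."""
    pool = itertools.cycle(replacements)  # new cycle per call, always starts at the first item
    return series.apply(lambda x: next(pool) if x == target else x)


s = pd.Series(['dog', 'cat', 'dog', 'dog', 'dog'])
colors = ['black', 'green', 'yellow']

first = cycle_replace(s, 'dog', colors)
second = cycle_replace(s, 'dog', colors)  # starts from 'black' again
print(first.tolist())
```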
You can also do it with:
# repeat the list often enough to cover all n matches, then trim to n
n = int((df.b == 'dog').sum())
df.loc[df.b == 'dog', 'b'] = (colors * (n // len(colors) + 1))[:n]
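The repeat-and-trim idea can also be written with numpy.resize, which fills the requested length with repeated copies of its input. This is a sketch of a swapped-in alternative, not the answer's own method; the frame reuses the larger example from earlier on this page.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': ['pika', 'halo', 'polo', 'boat', 'moan', 'nope', 'etc', 'etc'],
                   'b': ['dog', 'cat', 'dog', 'man', 'tan', 'dog', 'dog', 'dog']})
colors = ['black', 'green', 'yellow']

mask = df['b'] == 'dog'
n = int(mask.sum())                       # number of values to replace
df.loc[mask, 'b'] = np.resize(colors, n)  # colors repeated and cut to length n
print(df)
```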