Better way to replace all values in a pandas column
I am looking for a better way to replace all the values in a column with other values.
What I currently have is this:
gender_text = ['undefined', 'male', 'female']
df.loc[df['gender'] == 0, 'gender'] = gender_text[0]
df.loc[df['gender'] == 1, 'gender'] = gender_text[1]
df.loc[df['gender'] == 2, 'gender'] = gender_text[2]
df.head()
I was hoping for something a bit more elegant that uses the gender value (0, 1 or 2) as the index into gender_text, so that everything fits on one line.
You can use a dictionary.
import pandas as pd
df = pd.DataFrame({'gender':[0,0,2,1,1,2]})
gender_text = {0:'undefined', 1:'male', 2:'female'}
df['gender'].map(gender_text)
# Out[33]:
# 0 undefined
# 1 undefined
# 2 female
# 3 male
# 4 male
# 5 female
# Name: gender, dtype: object
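If the labels already live in a list, as in the question, the dictionary does not have to be typed by hand. A minimal sketch, assuming the list positions match the codes 0, 1 and 2 (the extra code 3 is added here only to show what happens to a value that has no entry in the mapping):

```python
import pandas as pd

gender_text = ['undefined', 'male', 'female']
df = pd.DataFrame({'gender': [0, 0, 2, 1, 1, 2, 3]})  # 3 has no label (illustrative)

# enumerate() pairs each list position with its label:
# {0: 'undefined', 1: 'male', 2: 'female'}
mapping = dict(enumerate(gender_text))

# map() returns a new Series; codes without a key in the mapping become NaN
labels = df['gender'].map(mapping)
```

Note that map() does not modify df; assign the result back to df['gender'] to keep it.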
Alternatively, you can use pd.merge, which might be better for larger datasets.
import pandas as pd
df = pd.DataFrame({'gender':[0,0,2,1,1,2]})
df_map = pd.DataFrame({'gender': [0, 1, 2], 'gender_new': ['undefined', 'male', 'female']})
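# how='left' keeps every row of df in its original order, so the result lines up with df's default index
df['gender'] = df.merge(df_map, on='gender', how='left')['gender_new']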
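One thing to watch with the merge approach is that the merged frame gets a fresh 0..n-1 index, while column assignment aligns on the index. A sketch of one way to handle a frame whose index is not the default one; the index values 10 to 15 are made up for illustration, and validate='many_to_one' is an optional safety check rather than part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({'gender': [0, 0, 2, 1, 1, 2]},
                  index=[10, 11, 12, 13, 14, 15])  # non-default index (illustrative)
df_map = pd.DataFrame({'gender': [0, 1, 2],
                       'gender_new': ['undefined', 'male', 'female']})

# how='left' keeps every row of df in its original order;
# validate='many_to_one' raises if df_map contains duplicate keys
merged = df.merge(df_map, on='gender', how='left', validate='many_to_one')

# merged is indexed 0..5, so assign by position (to_numpy) instead of by label
df['gender'] = merged['gender_new'].to_numpy()
```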
You can define a dict:
replace_values = {0 :'undefined', 1 : 'male', 2 : 'female'}
And replace multiple values using replace:
df = df.replace({"gender": replace_values})
Alternatively, replace each value in the column one at a time:
df.gender = df.gender.replace(0, 'undefined')
df.gender = df.gender.replace(1, 'male')
df.gender = df.gender.replace(2, 'female')
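replace and map behave differently for values that are missing from the dict, which can matter when choosing between them. A small sketch on a standalone Series, where the code 3 is deliberately left out of the mapping:

```python
import pandas as pd

s = pd.Series([0, 1, 2, 3], name='gender')  # 3 is deliberately unmapped
replace_values = {0: 'undefined', 1: 'male', 2: 'female'}

replaced = s.replace(replace_values)  # the unmapped 3 is left unchanged
mapped = s.map(replace_values)        # the unmapped 3 becomes NaN
```

So replace is the safer choice when only some values should change, while map makes unexpected codes visible as NaN.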
This is one of the use cases of the map function (use np.select for much faster performance):
gender_text = {0 :'undefined', 1 : 'male', 2 : 'female'}
df['gender'] = df['gender'].map(gender_text)
Or you can use apply:
df['gender'] = df['gender'].apply(lambda x : gender_text[x])
Or you can use np.select:
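import numpy as np
condlist = [df['gender'] == 0,
            df['gender'] == 1,
            df['gender'] == 2]
choicelist = ['undefined',
              'male',
              'female']
# A string default is passed explicitly: recent NumPy versions may reject
# mixing the built-in integer default (0) with string choices
df['gender'] = np.select(condlist, choicelist, default='unknown')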
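Since the codes are 0, 1 and 2, they can also serve directly as positions in an array of labels, which is the one-line lookup the question asks about. A minimal sketch, assuming every code is a valid position and the column has no missing values:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'gender': [0, 0, 2, 1, 1, 2]})
gender_text = np.array(['undefined', 'male', 'female'])

# Integer (fancy) indexing: each code picks the label at that position.
# An out-of-range code would raise IndexError, and NaN codes are not supported.
df['gender'] = gender_text[df['gender'].to_numpy()]
```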
Performance comparison:
%timeit df['gender'] = df['gender'].map(gender_text)
411 µs ± 10.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit df['gender'] = np.select(condlist,choicelist)
101 µs ± 322 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
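These timings depend on the size of the frame, the machine, and the library versions, so they are worth re-running locally. A rough sketch using the standard timeit module; the 100,000-row size, the seed, and the 'unknown' default are arbitrary choices made for illustration:

```python
import timeit

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
codes = pd.Series(rng.integers(0, 3, size=100_000), name='gender')  # arbitrary size
gender_text = {0: 'undefined', 1: 'male', 2: 'female'}

def with_map():
    # dictionary lookup through pandas
    return codes.map(gender_text)

def with_select():
    # one boolean mask per code, resolved by NumPy
    arr = codes.to_numpy()
    condlist = [arr == 0, arr == 1, arr == 2]
    return np.select(condlist, ['undefined', 'male', 'female'], default='unknown')

for fn in (with_map, with_select):
    seconds = timeit.timeit(fn, number=20)
    print(f"{fn.__name__}: {seconds / 20 * 1e3:.2f} ms per call")
```

Both functions return the same labels, so any difference in the printed numbers reflects only the method used.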