
Better way to replace all values in a pandas column

I am looking for a better way to replace all values in a column with other values.

What I currently have is this:

gender_text = ['undefined', 'male', 'female']

df.loc[df['gender'] == 0, 'gender'] = gender_text[0]
df.loc[df['gender'] == 1, 'gender'] = gender_text[1]
df.loc[df['gender'] == 2, 'gender'] = gender_text[2]

df.head()

I was hoping for something more elegant that uses the gender value (0, 1 or 2) as an index into gender_text, so that everything fits in one line.

You can use a dictionary with map:

import pandas as pd
df = pd.DataFrame({'gender':[0,0,2,1,1,2]})

gender_text = {0:'undefined', 1:'male', 2:'female'}
df['gender'].map(gender_text)

# Out[33]: 
# 0    undefined
# 1    undefined
# 2       female
# 3         male
# 4         male
# 5       female
# Name: gender, dtype: object
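
One thing to be aware of with map: values that are missing from the dictionary come back as NaN. A minimal sketch, assuming a hypothetical extra code 3 that has no entry in gender_text, with fillna supplying a fallback label (the choice of 'undefined' as fallback is also an assumption):

```python
import pandas as pd

# Hypothetical data: the code 3 has no entry in the mapping.
df = pd.DataFrame({'gender': [0, 1, 2, 3]})
gender_text = {0: 'undefined', 1: 'male', 2: 'female'}

# Codes without a dictionary entry are mapped to NaN ...
mapped = df['gender'].map(gender_text)

# ... so fill them with a fallback label (an arbitrary choice here).
filled = mapped.fillna('undefined')
print(filled.tolist())
```

If every code is guaranteed to be in the dictionary, the fillna step is unnecessary.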

Alternatively, you can use pd.merge, which might be better for larger datasets.

import pandas as pd
df = pd.DataFrame({'gender':[0,0,2,1,1,2]})
df_map = pd.DataFrame({'gender': [0, 1, 2], 'gender_new': ['undefined', 'male', 'female']})

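# how='left' keeps every row of df in its original order;
# to_numpy() avoids index alignment problems when assigning back
df['gender'] = df.merge(df_map, on='gender', how='left')['gender_new'].to_numpy()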

You can define a dict:

replace_values = {0: 'undefined', 1: 'male', 2: 'female'}

And replace multiple values at once using replace:

df = df.replace({"gender": replace_values}) 

Alternatively, replace each value in the column individually:

df.gender = df.gender.replace(0, 'undefined')
df.gender = df.gender.replace(1, 'male')
df.gender = df.gender.replace(2, 'female')
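
Unlike map, replace leaves values without a dictionary entry untouched. A small sketch of that behaviour, assuming a hypothetical extra code 3 that is not in replace_values:

```python
import pandas as pd

# Hypothetical data: the code 3 has no entry in the mapping.
df = pd.DataFrame({'gender': [0, 1, 2, 3]})
replace_values = {0: 'undefined', 1: 'male', 2: 'female'}

# Known codes are replaced; the unknown code 3 is kept as-is.
replaced = df['gender'].replace(replace_values)
print(replaced.tolist())
```

Which behaviour is preferable depends on whether unexpected codes should stay visible in the data or be flagged as missing.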

This is one of the use cases of the map function (np.select was faster in the timing further down):

gender_text = {0: 'undefined', 1: 'male', 2: 'female'}
df['gender'] = df['gender'].map(gender_text)

Or you can use apply:

df['gender'] = df['gender'].apply(lambda x: gender_text[x])
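
Since the question asks about using the gender value as an index into the original list, the lookup can also be sketched with NumPy integer-array indexing. This assumes every code is a valid integer position in the list (no missing values; a negative code would silently wrap around to the end of the list):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'gender': [0, 0, 2, 1, 1, 2]})
gender_text = ['undefined', 'male', 'female']  # the list from the question

# Integer-array indexing: each code picks the label at that position.
labels = np.array(gender_text)[df['gender'].to_numpy()]
df['gender'] = labels
print(df['gender'].tolist())
```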

Or you can use np.select:

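import numpy as np

condlist = [df['gender'] == 0,
            df['gender'] == 1,
            df['gender'] == 2]

choicelist = ['undefined',
              'male',
              'female']

# default is used where no condition matches; a string default keeps the
# result dtype consistent with the string choices
df['gender'] = np.select(condlist, choicelist, default='undefined')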

Performance comparison:

%timeit df['gender'] = df['gender'].map(gender_text)
411 µs ± 10.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit df['gender'] = np.select(condlist,choicelist)
101 µs ± 322 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
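
Note that the timed statements above overwrite the column on every loop, so later iterations no longer operate on the integer codes. A sketch for re-measuring without mutating the input, assuming hypothetical test data of 100,000 random codes; the numbers will vary with data size and library versions, so no result is claimed here:

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical test data: 100,000 random codes in {0, 1, 2}.
rng = np.random.default_rng(0)
codes = pd.Series(rng.integers(0, 3, size=100_000), name='gender')
gender_text = {0: 'undefined', 1: 'male', 2: 'female'}


def with_map():
    # Dictionary lookup per element; returns a new Series.
    return codes.map(gender_text)


def with_select():
    condlist = [codes == 0, codes == 1, codes == 2]
    choicelist = ['undefined', 'male', 'female']
    # A string default keeps the result dtype consistent with the choices.
    return np.select(condlist, choicelist, default='undefined')


# Time each approach on the same, unmodified input.
for fn in (with_map, with_select):
    seconds = timeit.timeit(fn, number=20)
    print(f'{fn.__name__}: {seconds / 20 * 1e3:.2f} ms per call')
```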
