Using if/else on a pandas Series to create a new Series based on a condition

Question

I have a pandas DataFrame, df. Say it has a column "activity" whose values are either "fun" or "work", and I want to convert it to an integer. What I do is:

df["activity_id"] = 1*(df["activity"]=="fun") + 2*(df["activity"]=="work")

This works, but I only do it this way because I do not know how to put an if/else in there (and with 10 activities it can get complicated).

However, say I now have the opposite problem and want to convert from an id to a string. I cannot use this trick anymore because I cannot multiply a string by a Boolean. How do I do it? Is there a way to use if/else?

Answer 1

You can create a dictionary with the id as the key and the string as the value, and then use the Series map method to convert the integer to a string.

my_map = {1:'fun', 2:'work'}

df['activity']= df.activity_id.map(my_map)
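
To make the map approach easy to try, here is a minimal self-contained sketch. The sample ids are made up for illustration (including an id of 3 that has no entry in the dictionary); only my_map and the column names come from the answer above.

```python
import pandas as pd

# Hypothetical sample data; ids follow the question's convention (1 = fun, 2 = work).
df = pd.DataFrame({"activity_id": [1, 2, 1, 3]})

my_map = {1: "fun", 2: "work"}

# Series.map looks each value up in the dictionary.
# Ids missing from the dictionary (3 here) become NaN instead of raising an error.
df["activity"] = df["activity_id"].map(my_map)

print(df)
```

If unmapped ids should get a placeholder rather than NaN, chaining .fillna("unknown") after the map call is one way to do it.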

Answer 2

You could instead convert your activity column to the categorical dtype:

df['activity'] = pd.Categorical(df['activity'])

You then automatically have access to integer labels for the values via df['activity'].cat.codes.

import pandas as pd

df = pd.DataFrame({'activity':['fun','work','fun']})
df['activity'] = pd.Categorical(df['activity'])

print(df['activity'].cat.codes)
0    0
1    1
2    0
dtype: int8

Meanwhile, the string values can still be accessed as before, while saving memory:

print(df)

still yields

  activity
0      fun
1     work
2      fun
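
For the reverse direction asked about in the question (integer back to string), a categorical column keeps the lookup table in .cat.categories. The sketch below assumes the same three-row example and uses pd.Categorical.from_codes to rebuild the labels. Note that the codes start at 0, not at 1 as in the question's own numbering.

```python
import pandas as pd

df = pd.DataFrame({"activity": ["fun", "work", "fun"]})
df["activity"] = pd.Categorical(df["activity"])

codes = df["activity"].cat.codes             # integer labels, one per row
categories = df["activity"].cat.categories   # the distinct string labels

# Rebuild the string labels from the integer codes.
restored = pd.Categorical.from_codes(codes, categories=categories)

print(list(restored))
```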

Answer 3

You could also use a dictionary and a list comprehension to recalculate values for an entire column. This makes it easy to define the reverse mapping as well:

>>> import pandas as pd
>>> forward_map = {'fun': 1, 'work': 2}
>>> reverse_map = {v: k for k, v in forward_map.items()}
>>> df = pd.DataFrame(
    {'activity': ['work', 'work', 'fun', 'fun', 'work'],
     'detail': ['reports', 'coding', 'hiking', 'games', 'secret games']})
>>> df

  activity        detail
0     work       reports
1     work        coding
2      fun        hiking
3      fun         games
4     work  secret games

>>> df['activity'] = [forward_map[i] for i in df['activity']]
>>> df

   activity        detail
0         2       reports
1         2        coding
2         1        hiking
3         1         games
4         2  secret games

>>> df['activity'] = [reverse_map[i] for i in df['activity']]
>>> df

  activity        detail
0     work       reports
1     work        coding
2      fun        hiking
3      fun         games
4     work  secret games
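
None of the answers above uses a literal if/else. As a possible alternative that is not part of the original answers, numpy.where behaves like a vectorized if/else for two cases, and numpy.select handles several cases with a default. The sketch below uses made-up ids, and the "unknown" default is an assumption chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; 3 is an id with no matching activity.
df = pd.DataFrame({"activity_id": [2, 2, 1, 1, 3]})

# Two cases: "fun" where the condition holds, "work" everywhere else.
df["two_way"] = np.where(df["activity_id"] == 1, "fun", "work")

# Several cases: conditions are checked in order; rows matching none get the default.
conditions = [df["activity_id"] == 1, df["activity_id"] == 2]
choices = ["fun", "work"]
df["activity"] = np.select(conditions, choices, default="unknown")

print(df)
```

With only two outcomes numpy.where is enough, but it silently maps every non-matching id to the else branch, which is why the numpy.select version carries an explicit default.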

Answer 1: score 5, accepted, 2016-12-19 22:31:49

Answer 2: score 2, 2016-12-19 22:48:49

Answer 3: score 1, 2016-12-19 22:57:48