How do I create a new column with values from one column or another based on the value in a third column using pandas?

So, as an example, say I have the table:

Activity  Start Time  End Time
Red on    2:15        3:00
Red on    2:30        3:15
Red off   1:45        2:30
Red off   2:45        3:30

Based on the activity, I only care about one of two values - for 'Red on' I need to know the start time. For 'Red off' I need to know the end time.

I want to create a fourth column just labeled 'Change Time', and based on whether 'Activity' is 'Red on' or 'Red off' I want to grab either the Start Time column value or the End Time column value for this fourth column. Later, I'm going to be discarding the Start Time and End Time columns and just keeping this newly-merged Change Time column. For now, I'm just trying to work out how to create it.

With this example, the result I want is:

Activity  Start Time  End Time  Change Time
Red on    2:15        3:00      2:15
Red on    2:30        3:15      2:30
Red off   1:45        2:30      2:30
Red off   2:45        3:30      3:30

Let's assume Red on and Red off are the only two possible values for the Activity column. I thought I had an idea of how to do this, but both things I've tried have thrown errors.

First, I tried:

df['Change time'] = df['Activity'].apply(lambda x: df['Start Time'] if x == 'Red on' else df['End Time'])

And I got an error that said "ValueError: Wrong number of items passed 35, placement implies 1" - since I have 35 rows, that leads me to believe df['Start Time'] was trying to pass the whole column. So, instead, I tried:

df['Change time'] = df['Activity'].apply(lambda x: df.loc['Start Time'] if x == 'Red on' else df.loc['End Time'])

And this one just gives me KeyError: 'Start Time'.

What am I missing to check the string value of the 'Activity' column and pass the value in the 'Start Time' column if 'Activity' == 'Red on', and the value in the 'End Time' column otherwise?

Assuming you only have "Red on"/"Red off" in Activity, this is a use case for numpy.where:

df['Change time'] = np.where(df['Activity'].eq('Red on'),
                             df['Start Time'], df['End Time'])
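
As a self-contained sketch of the numpy.where approach, the snippet below rebuilds the question's four-row table by hand (the times are kept as plain strings, which is an assumption about how the real data is stored) and then applies the same call:

```python
import numpy as np
import pandas as pd

# Sample data copied from the question's table; times are plain strings here
df = pd.DataFrame({
    'Activity': ['Red on', 'Red on', 'Red off', 'Red off'],
    'Start Time': ['2:15', '2:30', '1:45', '2:45'],
    'End Time': ['3:00', '3:15', '2:30', '3:30'],
})

# Row by row: take 'Start Time' where the condition holds, else 'End Time'
df['Change time'] = np.where(df['Activity'].eq('Red on'),
                             df['Start Time'], df['End Time'])

print(df)
```

Because the condition and both value columns have the same length, the selection is done element-wise for the whole column at once, with no per-row Python function call.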

If you potentially have other values, use numpy.select:

df['Change time'] = np.select([df['Activity'].eq('Red on'), 
                               df['Activity'].eq('Red off')],
                              [df['Start Time'], df['End Time']], pd.NA)

or loc:

df.loc[df['Activity'].eq('Red on'), 'Change time'] = df['Start Time']
df.loc[df['Activity'].eq('Red off'), 'Change time'] = df['End Time']

output:

  Activity Start Time End Time Change time
0   Red on       2:15     3:00        2:15
1   Red on       2:30     3:15        2:30
2  Red off       1:45     2:30        2:30
3  Red off       2:45     3:30        3:30
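
To see how the numpy.select variant treats a value outside the two expected ones, here is a small sketch. The 'Blue on' row is hypothetical and not part of the question's data; it is there only to exercise the default.

```python
import numpy as np
import pandas as pd

# 'Blue on' is a made-up third activity, added only to exercise the default
df = pd.DataFrame({
    'Activity': ['Red on', 'Red off', 'Blue on'],
    'Start Time': ['2:15', '1:45', '4:00'],
    'End Time': ['3:00', '2:30', '4:30'],
})

conditions = [df['Activity'].eq('Red on'), df['Activity'].eq('Red off')]
choices = [df['Start Time'], df['End Time']]

# Rows matching neither condition receive the default (a missing value here)
df['Change time'] = np.select(conditions, choices, default=pd.NA)

print(df)
```

The first two rows pick up their start and end time respectively, while the unmatched row is left as a missing value instead of silently falling into the "else" branch as it would with numpy.where.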

You can do this straightforwardly in pandas:

df['Change Time'] = df['Start Time']
# use .loc rather than chained indexing, which may assign to a temporary copy
df.loc[df['Activity'] == 'Red off', 'Change Time'] = df['End Time']
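
For comparison, the apply-based attempts in the question fail because the lambda only ever sees the Activity string and then returns an entire column (df['Start Time']) for each row. A minimal sketch of a row-wise version, assuming the question's sample data, passes axis=1 to DataFrame.apply so the lambda receives one row at a time. This is generally slower than the vectorized options on large frames. The final drop follows the question's stated plan of keeping only the merged column.

```python
import pandas as pd

# Sample data copied from the question's table
df = pd.DataFrame({
    'Activity': ['Red on', 'Red on', 'Red off', 'Red off'],
    'Start Time': ['2:15', '2:30', '1:45', '2:45'],
    'End Time': ['3:00', '3:15', '2:30', '3:30'],
})

# axis=1 hands the lambda one row at a time, so row['Start Time'] is a scalar
df['Change Time'] = df.apply(
    lambda row: row['Start Time'] if row['Activity'] == 'Red on' else row['End Time'],
    axis=1,
)

# Keep only the merged column, as the question plans to do later
result = df.drop(columns=['Start Time', 'End Time'])
print(result)
```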
