
Python groupby - Create a new column based on values in other columns

I have a very large dataframe.
I want to group by the column 'id' first.
Then I want to create a new column 'reply_time' based on the other existing columns.

import pandas as pd
import numpy as np

id = ['793601486525702000','793601486525702000','793601710614802000','793601355214561000','793601355214561000','793601355214561000','793601355214561000','788130215436230000','788130215436230000','788130215436230000','788130215436230000','788130215436230000']
time = ['11/1/2016 16:53','11/1/2016 16:53','11/1/2016 16:52','11/1/2016 16:55','11/1/2016 16:53','11/1/2016 16:53','11/1/2016 16:51','11/1/2016 3:09','11/1/2016 3:04','11/1/2016 2:36','11/1/2016 2:08','11/1/2016 0:28']
reply = ['3','3','0','3','3','2','1','3','2','3','3','1']

df = pd.DataFrame({"id": id, "time": time, "reply": reply})

        id                 time       reply 
793601486525702000  11/1/2016 16:53     3       
793601486525702000  11/1/2016 16:53     3       
793601710614802000  11/1/2016 16:52     0       
793601355214561000  11/1/2016 16:55     3       
793601355214561000  11/1/2016 16:53     3       
793601355214561000  11/1/2016 16:53     2       
793601355214561000  11/1/2016 16:51     1   
788130215436230000  11/1/2016 3:09      3       
788130215436230000  11/1/2016 3:04      2       
788130215436230000  11/1/2016 2:36      3       
788130215436230000  11/1/2016 2:08      3       
788130215436230000  11/1/2016 0:28      1   

The new column 'reply_time' holds one of two kinds of values:

  1. A time: within each 'id' group, the row where reply = '1' gets the 'time' value of the row where reply = '2'.
  2. 'na': every row that doesn't meet the condition above is assigned 'na'.
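
To make the rule concrete, here is a minimal sketch on a few rows taken from the data above. It assumes each 'id' has at most one reply = '2' row that matters (the first one is used if there are several), and the helper name `time_of_2` is made up for illustration. Missing values come out as NaN rather than the string 'na'.

```python
import pandas as pd

# A few rows from the question's data: one id with replies 3/3/2/1, one id with only reply 0.
df = pd.DataFrame({
    "id": ["793601710614802000", "793601355214561000", "793601355214561000",
           "793601355214561000", "793601355214561000"],
    "time": ["11/1/2016 16:52", "11/1/2016 16:55", "11/1/2016 16:53",
             "11/1/2016 16:53", "11/1/2016 16:51"],
    "reply": ["0", "3", "3", "2", "1"],
})

# For each id, the time of its reply == '2' row (hypothetical helper name;
# drop_duplicates keeps only the first such row per id).
time_of_2 = df[df["reply"] == "2"].drop_duplicates("id").set_index("id")["time"]

# Look that time up by id, then keep it only on rows where reply == '1'.
df["reply_time"] = df["id"].map(time_of_2).where(df["reply"] == "1")

print(df)
```

Only the last row (reply = '1') ends up with a time; every other row is NaN.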

In this case, my output dataframe will be:

        id                 time       reply   reply_time
793601486525702000  11/1/2016 16:53     3        na
793601486525702000  11/1/2016 16:53     3        na
793601710614802000  11/1/2016 16:52     0        na
793601355214561000  11/1/2016 16:55     3        na
793601355214561000  11/1/2016 16:53     3        na
793601355214561000  11/1/2016 16:53     2        na
793601355214561000  11/1/2016 16:51     1    11/1/2016 16:53
788130215436230000  11/1/2016 3:09      3        na
788130215436230000  11/1/2016 3:04      2        na
788130215436230000  11/1/2016 2:36      3        na
788130215436230000  11/1/2016 2:08      3        na
788130215436230000  11/1/2016 0:28      1    11/1/2016 3:04 

I don't know the best way to achieve this. Can anyone help?

Thanks in advance!

Try merge with replace after slicing:

yourdf = df.merge(df.query("reply == '2'").replace({'reply': {'2': '1'}}).rename(columns={'time': 'reply_time'}), how='left')
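
The one-liner packs three steps together, so here is a sketch of the same idea spelled out on the question's data. The intermediate names (`twos`, `lookup`, `out`) are made up for readability, the merge keys are written out explicitly instead of relying on the shared column names, and the final `fillna("na")` is an addition to match the 'na' placeholder in the expected output.

```python
import pandas as pd

ids = ["793601486525702000"] * 2 + ["793601710614802000"] \
    + ["793601355214561000"] * 4 + ["788130215436230000"] * 5
times = ["11/1/2016 16:53", "11/1/2016 16:53", "11/1/2016 16:52",
         "11/1/2016 16:55", "11/1/2016 16:53", "11/1/2016 16:53",
         "11/1/2016 16:51", "11/1/2016 3:09", "11/1/2016 3:04",
         "11/1/2016 2:36", "11/1/2016 2:08", "11/1/2016 0:28"]
replies = ["3", "3", "0", "3", "3", "2", "1", "3", "2", "3", "3", "1"]
df = pd.DataFrame({"id": ids, "time": times, "reply": replies})

# 1. Slice: keep only the reply == '2' rows.
twos = df.query("reply == '2'")

# 2. Relabel: pretend those rows are reply == '1' so they line up with the
#    rows that should receive the time, and rename 'time' to 'reply_time'.
lookup = twos.replace({"reply": {"2": "1"}}).rename(columns={"time": "reply_time"})

# 3. Left merge on (id, reply): only reply == '1' rows find a match.
out = df.merge(lookup, on=["id", "reply"], how="left")

# Optional: use the string 'na' instead of NaN, as in the expected output.
out["reply_time"] = out["reply_time"].fillna("na")

print(out)
```

One caveat: if an id has more than one reply = '2' row, the merge repeats that id's reply = '1' row once per match, so you may want to call `drop_duplicates(["id", "reply"])` on `lookup` first.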

