
Python: Function to fill in the previous row of a non-null value

I have a dataset consisting mostly of timedelta values that record the shift lengths worked by emergency workers. If certain conditions were met, the shift time was combined with the prior shift's length ['Combined Time'].

What I'm having trouble producing is the 'Final Times' column. To avoid double-counting hours worked, if a shift was combined (for example rows 3 and 6), then the previous row should show NaT or 0:00 hours, and any other row should return the ['Shift Time'] value.


I've been trying to write a function that I can apply to produce the ['Final Times'] column, but I'm having trouble accessing the row before a 'Combined Time' value. What I have so far gets me two thirds of the way, but I'm completely lost on the part (a second if or elif statement) that fills in the NaT/zero value.

import pandas as pd

def my_func(x):
    # Use the combined time when one exists, otherwise the plain shift time
    if pd.notnull(x['Combined Time']):
        return x['Combined Time']
    else:
        return x['Shift Time']

df['Final Times'] = df.apply(my_func, axis=1)

Any assistance would be much appreciated!

Cheers

You can use pandas where() + bfill() to mark the row just before each combined value with a "check" value (a zero timedelta), which my_func() then tests when calculating 'Final Times'.

df['Combined Time'] = df['Combined Time'].where(
                            df['Combined Time'].bfill(limit=1).isnull(), 
                            df['Combined Time'].fillna(pd.Timedelta('0:00:00')))
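
To make the one-liner above easier to follow, here is a minimal sketch that splits it into its intermediate steps. The sample Series and the names combined, backfilled, keep_as_is, and marked are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample column mirroring the 'Combined Time' data in the question
combined = pd.Series(
    [pd.NaT, pd.NaT, pd.Timedelta('8:19:48'),
     pd.NaT, pd.NaT, pd.Timedelta('16:00:00')],
    dtype='timedelta64[ns]',
)

# Step 1: pull each non-null value back by at most one row
backfilled = combined.bfill(limit=1)

# Step 2: True where the row is still empty after the backfill, i.e. the row
# is null and the row right after it is not a combined value
keep_as_is = backfilled.isnull()

# Step 3: keep the original value where keep_as_is is True; elsewhere take the
# zero-filled value (non-null rows keep their own value, and the null row just
# before a combined value becomes the zero "check" marker)
marked = combined.where(keep_as_is, combined.fillna(pd.Timedelta(0)))
print(marked)
```

One caveat with this marker approach: a combined time that is genuinely zero would be indistinguishable from the "check" value.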

Modified function:

def my_func(x):
    if pd.notnull(x['Combined Time']):
        if x['Combined Time'] == pd.Timedelta('0:00:00'):
            return pd.NaT
        else:
            return x['Combined Time']
    else:
        return x['Shift Time']

Apply:

df['Final Times'] = df.apply(my_func, axis=1)
df

Result:

    Shift Time       Combined Time      Final Times
0   0 days 13:00:00  NaT                0 days 13:00:00
1   0 days 07:00:00  0 days 00:00:00    NaT
2   0 days 01:19:00  0 days 08:19:48    0 days 08:19:48
3   0 days 07:00:00  NaT                0 days 07:00:00
4   0 days 14:00:00  0 days 00:00:00    NaT
5   0 days 02:00:00  0 days 16:00:00    0 days 16:00:00
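
As an alternative to the row-wise apply, the same 'Final Times' column can probably be built with vectorized operations. This is only a sketch under the assumption that the data looks like the sample in this thread; it swaps in shift(-1) and mask() rather than the where() + bfill() marker, and the names next_is_combined and final are hypothetical.

```python
import pandas as pd

# Sample data matching the thread (built with NaT so both columns are timedelta)
df = pd.DataFrame({
    'Shift Time': pd.to_timedelta(
        ['13:00:00', '7:00:00', '1:19:00', '7:00:00', '14:00:00', '2:00:00']),
    'Combined Time': pd.Series(
        [pd.NaT, pd.NaT, pd.Timedelta('8:19:48'),
         pd.NaT, pd.NaT, pd.Timedelta('16:00:00')],
        dtype='timedelta64[ns]'),
})

# True on rows that have no combined time themselves but whose NEXT row does
next_is_combined = (
    df['Combined Time'].isna() & df['Combined Time'].shift(-1).notna()
)

# Prefer the combined time; otherwise fall back to the shift time
final = df['Combined Time'].fillna(df['Shift Time'])

# Blank out (NaT) the rows whose hours are already counted in the next row
df['Final Times'] = final.mask(next_is_combined)
print(df)
```

This version leaves 'Combined Time' untouched, so no zero marker is needed and a genuine zero-length combined time would not be misread.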

Load data:
(Please paste your data as formatted code instead of screenshots.)

import numpy as np
import pandas as pd

df = pd.DataFrame({'Shift Time': [pd.Timedelta('13:00:00'),
                                  pd.Timedelta('7:00:00'),
                                  pd.Timedelta('1:19:00'),
                                  pd.Timedelta('7:00:00'),
                                  pd.Timedelta('14:00:00'),
                                  pd.Timedelta('2:00:00')],
                   'Combined Time': [np.nan, np.nan,
                                     pd.Timedelta('8:19:48'),
                                     np.nan,
                                     np.nan,
                                     pd.Timedelta('16:00:00')],
                   # Placeholder column, overwritten by the apply step
                   'Final Times': [np.nan] * 6})


 