How to calculate the mean of multiple Python Pandas datetime64[ns] values per row of the dataframe?
MC_schedule_df:
Act_Arr_Run-0 Act_Arr_Run-1 Act_Arr_Run-2 Act_Arr_Run-3
0 2005-08-05 05:15:08 2005-08-05 05:12:00 2005-08-05 05:16:50 2005-08-05 05:09:13
1 2005-08-05 06:18:30 2005-08-05 06:14:50 2005-08-05 06:14:29 2005-08-05 06:07:31
2 2005-08-05 06:22:17 2005-08-05 06:18:06 2005-08-05 06:26:25 2005-08-05 06:22:49
3 2005-08-05 08:52:56 2005-08-05 08:58:51 2005-08-05 09:05:27 2005-08-05 08:58:43
4 2005-08-05 13:04:24 2005-08-05 12:58:11 2005-08-05 13:05:41 2005-08-05 13:02:33
5 2005-08-05 13:22:08 2005-08-05 13:14:44 2005-08-05 13:09:08 2005-08-05 13:12:27
6 2005-08-05 14:26:38 2005-08-05 14:13:38 2005-08-05 14:17:31 2005-08-05 14:17:33
7 2005-08-05 18:08:41 2005-08-05 18:17:15 2005-08-05 18:14:21 2005-08-05 18:15:54
8 2005-08-05 19:46:15 2005-08-05 19:45:28 2005-08-05 19:46:20 2005-08-05 19:48:44
9 2005-08-05 23:13:53 2005-08-05 23:06:06 2005-08-05 23:06:25 2005-08-05 23:04:07
Hello,
I have the dataframe (MC_schedule_df) shown above, consisting of the following datatypes:
In[1]: MC_schedule_df.dtypes
Out[1]:
Act_Arr_Run-0 datetime64[ns]
Act_Arr_Run-1 datetime64[ns]
Act_Arr_Run-2 datetime64[ns]
Act_Arr_Run-3 datetime64[ns]
dtype: object
The dataframe consists of rows of datetime values, and I want to calculate the mean per row. I have tried the following code:
MC_schedule_df = MC_schedule_df.assign(Average=MC_schedule_df.mean(axis=1))
This results in a column filled with NaN values. I have tried to find out why this does not work and have read loads of documentation. My current guess is that Python is not able to 'distill' the appropriate information from the datetime values to calculate the mean.
How can I calculate the mean of these multiple Python Pandas datetime64[ns] values? Any help is appreciated.
Edit: I tried the methods from "Datetime objects with pandas mean function". However, that approach does not work here, as I want to calculate the mean per row and thus cannot easily call it on a Series.
You can use what is shown in this answer. As pointed out in the link, you cannot calculate the mean of a bunch of dates directly; the operation is not supported. But you can calculate the average of a bunch of timedeltas.
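As a minimal sketch of that idea for a single set of timestamps (the values are taken from row 0 of the sample data; the names times, reference and offsets are only illustrative): subtract a reference timestamp to get timedeltas, average those, and add the reference back.

```python
import pandas as pd

# Four arrival times from row 0 of the sample data, as a datetime64[ns] Series.
times = pd.Series(pd.to_datetime([
    "2005-08-05 05:15:08",
    "2005-08-05 05:12:00",
    "2005-08-05 05:16:50",
    "2005-08-05 05:09:13",
])).astype("datetime64[ns]")

# Use the earliest timestamp as the reference point.
reference = times.min()

# Subtracting a timestamp from datetimes yields timedeltas, which can be averaged.
offsets = times - reference

# Shift the mean offset back onto the reference to get the mean timestamp.
mean_time = reference + offsets.mean()
print(mean_time)
```

Any fixed reference timestamp would do here; the minimum is just a convenient choice that keeps all offsets non-negative.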
Use the pandas apply function to generalize it and apply it to a DataFrame instead of a Series:
mean_values = MC_schedule_df.apply(lambda dt : (dt - dt.min()).mean() + dt.min(), axis=1)
Using your sample dataframe, mean_values is:
0 2005-08-05 05:13:17.750
1 2005-08-05 06:13:50.000
2 2005-08-05 06:22:24.250
3 2005-08-05 08:58:59.250
4 2005-08-05 13:02:42.250
5 2005-08-05 13:14:36.750
6 2005-08-05 14:18:50.000
7 2005-08-05 18:14:02.750
8 2005-08-05 19:46:41.750
9 2005-08-05 23:07:37.750
dtype: datetime64[ns]
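For anyone who wants to reproduce this, here is a self-contained sketch that rebuilds only the first two rows of the sample dataframe (the construction code is an assumption; the question does not show how MC_schedule_df was created) and then applies the same row-wise expression, attaching the result as an Average column as the question intended.

```python
import pandas as pd

# Rebuild the first two rows of the sample data (construction is illustrative only).
MC_schedule_df = pd.DataFrame({
    "Act_Arr_Run-0": pd.to_datetime(["2005-08-05 05:15:08", "2005-08-05 06:18:30"]),
    "Act_Arr_Run-1": pd.to_datetime(["2005-08-05 05:12:00", "2005-08-05 06:14:50"]),
    "Act_Arr_Run-2": pd.to_datetime(["2005-08-05 05:16:50", "2005-08-05 06:14:29"]),
    "Act_Arr_Run-3": pd.to_datetime(["2005-08-05 05:09:13", "2005-08-05 06:07:31"]),
}).astype("datetime64[ns]")

# Row-wise mean: average the offsets from each row's minimum, then add the minimum back.
mean_values = MC_schedule_df.apply(
    lambda dt: (dt - dt.min()).mean() + dt.min(), axis=1
)

# Attach the result as a new column, as attempted in the question.
MC_schedule_df = MC_schedule_df.assign(Average=mean_values)
print(MC_schedule_df["Average"])
```

The mean is computed before the Average column is added, so the new column does not feed back into the calculation.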