How to assign datetime values of a dataframe to the next 15 min timestep without using min/max/sum or mean?
I've got a dataframe with power profiles. The dataframe shows the start time, the end time and the power consumed during each transaction. It looks something like this:

| TransactionId | StartTime | EndTime | Power |
|---|---|---|---|
| xyza123 | 2018.01.01 07:07:34 | 2018.01.01 07:34:08 | 70 |
| hjker383 | 2018.01.01 10:21:00 | 2018.01.01 11:40:08 | 23 |

My goal is to assign a new start time and end time that fall on 15-minute values, like so:

| TransactionId | StartTime | New Starttime | EndTime | New EndTime | Power |
|---|---|---|---|---|---|
| xyza123 | 2018.01.01 07:07:34 | 2018.01.01 07:00:00 | 2018.01.01 07:34:08 | 2018.01.01 07:30:00 | 70 |
| hjker383 | 2018.01.01 10:21:00 | 2018.01.01 10:30:00 | 2018.01.01 11:40:08 | 2018.01.01 11:45:00 | 23 |

The old timestamps can be deleted afterwards. However, I don't want to aggregate them, so I guess

    df.groupby(pd.Grouper(key="StartTime", freq="15min")).sum()

or

    df.groupby(pd.Grouper(key="StartEndtime", freq="15min")).mean()

etc. is not an option. Another idea I had was creating a dataframe with values between 2018.01.01 00:00:00 and 2018.01.01 23:45:00. However, I am not sure how to iterate through the two dataframes to achieve my goal, or whether iterating through dataframes is a good idea in the first place.

You can use a function to convert a datetime to the nearest 15 minutes and then apply it to the column. This function was inspired by this link:

    import datetime

    import pandas as pd

    def convertToNearest15(tm):
        # Part of the timestamp that lies past the last 15-minute mark
        discard = datetime.timedelta(minutes=tm.minute % 15,
                                     seconds=tm.second,
                                     microseconds=tm.microsecond)
        tm -= discard  # floor to the 15-minute mark
        # Round up when at least half an interval (7.5 min) was discarded
        if discard >= datetime.timedelta(minutes=7.5):
            tm += datetime.timedelta(minutes=15)
        return tm

    df['startTime'] = pd.to_datetime(df['startTime'])
    df['newStartTime'] = df['startTime'].apply(convertToNearest15)
    df['endTime'] = pd.to_datetime(df['endTime'])
    df['newEndTime'] = df['endTime'].apply(convertToNearest15)

Here's the result:

| id | startTime | endTime | newStartTime | newEndTime |
|---|---|---|---|---|
| xyza123 | 2018-01-01 07:07:34 | 2018-01-01 10:21:00 | 2018-01-01 07:15:00 | 2018-01-01 10:15:00 |
| hjker383 | 2018-01-01 07:34:08 | 2018-01-01 11:40:08 | 2018-01-01 07:30:00 | 2018-01-01 11:45:00 |

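If the columns are already pandas datetimes, the same nearest-15-minute rounding can probably be done without `apply`. This is a minimal sketch, assuming the question's column names and date format and using `Series.dt.round`. It may treat a timestamp lying exactly halfway between two marks differently from `convertToNearest15`.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "TransactionId": ["xyza123", "hjker383"],
    "StartTime": ["2018.01.01 07:07:34", "2018.01.01 10:21:00"],
    "EndTime": ["2018.01.01 07:34:08", "2018.01.01 11:40:08"],
    "Power": [70, 23],
})

fmt = "%Y.%m.%d %H:%M:%S"  # assumed from the question's timestamps
for col in ["StartTime", "EndTime"]:
    df[col] = pd.to_datetime(df[col], format=fmt)
    # Vectorized rounding to the nearest 15-minute mark; no rows are merged
    df["New " + col] = df[col].dt.round("15min")

print(df)
```

Every transaction keeps its own row, so nothing is aggregated.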
Resample the StartTime and EndTime to a 15-minute frequency:

    df['StartTime'] = pd.to_datetime(df.StartTime)
    df['EndTime'] = pd.to_datetime(df.EndTime)
    # first() keeps one row per 15-minute bin; dropna() removes the empty bins
    df = df.resample('15min', on='StartTime').first().dropna().rename_axis('New Starttime').reset_index()
    df = df.resample('15min', on='EndTime').first().dropna().rename_axis('New EndTime').reset_index()

Output (please rearrange the df columns as required):

                New EndTime       New Starttime TransactionId           StartTime             EndTime  Power
    0 2018-01-01 07:30:00 2018-01-01 07:00:00       xyza123 2018-01-01 07:07:34 2018-01-01 07:34:08   70.0
    1 2018-01-01 11:30:00 2018-01-01 10:15:00      hjker383 2018-01-01 10:21:00 2018-01-01 11:40:08   23.0

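If several transactions start in the same 15-minute window, a row-wise floor keeps every row, since nothing is grouped. This is a minimal sketch using `Series.dt.floor`; the transaction `abcd456` is a made-up extra row, added only to show two transactions sharing one window.

```python
import pandas as pd

df = pd.DataFrame({
    # "abcd456" is a hypothetical row, not part of the question's data
    "TransactionId": ["xyza123", "abcd456", "hjker383"],
    "StartTime": pd.to_datetime(
        ["2018-01-01 07:07:34", "2018-01-01 07:09:10", "2018-01-01 10:21:00"]
    ),
    "EndTime": pd.to_datetime(
        ["2018-01-01 07:34:08", "2018-01-01 07:20:00", "2018-01-01 11:40:08"]
    ),
    "Power": [70, 15, 23],
})

# Floor each timestamp to the start of its 15-minute window, row by row
df["New Starttime"] = df["StartTime"].dt.floor("15min")
df["New EndTime"] = df["EndTime"].dt.floor("15min")

print(df)
```

Both `xyza123` and `abcd456` end up with the same new start time and remain separate rows.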
Another option is to floor each timestamp to its 15-minute mark with `replace`:

    df['new_date'] = df['date'].apply(lambda x: x.replace(minute=x.minute // 15 * 15, second=0, microsecond=0))

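The title asks for the next 15-minute timestep, which reads like rounding up rather than down. This is a sketch under that assumption, using `Series.dt.ceil` on the question's columns and then dropping the old timestamps. A value that already sits on a 15-minute mark is left unchanged.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "TransactionId": ["xyza123", "hjker383"],
    "StartTime": ["2018.01.01 07:07:34", "2018.01.01 10:21:00"],
    "EndTime": ["2018.01.01 07:34:08", "2018.01.01 11:40:08"],
    "Power": [70, 23],
})

fmt = "%Y.%m.%d %H:%M:%S"  # assumed from the question's timestamps
start = pd.to_datetime(df["StartTime"], format=fmt)
end = pd.to_datetime(df["EndTime"], format=fmt)

# Round each timestamp up to the next 15-minute mark
df["New Starttime"] = start.dt.ceil("15min")
df["New EndTime"] = end.dt.ceil("15min")

# The old timestamps are no longer needed
df = df.drop(columns=["StartTime", "EndTime"])

print(df)
```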