How to make this row-wise operation performant (python)?
My issue is very simple, but I just can't wrap my head around it. I have two dataframes:
a timeseries dataframe with two columns: Timestamp and DataValue
a time interval dataframe with start and end timestamps and a label
What I want to do:
Add a third column to the timeseries dataframe that holds the labels given by the time interval dataframe. Every timepoint needs to have an assigned label designated by the time interval dataframe.
This code works:
TimeSeries_labelled = TimeSeries.copy(deep=True)
TimeSeries_labelled["State"] = 0
for index in Timeintervals_States.index:
    for entry in TimeSeries_labelled.index:
        if Timeintervals_States.loc[index, "start"] <= TimeSeries_labelled.loc[entry, "Timestamp"] <= Timeintervals_States.loc[index, "end"]:
            TimeSeries_labelled.loc[entry, "State"] = Timeintervals_States.loc[index, "state"]
But it is really slow. I tried to make it shorter and faster with Python's built-in filtering, but failed miserably. Please help!
I don't really know about TimeSeries, but with a dataframe containing timestamps as datetime objects you could use something like the following:
import pandas as pd

# Create the third column in the target dataframe
df_timeseries['label'] = pd.Series('', index=df_timeseries.index)

# Loop over the dataframe containing start and end timestamps
for index, row in df_start_end.iterrows():
    # Create a boolean mask to filter data (bounds inclusive, as in the question's condition)
    mask = (df_timeseries['timestamp'] >= row['start']) & (df_timeseries['timestamp'] <= row['end'])
    df_timeseries.loc[mask, 'label'] = row['label']
For each row of the dataframe containing start and end timestamps, this assigns that row's label to every row of the timeseries dataframe that matches the mask.
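If the number of intervals is also large, the loop over intervals can be avoided as well. The following is only a sketch of one possible alternative, not part of the original answer: it uses pandas' IntervalIndex to look up all timestamps at once. The sample data is made up for illustration, the column names follow the code above, and it assumes the intervals do not overlap (with closed="both", two intervals that share an endpoint count as overlapping and the lookup raises an error).

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names follow the answer above.
df_timeseries = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2021-01-01 00:00", "2021-01-01 00:30",
        "2021-01-01 01:00", "2021-01-01 02:30",
    ]),
    "value": [1.0, 2.0, 3.0, 4.0],
})
df_start_end = pd.DataFrame({
    "start": pd.to_datetime(["2021-01-01 00:00", "2021-01-01 01:00"]),
    "end": pd.to_datetime(["2021-01-01 00:45", "2021-01-01 02:00"]),
    "label": ["idle", "running"],
})

# One interval per row of df_start_end, with both endpoints included.
intervals = pd.IntervalIndex.from_arrays(
    df_start_end["start"], df_start_end["end"], closed="both"
)

# For each timestamp, get the position of the interval that contains it,
# or -1 if no interval does. Assumes non-overlapping intervals.
positions = intervals.get_indexer(pd.Index(df_timeseries["timestamp"]))

# Map positions to labels; timestamps outside every interval get ''.
labels = df_start_end["label"].to_numpy()
df_timeseries["label"] = np.where(positions >= 0, labels[positions], "")
```

With this sample data the last timestamp falls outside both intervals, so its label stays empty. If the intervals can overlap, the mask-based loop above is the safer choice, since later intervals simply overwrite earlier ones.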