How to compare two dataframes and create a dictionary?

I would like to retrieve every row of dataframe 2 whose timestamp falls inside one of the intervals defined by dataframe 1. The timestamps of dataframe 1 come in pairs (11 marks the beginning of an interval and 12 the end).

The goal is to create a dictionary of dataframes, one per interval, so that each curve can be plotted.

df1
Timestamp               Num
2021-01-01 08:00:00     11
2021-01-01 09:00:00     12
2021-01-01 10:00:00     11
2021-01-01 11:00:00     12
2021-01-01 12:00:00     11
2021-01-01 13:00:00     12

And

df2
Timestamp               Num
2021-01-01 07:30:00     66
2021-01-01 08:30:00     67
2021-01-01 08:45:00     67
2021-01-01 09:15:00     64
2021-01-01 10:30:00     65
2021-01-01 10:30:00     61
2021-01-01 10:45:00     68
2021-01-01 11:15:00     60
2021-01-01 12:30:00     66
2021-01-01 12:30:00     67
2021-01-01 12:45:00     67

I thought of making a mask, but that only works for a single pair of timestamps from dataframe 1. I have the idea, but I can't write it in Python:

start = df1.iloc[::2]["Timestamp"]   # every 11 (interval start)
end = df1.iloc[1::2]["Timestamp"]    # every 12 (interval end)

# pseudocode:
# for each (start, end) pair:
#     create a df from the rows of df2 between start and end
#     add that df to the dictionary

The final result must be:

final_df1
2021-01-01 08:30:00     67
2021-01-01 08:45:00     67

final_df2
2021-01-01 10:30:00     65
2021-01-01 10:30:00     61
2021-01-01 10:45:00     68

final_df3
2021-01-01 12:30:00     66
2021-01-01 12:30:00     67
2021-01-01 12:45:00     67

Is this what you are looking for?

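import pandas as pd

# Pair the n-th 11 (start) with the n-th 12 (end) to get (start, end) tuples
intervals = df1.assign(Group=df1.groupby('Num').cumcount()) \
               .pivot(index='Group', columns='Num', values='Timestamp').apply(tuple, axis=1)

# Label each row of df2 with the interval that contains its timestamp
df2['Group'] = pd.cut(df2['Timestamp'], bins=pd.IntervalIndex.from_tuples(intervals))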

Output:

>>> df2
             Timestamp  Num                                       Group
0  2021-01-01 07:30:00   66                                         NaN
1  2021-01-01 08:30:00   67  (2021-01-01 08:00:00, 2021-01-01 09:00:00]
2  2021-01-01 08:45:00   67  (2021-01-01 08:00:00, 2021-01-01 09:00:00]
3  2021-01-01 09:15:00   64                                         NaN
4  2021-01-01 10:30:00   65  (2021-01-01 10:00:00, 2021-01-01 11:00:00]
5  2021-01-01 10:30:00   61  (2021-01-01 10:00:00, 2021-01-01 11:00:00]
6  2021-01-01 10:45:00   68  (2021-01-01 10:00:00, 2021-01-01 11:00:00]
7  2021-01-01 11:15:00   60                                         NaN
8  2021-01-01 12:30:00   66  (2021-01-01 12:00:00, 2021-01-01 13:00:00]
9  2021-01-01 12:30:00   67  (2021-01-01 12:00:00, 2021-01-01 13:00:00]
10 2021-01-01 12:45:00   67  (2021-01-01 12:00:00, 2021-01-01 13:00:00]
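
The labelled output stops short of the dictionary the question asks for. As a rough sketch of that last step, the snippet below rebuilds the sample data and collects one dataframe per (start, end) pair with a boolean mask. It assumes that df1 strictly alternates 11, 12, 11, 12 in chronological order and that the Timestamp columns are already datetimes. The `final_dfN` keys and the `frames` name are only illustrative, and the intervals are treated as open on the left and closed on the right to match the pd.cut output. Rows outside every interval (such as the 07:30 and 09:15 ones) are dropped.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "Timestamp": pd.to_datetime([
        "2021-01-01 08:00:00", "2021-01-01 09:00:00",
        "2021-01-01 10:00:00", "2021-01-01 11:00:00",
        "2021-01-01 12:00:00", "2021-01-01 13:00:00",
    ]),
    "Num": [11, 12, 11, 12, 11, 12],
})
df2 = pd.DataFrame({
    "Timestamp": pd.to_datetime([
        "2021-01-01 07:30:00", "2021-01-01 08:30:00", "2021-01-01 08:45:00",
        "2021-01-01 09:15:00", "2021-01-01 10:30:00", "2021-01-01 10:30:00",
        "2021-01-01 10:45:00", "2021-01-01 11:15:00", "2021-01-01 12:30:00",
        "2021-01-01 12:30:00", "2021-01-01 12:45:00",
    ]),
    "Num": [66, 67, 67, 64, 65, 61, 68, 60, 66, 67, 67],
})

# Assumption: the n-th 11 is the start and the n-th 12 is the end of interval n
starts = df1.loc[df1["Num"] == 11, "Timestamp"]
ends = df1.loc[df1["Num"] == 12, "Timestamp"]

frames = {}
for i, (start, end) in enumerate(zip(starts, ends), start=1):
    # Keep rows with start < Timestamp <= end (right-closed, like pd.cut)
    mask = (df2["Timestamp"] > start) & (df2["Timestamp"] <= end)
    frames[f"final_df{i}"] = df2.loc[mask].reset_index(drop=True)

for name, frame in frames.items():
    print(name)
    print(frame)
```

Each value in `frames` is an ordinary dataframe, so a loop over `frames.items()` can plot one curve per interval. If df2 already carries the Group column from pd.cut, grouping on that column should give the same split without the explicit loop.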
