Splitting values from one data frame to another data frame based on certain conditions in another data frame in pandas

I have two data frames, df1 and df2, and I want to bring values from df2 into df1 based on conditions in df1. The conditions involve four columns and differ between the IDs in df1. I need to take the values of one column in df2, split them, and assign the pieces to the rows of df1 so that the sum of the values for each ID is the same in both data frames.

My data has the following format:

[two images: the input tables df1 and df2]

I want to bring the values from df2 into df1 and split them according to Start Day, End Day, Start Time and End Time in df1. For each ID, the sum of the values should be equal in df1 and df2.

Expected output:

[image: expected output table]

Here are the same data frames created in pandas. These two tables are the inputs, and I want the expected result shown above.

import pandas as pd

df1 = pd.DataFrame({'ID': ["Ch1","Ch1","Ch1","Ch1","Ch1","Ch1","Ch2","Ch2","Ch2"],
                    'Start Day': [1,1,1,6,6,6,1,1,1],
                    'End Day': [5,5,5,7,7,7,7,7,7],
                    'Start Time': [600,1200,1700,600,1200,1700,700,1200,1700],
                    'End Time': [1200,1700,2500,1200,1700,2500,1200,1700,2400]})
print(df1)

df2 = pd.DataFrame({'ID': ["Ch1","Ch1","Ch1","Ch2","Ch2","Ch2","Ch2","Ch2","Ch2","Ch2","Ch2","Ch2","Ch2","Ch2"],
                    'Start Day': [1,1,1,1,1,1,1,1,1,1,1,1,6,1],
                    'End Day': [7,7,7,5,5,5,5,5,5,5,5,5,7,7],
                    'Start Time': [600,1200,1700,800,900,1000,1100,1200,1300,1900,2000,2200,700,700],
                    'End Time': [1200,1700,2500,900,1000,1100,1200,1300,1400,2000,2200,2300,2400,2400],
                    'Values':[1125,2250,1125,346.5,346.5,346.5,346.5,346.5,346.5,189,189,346.5,1795.5,346.5]})
print(df2)

Can somebody please help me with this?

Thanks in advance.

As @Tomer S suggested, we really need more detail on how the "Values" should be adjusted from df2 to df1.

Here is an approach that, for each row of df1, sums the values from df2 that have the same ID and whose window lies entirely inside the window defined by that row's Start/End Day and Start/End Time:

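def getValueFromDF2(df1row):
    # Keep the df2 rows with the same ID whose day and time window lies
    # entirely inside the window of this df1 row, then total their values.
    matches = df2[
        (df2["ID"] == df1row["ID"])
        & (df2["Start Day"] >= df1row["Start Day"]) & (df2["End Day"] <= df1row["End Day"])
        & (df2["Start Time"] >= df1row["Start Time"]) & (df2["End Time"] <= df1row["End Time"])
    ]
    return matches["Values"].sum()

# Apply the lookup to every row of df1.
df1["Values"] = df1.apply(getValueFromDF2, axis=1)
print(df1)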

Output:

    ID  Start Day  End Day  Start Time  End Time  Values
0  Ch1          1        5         600      1200     0.0
1  Ch1          1        5        1200      1700     0.0
2  Ch1          1        5        1700      2500     0.0
3  Ch1          6        7         600      1200     0.0
4  Ch1          6        7        1200      1700     0.0
5  Ch1          6        7        1700      2500     0.0
6  Ch2          1        7         700      1200  1386.0
7  Ch2          1        7        1200      1700   693.0
8  Ch2          1        7        1700      2400   724.5
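To see how much of df2 this containment filter actually captures, the per-ID totals can be compared. This is only a rough check: `contained_sum` and `totals` are names made up for the illustration, and the filter is the same one as above, repeated so the snippet runs on its own.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': ["Ch1","Ch1","Ch1","Ch1","Ch1","Ch1","Ch2","Ch2","Ch2"],
                    'Start Day': [1,1,1,6,6,6,1,1,1],
                    'End Day': [5,5,5,7,7,7,7,7,7],
                    'Start Time': [600,1200,1700,600,1200,1700,700,1200,1700],
                    'End Time': [1200,1700,2500,1200,1700,2500,1200,1700,2400]})

df2 = pd.DataFrame({'ID': ["Ch1","Ch1","Ch1","Ch2","Ch2","Ch2","Ch2","Ch2","Ch2","Ch2","Ch2","Ch2","Ch2","Ch2"],
                    'Start Day': [1,1,1,1,1,1,1,1,1,1,1,1,6,1],
                    'End Day': [7,7,7,5,5,5,5,5,5,5,5,5,7,7],
                    'Start Time': [600,1200,1700,800,900,1000,1100,1200,1300,1900,2000,2200,700,700],
                    'End Time': [1200,1700,2500,900,1000,1100,1200,1300,1400,2000,2200,2300,2400,2400],
                    'Values': [1125,2250,1125,346.5,346.5,346.5,346.5,346.5,346.5,189,189,346.5,1795.5,346.5]})

def contained_sum(row):
    # Hypothetical helper: same containment filter as in the answer.
    mask = (
        (df2["ID"] == row["ID"])
        & (df2["Start Day"] >= row["Start Day"]) & (df2["End Day"] <= row["End Day"])
        & (df2["Start Time"] >= row["Start Time"]) & (df2["End Time"] <= row["End Time"])
    )
    return df2.loc[mask, "Values"].sum()

df1["Values"] = df1.apply(contained_sum, axis=1)

# Per-ID totals: what df2 holds versus what ended up in df1.
totals = pd.DataFrame({
    "df2 total": df2.groupby("ID")["Values"].sum(),
    "assigned to df1": df1.groupby("ID")["Values"].sum(),
})
totals["unassigned"] = totals["df2 total"] - totals["assigned to df1"]
print(totals)
```

With the sample data, none of the Ch1 total and only part of the Ch2 total is assigned, because every df2 window that is wider than a df1 window fails the containment test.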

I don't think this is what you want, though.
You stated that the sum of the values in your output should equal the sum of the values in df2. I'm not sure how that can hold, however we aggregate or split them, given that the ID, Day, and Time windows in df2 are not all represented in df1, and vice versa.
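If "split" means dividing each df2 value among the df1 rows it overlaps, one possible reading is a proportional split by overlap. The sketch below is only a guess at the intent, not a confirmed answer. It assumes that days are inclusive, that times are HHMM integers on whole hours (so differences are proportional to duration), and that the weight is overlapping days times overlapping time. The column names `weight`, `share` and `piece` are made up for the illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': ["Ch1","Ch1","Ch1","Ch1","Ch1","Ch1","Ch2","Ch2","Ch2"],
                    'Start Day': [1,1,1,6,6,6,1,1,1],
                    'End Day': [5,5,5,7,7,7,7,7,7],
                    'Start Time': [600,1200,1700,600,1200,1700,700,1200,1700],
                    'End Time': [1200,1700,2500,1200,1700,2500,1200,1700,2400]})

df2 = pd.DataFrame({'ID': ["Ch1","Ch1","Ch1","Ch2","Ch2","Ch2","Ch2","Ch2","Ch2","Ch2","Ch2","Ch2","Ch2","Ch2"],
                    'Start Day': [1,1,1,1,1,1,1,1,1,1,1,1,6,1],
                    'End Day': [7,7,7,5,5,5,5,5,5,5,5,5,7,7],
                    'Start Time': [600,1200,1700,800,900,1000,1100,1200,1300,1900,2000,2200,700,700],
                    'End Time': [1200,1700,2500,900,1000,1100,1200,1300,1400,2000,2200,2300,2400,2400],
                    'Values': [1125,2250,1125,346.5,346.5,346.5,346.5,346.5,346.5,189,189,346.5,1795.5,346.5]})

# Pair every df1 row with every df2 row of the same ID.
# reset_index keeps the original row labels as index_1 / index_2.
pairs = df1.reset_index().merge(df2.reset_index(), on="ID", suffixes=("_1", "_2"))

# Assumption: days are inclusive (1..5 is five days), times are half-open ranges.
days = (pairs[["End Day_1", "End Day_2"]].min(axis=1)
        - pairs[["Start Day_1", "Start Day_2"]].max(axis=1) + 1).clip(lower=0)
time = (pairs[["End Time_1", "End Time_2"]].min(axis=1)
        - pairs[["Start Time_1", "Start Time_2"]].max(axis=1)).clip(lower=0)
pairs["weight"] = days * time

# Divide each df2 value among the df1 rows it overlaps, in proportion to the weight.
# A df2 row that overlaps no df1 row would get NaN shares and be dropped from the sums.
pairs["share"] = pairs["weight"] / pairs.groupby("index_2")["weight"].transform("sum")
pairs["piece"] = pairs["Values"] * pairs["share"]

# Total the pieces that landed on each df1 row.
df1["Values"] = pairs.groupby("index_1")["piece"].sum().reindex(df1.index, fill_value=0.0)
print(df1)

# Per-ID totals in both frames, for comparison.
print(df1.groupby("ID")["Values"].sum())
print(df2.groupby("ID")["Values"].sum())
```

In the sample data every df2 row overlaps at least one df1 row of the same ID, so under these assumptions the per-ID totals agree up to floating-point rounding. Whether the individual row values match the expected output depends on the weighting rule the question actually intends.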
