
How can I get a selection of one data frame for each row in another data frame based on conditions in that row?

I have the following two DataFrames:

df1
   x  p  s
0  2  1  1
1  4  2  1
2  6  1  3
3  8  2  4

df2
     ts   1   2
0  1000  45  44
1  1001  46  46
2  1002  47  46
3  1003  48  48
4  1004  49  48
5  1005  50  50
6  1006  51  50
7  1007  52  52
8  1008  53  52

I would like to create a third DataFrame with the same number of rows as df1, using values from df2 selected according to the column values in df1. For example, for the first row of df1, I want to take every 's'-th row of column 'p' in df2, from the start up to and including index 'x'. I know how to do that using df.apply(), as shown below, but it is too slow for the program I am writing.

def foo(row):
    return str(df2[row['p']].iloc[0:row['x']+1:row['s']].to_list())

df3 = df1.apply(lambda x: foo(x), axis=1)
df3
0            [45, 46, 47]
1    [44, 46, 46, 48, 48]
2            [45, 48, 51]
3            [44, 48, 52]
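
For anyone who wants to reproduce the example, here is a minimal setup sketch. It assumes the column labels of df2 are the integers 1 and 2 (which is what df2[row['p']] implies), and it returns real lists instead of strings; the names select and baseline are only illustrative.

```python
import pandas as pd

# Rebuild the two frames from the question.
# Assumption: df2's value columns are labelled with the integers 1 and 2.
df1 = pd.DataFrame({"x": [2, 4, 6, 8], "p": [1, 2, 1, 2], "s": [1, 1, 3, 4]})
df2 = pd.DataFrame({
    "ts": range(1000, 1009),
    1: [45, 46, 47, 48, 49, 50, 51, 52, 53],
    2: [44, 46, 46, 48, 48, 50, 50, 52, 52],
})


def select(row):
    # Column `p` of df2, rows 0..x inclusive, taking every s-th row.
    return df2[row["p"]].iloc[0 : row["x"] + 1 : row["s"]].to_list()


# Row-wise baseline: one Python call per row of df1, which is the slow part.
baseline = df1.apply(select, axis=1)
print(baseline)
```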

I'm not sure how large the datasets are, but try the following:

# We need to do "CROSS JOIN" so we add a dummy key to both datasets to allow this
df1["temp_key"] = 0
df2["temp_key"] = 0

# Next we need to shift the index into the DataFrame and call it row_number
df2 = df2.reset_index().rename(columns={"index":"row_number"})

# Now we perform the "CROSS JOIN"
df = df1.merge(df2, on="temp_key").drop(columns=["temp_key"])

df should now have 7 columns: ["x", "p", "s", "row_number", "ts", 1, 2]
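
As a side note, on pandas 1.2 or later the dummy key is not needed, because merge accepts how="cross". A minimal sketch of the same cross join, assuming the frames from the question with integer column labels 1 and 2 (the name pairs is just illustrative):

```python
import pandas as pd

# Frames from the question (assumption: value columns are labelled 1 and 2).
df1 = pd.DataFrame({"x": [2, 4, 6, 8], "p": [1, 2, 1, 2], "s": [1, 1, 3, 4]})
df2 = pd.DataFrame({
    "ts": range(1000, 1009),
    1: [45, 46, 47, 48, 49, 50, 51, 52, 53],
    2: [44, 46, 46, 48, 48, 50, 50, 52, 52],
})

# Move df2's positional index into a column named row_number,
# then pair every row of df1 with every row of df2.
right = df2.rename_axis("row_number").reset_index()
pairs = df1.merge(right, how="cross")

print(pairs.shape)  # one row per (df1 row, df2 row) pair
```

Keep in mind that a cross join materialises len(df1) * len(df2) rows, so it only pays off while that product fits comfortably in memory.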

# We can now apply the 'x' logic
df = df[df["row_number"] <= df["x"]]

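# And then the 's' logic: keep every s-th row, starting from row 0
df = df[df["row_number"].mod(df["s"]) == 0].copy()

# Next we choose the appropriate column based on the p value
# (assuming the column labels are the integers 1 and 2, as df2[row['p']] in the question implies)
df["value"] = df[1]
df.loc[df["p"] == 2, "value"] = df[2]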

# Finally we can group the DataFrame by the 'x' value and create the lists
# Note: I've made the assumption that x is unique in df1
df = df.groupby(["x"])["value"].apply(list).reset_index()

This should return a DataFrame with two columns, ["x", "value"], where x is the x value from df1 and value is the list of selected values, similar to df3 in your example.
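
Putting the steps together, here is a hedged end-to-end sketch that can be checked against the output in the question. It differs from the steps above in two ways that are my own choices: it groups by df1's row position (a helper column I call df1_row) so that x does not have to be unique, and it uses how="cross", which needs pandas 1.2 or later. It also assumes the column labels are the integers 1 and 2 and that s is never 0.

```python
import numpy as np
import pandas as pd

# Frames from the question (assumption: value columns are labelled 1 and 2).
df1 = pd.DataFrame({"x": [2, 4, 6, 8], "p": [1, 2, 1, 2], "s": [1, 1, 3, 4]})
df2 = pd.DataFrame({
    "ts": range(1000, 1009),
    1: [45, 46, 47, 48, 49, 50, 51, 52, 53],
    2: [44, 46, 46, 48, 48, 50, 50, 52, 52],
})

# Cross join, keeping both row positions as ordinary columns.
pairs = (
    df1.rename_axis("df1_row").reset_index()
    .merge(df2.rename_axis("row_number").reset_index(), how="cross")
)

# 'x' logic: rows 0..x inclusive; 's' logic: every s-th row starting at 0.
keep = (pairs["row_number"] <= pairs["x"]) & (pairs["row_number"] % pairs["s"] == 0)
pairs = pairs[keep].copy()

# 'p' logic: pick the value from column 1 or column 2.
pairs["value"] = np.where(pairs["p"] == 1, pairs[1], pairs[2])

# One list per original df1 row, in df2 row order.
result = pairs.groupby("df1_row")["value"].apply(list)
print(result)
```

With more than two value columns the np.where step would need to be generalised, for example by reshaping df2 to long form before the join; whether this beats apply depends on the sizes involved, so it is worth timing on the real data.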


