
How to combine rows in a dataframe based on whether a row contains a value from another row

I have a dataframe that looks like this, with additional columns:

ID         Paired_ID      ... 
123_1      123_2
123_2      123_1
456_1      456_2
456_2      456_1
789_1      789_2
789_2      789_1
789_3      789_4
789_4      789_3

What I would like to do is, for a particular ID, find the row whose Paired_ID equals that ID, and combine the two rows into one. I've been trying to use pandas merge:

pd.merge(df, df, left_on="ID", right_on="Paired_ID")

but I'm getting duplicates and can't figure out how to get rid of them.

I would like:

ID_x        Paired_ID_x      ID_y     Paired_ID_y  ...
123_1      123_2             123_2      123_1
456_1      456_2             456_2      456_1
789_1      789_2             789_2      789_1
789_3      789_4             789_4      789_3

The assumption is that every value in ID also appears in Paired_ID.

Compare the numeric suffixes after the '_' delimiter in ID and Paired_ID, and use the result of that comparison to split the dataframe into two new dataframes.

Then concatenate the two dataframes along the columns axis to get your output.

import pandas as pd

# extract the integer suffix after the last '_' in ID and Paired_ID
A = df.ID.str.split('_').str[-1].astype(int)
B = df.Paired_ID.str.split('_').str[-1].astype(int)

# rows where ID has the lower suffix form the left half ('_x'),
# the remaining rows form the right half ('_y')
filter_1 = df.loc[A.le(B)].reset_index(drop=True).add_suffix('_x')
filter_2 = df.loc[A.gt(B)].reset_index(drop=True).add_suffix('_y')

# concatenate along the columns axis; rows are matched by position
pd.concat([filter_1, filter_2], axis=1)


    ID_x    Paired_ID_x ID_y    Paired_ID_y
0   123_1   123_2       123_2   123_1
1   456_1   456_2       456_2   456_1
2   789_1   789_2       789_2   789_1
3   789_3   789_4       789_4   789_3
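
The concat above pairs rows by position, so it relies on the two filtered frames being in the same order. That holds for the sample data but may not hold if the rows are shuffled. As a hedged sketch of one alternative, assuming every ID ends in an integer suffix and every pair is listed in both directions, the partner can be looked up by key with a merge instead. The sample frame here is rebuilt from the question, without the additional columns, and the names `left` and `merged` are only illustrative:

```python
import pandas as pd

# Sample data reconstructed from the question (additional columns omitted).
df = pd.DataFrame({
    "ID":        ["123_1", "123_2", "456_1", "456_2", "789_1", "789_2", "789_3", "789_4"],
    "Paired_ID": ["123_2", "123_1", "456_2", "456_1", "789_2", "789_1", "789_4", "789_3"],
})

# Integer suffix after the last '_' in each column.
a = df["ID"].str.split("_").str[-1].astype(int)
b = df["Paired_ID"].str.split("_").str[-1].astype(int)

# Keep only the lower-numbered member of each pair as the left side,
# so each pair appears once instead of twice.
left = df.loc[a < b]

# Look up each left row's partner by key rather than by row position.
merged = left.merge(df, left_on="Paired_ID", right_on="ID", suffixes=("_x", "_y"))
print(merged)
```

Filtering before the merge is what removes the duplicates described in the question: a plain self-merge returns each pair twice, once from each member's point of view.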
