What is the best way to sort data from one data frame based on another data frame?

I have two datasets, and I am trying to sort data from one dataset based on the values of the second dataset and create a new column in the first dataset. If the values in dataset 1 and dataset 2 match, I want to populate the new column with true, otherwise false. What is the best way of doing this in Python? My code (given below) is not working.

Data:

df1
   Index    ID  type 1  type2
0      1  A.34     6.3    7.1
1      2  A.35     5.8    7.3
2      3  A.36     6.8    5.2
3      4  A.37     7.8    6.4
4      5  A.38     6.9    8.8


df2
   Index    ID  Type 2
0      1  A.55     6.7
1      2  A.35     3.6
2      3  A.69     5.8
3      4  A.34     9.2
4      5  A.38     7.7

# Required Output
df3
   Index    ID  type 1  type2 Status
0      1  A.34     6.3    7.1    bad
1      2  A.35     5.8    7.3   good
2      3  A.36     4.1    2.6    bad
3      4  A.37     7.8    6.4    bad
4      5  A.38     6.9    8.8   good


# The code I wrote is giving me ‘bad’ for all the rows: 

Boolean = []
for x in df1.ID:
    if x == x in df2.ID:
        Boolean.append('good')
    else:
        Boolean.append('bad')
print (Boolean)

# Output obtained with code
Output: 
['bad', 'bad', 'bad', 'bad', 'bad']

Thank you.

I think this is what you are looking for:

import pandas as pd

data1 = {
    'Index': [1, 2, 3, 4, 5],
    'ID': ['A.34', 'A.35', 'A.36', 'A.37', 'A.38'],
    'type 1': [6.3, 5.8, 6.8, 7.8, 6.9],
    'type2': [7.1, 7.3, 5.2, 6.4, 8.8]}

data2 = {
    'Index': [1, 2, 3, 4, 5],
    'ID': ['A.55', 'A.35', 'A.69', 'A.34', 'A.38'],
    'Type 2': [6.7, 3.6, 5.8, 9.2, 7.7]}

df1 = pd.DataFrame(data=data1)
df2 = pd.DataFrame(data=data2)

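# Columns that must both match for a row to count as 'good'
merge_cols = ['Index', 'ID']

# A left merge keeps every row of df1; indicator=True adds a '_merge' column
# that is 'both' when the (Index, ID) pair also exists in df2, else 'left_only'
df = pd.merge(df1, df2[merge_cols], how='left', on=merge_cols, indicator=True)

# Translate the merge indicator into the requested labels
d = {'left_only': 'bad', 'both': 'good'}
df['_merge'] = df['_merge'].map(d)
df.rename(columns={'_merge': 'Status'}, inplace=True)
df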

The output df looks like this:

   Index    ID  type 1  type2 Status
0      1  A.34     6.3    7.1    bad
1      2  A.35     5.8    7.3   good
2      3  A.36     6.8    5.2    bad
3      4  A.37     7.8    6.4    bad
4      5  A.38     6.9    8.8   good
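
If you would rather not merge at all, the same Status column can probably be built by testing each (Index, ID) pair of df1 for membership in df2. The following is only a sketch under that assumption: `key_cols` and `matched` are names I made up for illustration, and it writes the new column straight into df1 instead of creating a new frame.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    'Index': [1, 2, 3, 4, 5],
    'ID': ['A.34', 'A.35', 'A.36', 'A.37', 'A.38'],
    'type 1': [6.3, 5.8, 6.8, 7.8, 6.9],
    'type2': [7.1, 7.3, 5.2, 6.4, 8.8]})

df2 = pd.DataFrame({
    'Index': [1, 2, 3, 4, 5],
    'ID': ['A.55', 'A.35', 'A.69', 'A.34', 'A.38'],
    'Type 2': [6.7, 3.6, 5.8, 9.2, 7.7]})

# Hypothetical helper name: the columns that together identify a row
key_cols = ['Index', 'ID']

# Turn the key columns of each frame into a MultiIndex of (Index, ID) pairs
keys1 = pd.MultiIndex.from_frame(df1[key_cols])
keys2 = pd.MultiIndex.from_frame(df2[key_cols])

# Boolean array: True where a df1 pair also occurs in df2
matched = keys1.isin(keys2)

# Map True/False to the labels used in the question
df1['Status'] = np.where(matched, 'good', 'bad')
print(df1)
```

If you want the literal true/false column mentioned in the question, you could assign `matched` directly instead of going through `np.where`. Unlike a merge, this approach should not add rows to df1 when df2 contains a repeated (Index, ID) pair.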

EDIT: edited to merge on both columns, Index and ID.
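
As for why the loop in the question returns 'bad' for every row: as far as I can tell, `x == x in df2.ID` is a chained comparison, so Python reads it as `x == x and x in df2.ID`, and `in` applied to a pandas Series tests the index labels (0 to 4 here), not the stored values. The small sketch below shows that behaviour, plus a row-by-row comparison that reproduces the required output; the variable names `in_labels`, `in_values` and `status` are just for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': ['A.34', 'A.35', 'A.36', 'A.37', 'A.38']})
df2 = pd.DataFrame({'ID': ['A.55', 'A.35', 'A.69', 'A.34', 'A.38']})

# `in` on a Series looks at the index labels, so a stored value is not found
in_labels = 'A.35' in df2.ID        # False: 'A.35' is not an index label
# Testing against the values themselves (here via a set) finds it
in_values = 'A.35' in set(df2.ID)   # True

# The required output compares rows at the same position in both frames,
# so pair the IDs up with zip instead of testing general membership
status = ['good' if a == b else 'bad' for a, b in zip(df1.ID, df2.ID)]
print(status)  # ['bad', 'good', 'bad', 'bad', 'good']
```

Note that a plain membership test on the values would also mark A.34 as 'good', because it appears in df2 at a different row. That is why the comparison above (and the merge on both Index and ID) is positional.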
