
python pandas compare two columns and return results

I have two dataframes, shown below. I added a key column to both so that I can perform a Cartesian join. I want to compare each value in the BEN_NAME2 column of df3 with the names2 column of df4. My original plan was to perform the Cartesian join and check whether there are any matching values, but both dataframes are quite big and I get a memory error when I try to join.

I would like to do this operation one cell at a time from the column BEN_NAME2. I would like to create a new column in df3 that gives me the row index from df3 if an exact match was found.

For example, df3 will get two new columns: match with values (0,1,0) and matching_row_index with values (0,3,0), because the jones value from the second row has a match in the df4 dataframe.

import pandas as pd

sales = [{'key': 0, 'BEN_NAME2': '150 jones'},
         {'key': 0, 'BEN_NAME2': 'jones'},
         {'key': 0, 'BEN_NAME2': '50'}]
df3 = pd.DataFrame(sales)

sales = [{'key': 0, 'names2': 'xyc'},
         {'key': 0, 'names2': 'fsdfa'},
         {'key': 0, 'names2': 'jones'}]
df4 = pd.DataFrame(sales)

My main goal is to get the output relatively quickly.

Use iloc in a loop. This indexer lets you navigate a dataframe by integer position, like an array.

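# df and the column names below are placeholders for your own dataframe
for i in range(len(df)):
    a = df['Ben_names'].iloc[i]
    b = df['column_name'].iloc[i]
    # write the conditional statement using if; the value to be inserted is var
    # (the equality test here is only a placeholder condition)
    var = 1 if a == b else 0
    # assign through a single .loc call; chained df[...].iloc[i] = var may not write back
    df.loc[df.index[i], 'column_name2'] = var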
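
Since the question asks for speed and the Cartesian join runs out of memory, a lookup-based alternative may be worth trying. The following is a minimal sketch, not the method from the answer above. It assumes that matching_row_index should be the 1-based row position in df4 and that 0 means "no match", which is how the (0,3,0) example in the question reads. The name position is a hypothetical helper introduced only for illustration.

```python
import pandas as pd

df3 = pd.DataFrame({'BEN_NAME2': ['150 jones', 'jones', '50']})
df4 = pd.DataFrame({'names2': ['xyc', 'fsdfa', 'jones']})

# Lookup table: name -> 1-based row position in df4 (assumption: 1-based).
position = pd.Series(range(1, len(df4) + 1), index=df4['names2'])
# Keep only the first occurrence of each name so the lookup index is unique.
position = position[~position.index.duplicated()]

# Exact-match lookup for every BEN_NAME2 value; unmatched values become NaN,
# which is then replaced by 0 (assumption: 0 means "no match").
df3['matching_row_index'] = df3['BEN_NAME2'].map(position).fillna(0).astype(int)
df3['match'] = (df3['matching_row_index'] > 0).astype(int)

print(df3)
```

Because Series.map does one lookup per value instead of pairing every row of df3 with every row of df4, memory use should grow with the size of the two columns rather than with their product.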
