Combining two pandas dataframes with two conditions

Question

I have two pandas dataframes that I would like to combine while checking two conditionals.

Dataframe 1:

import pandas as pd 
data = [['Z085', '2020-08', 1.33], ['Z086', '2020-08', 1.83], ['Z086', '2020-09', 1.39]] 
df1 = pd.DataFrame(data, columns = ['SN', 'Date', 'Value'])

Dataframe 2:

data = [['Z085', '2020-08', 0.34], ['Z085', '2020-09', 0.83], ['Z086', '2020-09', 0.29]] 
df2 = pd.DataFrame(data, columns = ['SN', 'Date', 'ValueX']) 
df2

I would like to merge, append, or join them in order to get the following dataframe: the values ("Value" and "ValueX") are added to the same row if both "SN" and "Date" are equal.

I am not sure whether a new dataframe is required or whether df2 should be mapped onto df1.

This is what I have tried:

df1['ValueX'] = df1[('Date', 'SN')].map(df2_mean.set_index('Date', 'SN')['ValueX'])

With one conditional (for example, Date) it works fine, but I am not able to set up two conditionals.

Answer 1

This is simply a merge() operation. Don't call the columns "conditionals"; just say "merge on the columns SN, Date".

However, in pandas (v1.1.4) an outer merge with the default sort=False does not return the rows sorted by the keys, so you can't rely on the row order. Note below that df1's rows come first, in their original order, and the row found only in df2 is appended last:

>>> dfnew_bad = df1.merge(df2, on=['SN','Date'], how='outer')

     SN     Date  Value  ValueX
0  Z085  2020-08   1.33    0.34
1  Z086  2020-08   1.83     NaN
2  Z086  2020-09   1.39    0.29
3  Z085  2020-09    NaN    0.83

So in your case, to get the correct order by SN and then Date:

dfnew_good = df1.merge(df2, on=['SN','Date'], how='outer', sort=False).sort_values(['SN', 'Date'])
     SN     Date  Value  ValueX
0  Z085  2020-08   1.33    0.34
3  Z085  2020-09    NaN    0.83
1  Z086  2020-08   1.83     NaN
2  Z086  2020-09   1.39    0.29
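As a minimal sketch of the same idea (the name df_checked is only illustrative), the merge can be combined with indicator=True to see which frame each row came from, and with reset_index(drop=True) to renumber the rows after sorting:

```python
import pandas as pd

df1 = pd.DataFrame(
    [['Z085', '2020-08', 1.33], ['Z086', '2020-08', 1.83], ['Z086', '2020-09', 1.39]],
    columns=['SN', 'Date', 'Value'],
)
df2 = pd.DataFrame(
    [['Z085', '2020-08', 0.34], ['Z085', '2020-09', 0.83], ['Z086', '2020-09', 0.29]],
    columns=['SN', 'Date', 'ValueX'],
)

# indicator=True adds a "_merge" column: 'both', 'left_only' or 'right_only'.
df_checked = (
    df1.merge(df2, on=['SN', 'Date'], how='outer', indicator=True)
       .sort_values(['SN', 'Date'])   # explicit order: SN first, then Date
       .reset_index(drop=True)        # renumber rows 0..n-1 after sorting
)
print(df_checked)
```

Sorting explicitly keeps the result independent of whatever row order the merge itself happens to return.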

Note that .sort_values() has an ascending flag, but pd.merge() does not. You could also work around this by doing pd.merge(..., sort=False) and then dfnew_workaround.sort_index(..., inplace=True).
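Going back to the attempt in the question, a map-style lookup on two keys can also be sketched with a plain dict keyed by (SN, Date) tuples. The names lookup and df1_mapped are illustrative only. Unlike the outer merge, this behaves like a left join: rows that exist only in df2 (here Z085, 2020-09) are not added.

```python
import pandas as pd

df1 = pd.DataFrame(
    [['Z085', '2020-08', 1.33], ['Z086', '2020-08', 1.83], ['Z086', '2020-09', 1.39]],
    columns=['SN', 'Date', 'Value'],
)
df2 = pd.DataFrame(
    [['Z085', '2020-08', 0.34], ['Z085', '2020-09', 0.83], ['Z086', '2020-09', 0.29]],
    columns=['SN', 'Date', 'ValueX'],
)

# Build a dict from df2 whose keys are (SN, Date) tuples.
lookup = df2.set_index(['SN', 'Date'])['ValueX'].to_dict()

# Look up every (SN, Date) pair of df1; pairs missing from df2 get NaN.
df1_mapped = df1.copy()
df1_mapped['ValueX'] = [
    lookup.get(key, float('nan')) for key in zip(df1['SN'], df1['Date'])
]
print(df1_mapped)
```

If the rows found only in df2 are needed as well, merge() with how='outer' remains the simpler choice.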

Answer 2

Method 1: merge:

df_new = df1.merge(df2, on=['SN','Date'],how='outer', sort=True)
print(df_new)

Method 2: join:

df_new = df1.join(df2.set_index(['SN','Date']), on=['SN','Date'],how='outer', sort=True)
print(df_new)

In this case, one more possible way would be to use pd.concat:

df_new = pd.concat([df1.set_index(['SN','Date']),df2.set_index(['SN','Date'])],axis=1).reset_index()

Output in each case (the row index labels may differ between the methods):

     SN     Date  Value  ValueX
0  Z085  2020-08   1.33    0.34
3  Z085  2020-09    NaN    0.83
1  Z086  2020-08   1.83     NaN
2  Z086  2020-09   1.39    0.29
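One way to check that two of these approaches agree is to sort both results by the keys, renumber the rows, and compare them. This is only a sketch: the helper normalize and the names via_merge and via_concat are hypothetical, and it compares just the merge and pd.concat variants.

```python
import pandas as pd

df1 = pd.DataFrame(
    [['Z085', '2020-08', 1.33], ['Z086', '2020-08', 1.83], ['Z086', '2020-09', 1.39]],
    columns=['SN', 'Date', 'Value'],
)
df2 = pd.DataFrame(
    [['Z085', '2020-08', 0.34], ['Z085', '2020-09', 0.83], ['Z086', '2020-09', 0.29]],
    columns=['SN', 'Date', 'ValueX'],
)

keys = ['SN', 'Date']

def normalize(df):
    """Sort by the merge keys and renumber rows so frames can be compared."""
    return df.sort_values(keys).reset_index(drop=True)

via_merge = normalize(df1.merge(df2, on=keys, how='outer'))
via_concat = normalize(
    pd.concat([df1.set_index(keys), df2.set_index(keys)], axis=1).reset_index()
)

# DataFrame.equals treats NaN values in the same position as equal.
print(via_merge.equals(via_concat))
```

Note that the pd.concat variant assumes each (SN, Date) pair appears at most once per frame, as it does in the sample data.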

Answer 1
1 2020-10-09 18:39:52

Answer 2
1 Accepted 2020-10-09 18:40:44
