Merging two data frames on a common column in Python

# Merging the individual datasets on a common field, the Loyalty Card Number (LYLTY_CARD_NBR)

Customer_Data = pd.merge(Transaction_data, Purchase_behaviour, on=Transaction_data['LYLTY_CARD_NBR'])

But I'm getting this error message:


---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-47-13c21c7d1662> in <module>
      1 #Merging the individual datasets on a common field, the Loyalty Card Number (LYLTY_CARD_NBR)
----> 2 Customer_Data = pd.merge(Transaction_data, Purchase_behaviour, on=Transaction_data['LYLTY_CARD_NBR'])

~\anaconda3\lib\site-packages\pandas\core\reshape\merge.py in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)
     72     validate=None,
     73 ) -> "DataFrame":
---> 74     op = _MergeOperation(
     75         left,
     76         right,

~\anaconda3\lib\site-packages\pandas\core\reshape\merge.py in __init__(self, left, right, how, on, left_on, right_on, axis, left_index, right_index, sort, suffixes, copy, indicator, validate)
    650             self.right_join_keys,
    651             self.join_names,
--> 652         ) = self._get_merge_keys()
    653 
    654         # validate the merge keys dtypes. We may need to coerce

~\anaconda3\lib\site-packages\pandas\core\reshape\merge.py in _get_merge_keys(self)
    994                     else:
    995                         if rk is not None:
--> 996                             right_keys.append(right._get_label_or_level_values(rk))
    997                             join_names.append(rk)
    998                         else:

~\anaconda3\lib\site-packages\pandas\core\generic.py in _get_label_or_level_values(self, key, axis)
   1561             values = self.axes[axis].get_level_values(key)._values
   1562         else:
-> 1563             raise KeyError(key)
   1564 
   1565         # Check for duplicates

KeyError: 0          47142
1          55073
2          55073
3          58351
4          68193
           ...  
264831    242159
264832    244213
264833    256018
264834    257079
264835    265006
Name: LYLTY_CARD_NBR, Length: 264836, dtype: int64

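If the column name 'LYLTY_CARD_NBR' is present in both the Transaction_data and Purchase_behaviour dataframes, the value for the on= option should be just the column name. Passing the column itself (Transaction_data['LYLTY_CARD_NBR']) makes pandas try to look up that whole Series as a label in the right-hand dataframe, which is why the KeyError prints the entire column.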

Customer_Data = pd.merge(Transaction_data, Purchase_behaviour, on='LYLTY_CARD_NBR')
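To try this without the original files, here is a minimal sketch on made-up data. Only the column name LYLTY_CARD_NBR comes from the question; the other columns (TOT_SALES, LIFESTAGE) and all values are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for the two datasets in the question.
transaction_data = pd.DataFrame({
    "LYLTY_CARD_NBR": [1000, 1000, 1002, 1003],
    "TOT_SALES": [6.0, 2.7, 3.0, 15.0],
})
purchase_behaviour = pd.DataFrame({
    "LYLTY_CARD_NBR": [1000, 1002, 1003],
    "LIFESTAGE": ["YOUNG", "RETIREES", "FAMILIES"],
})

# Pass the column label (a string) to on=, not the column itself.
customer_data = pd.merge(transaction_data, purchase_behaviour, on="LYLTY_CARD_NBR")
print(customer_data)
```

Each transaction row picks up the matching customer attributes, so a card number that appears twice on the left appears twice in the result.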

I would also recommend paying attention to the type of merge to be performed, which is set with the how parameter: one of 'left', 'right', 'outer', 'inner' or 'cross'. The default is 'inner'.

Customer_Data = pd.merge(Transaction_data, Purchase_behaviour, on='LYLTY_CARD_NBR', how='inner')
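The effect of how is easiest to see when the two frames do not share exactly the same keys. The sketch below uses made-up data (the frames `left` and `right` and their values are assumptions, not the question's datasets) and counts the rows each merge type returns. 'cross' is left out because, as far as I know, it builds every pairing of rows and does not accept on=.

```python
import pandas as pd

# Hypothetical frames: card 1004 exists only on the left, 1003 only on the right.
left = pd.DataFrame({
    "LYLTY_CARD_NBR": [1000, 1002, 1004],
    "TOT_SALES": [6.0, 3.0, 9.5],
})
right = pd.DataFrame({
    "LYLTY_CARD_NBR": [1000, 1002, 1003],
    "LIFESTAGE": ["YOUNG", "RETIREES", "FAMILIES"],
})

# Count the rows produced by each merge type on the same key.
row_counts = {}
for how in ("inner", "left", "right", "outer"):
    merged = pd.merge(left, right, on="LYLTY_CARD_NBR", how=how)
    row_counts[how] = len(merged)
print(row_counts)

# indicator=True adds a _merge column showing where each row came from.
outer = pd.merge(left, right, on="LYLTY_CARD_NBR", how="outer", indicator=True)
print(outer[["LYLTY_CARD_NBR", "_merge"]])
```

With an inner merge, rows whose card number is missing from either side are dropped silently, so comparing the row count before and after the merge is a cheap sanity check.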

Please see the reference: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html
