How to use a pandas left join

Question

I have 2 dataframes that look like this. DF1:

Product, Region, ProductScore
AAA, R1,100
AAA, R2,100
BBB, R2,200
BBB, R3,200

DF2:

Region, RegionScore
R1,1
R2,2

How can I join these 2 into 1 dataframe? The result should look like this:

Product, Region, ProductScore, RegionScore
AAA, R1,100,1
AAA, R2,100,2
BBB, R2,200,2

Many thanks!

EDIT1:

I used df.merge(df_new) and received this error message:

  File "C:\Python34\lib\site-packages\pandas\core\frame.py", line 4071, in merge
    suffixes=suffixes, copy=copy)
  File "C:\Python34\lib\site-packages\pandas\tools\merge.py", line 37, in merge
    copy=copy)
  File "C:\Python34\lib\site-packages\pandas\tools\merge.py", line 183, in __init__
    self.join_names) = self._get_merge_keys()
  File "C:\Python34\lib\site-packages\pandas\tools\merge.py", line 318, in _get_merge_keys
    self._validate_specification()
  File "C:\Python34\lib\site-packages\pandas\tools\merge.py", line 409, in _validate_specification
    if not self.right.columns.is_unique:
AttributeError: 'list' object has no attribute 'is_unique'

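EDIT2: I realized that my df_new is a Series (created using groupby), not a DataFrame. I have now converted it to a DataFrame; here is the info:

print(df.info())

Int64Index: 1111 entries, 0 to 1110
Data columns (total 8 columns):
product              1111 non-null object
reviewuserId         1111 non-null object
reviewprofileName    1111 non-null object
reviewelpfulness     881 non-null float64
reviewscore          1111 non-null float64
reviewtime           1111 non-null int64
reviewsummary        1111 non-null object
reviewtext           1111 non-null object
dtypes: float64(2), int64(1), object(5)
memory usage: 56.4+ KB
None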

print(df_new_2.info())

<class 'pandas.core.frame.DataFrame'>
Index: 1089 entries, A100Y8WSLFJN7Q to AZWBQPQN96SS6
Data columns (total 1 columns):
reviewelpfulnessbyuserid    864 non-null float64
dtypes: float64(1)
memory usage: 12.8+ KB
None

print(df.head())

      product    reviewuserId                         reviewprofileName  \
0  B003AI2VGA  A141HP4LYPWMSR          Brian E. Erland "Rainbow Sphinx"   
1  B003AI2VGA  A328S9RN3U5M68                                Grady Harp   
2  B003AI2VGA  A1I7QGUDP043DG                 Chrissy K. McVay "Writer"   
3  B003AI2VGA  A1M5405JH9THP9                              golgotha.gov   
4  B003AI2VGA   ATXL536YX71TR  KerrLines "&#34;MoviesMusicTheatre&#34;"   

   reviewelpfulness  reviewscore  reviewtime  \
0               1.0            3  1182729600   
1               1.0            3  1181952000   
2               0.8            5  1164844800   
3               1.0            3  1197158400   
4               1.0            3  1188345600   

                                       reviewsummary  \
0  There Is So Much Darkness Now ~ Come For The M...   
1  Worthwhile and Important Story Hampered by Poo...   
2                      This movie needed to be made.   
3                  distantly based on a real tragedy   
4  What's going on down in Juarez and shining a l...   

                                          reviewtext  
0  Synopsis: On the daily trek from Juarez Mexico...  
1  THE VIRGIN OF JUAREZ is based on true events s...  
2  The scenes in this film can be very disquietin...  
3  THE VIRGIN OF JUAREZ (2006)<br />directed by K...  
4  Informationally this SHOWTIME original is esse...

print(df_new_2.head())

                reviewelpfulnessbyuserid
reviewuserId                            
A100Y8WSLFJN7Q                       NaN
A103VZ3KDF2RT5                  0.555556
A1041HQGJDKFG5                  0.000000
A10FBJXMQPI0LL                  0.333333
A10LIHFA4SSK3F                  0.000000

Now the error message looks like this:

  File "pandas\hashtable.pyx", line 694, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12245)
KeyError: 'reviewuserId'

After printing this information, I was able to fix the problem by simply adding the following: df_new_2 = df_new.to_frame().reset_index()
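The head() output above shows reviewuserId as the index of df_new_2 rather than as a column, which would be consistent with the KeyError. Below is a minimal sketch of the fix on a made-up miniature of the data. Only the column names come from the question; the values, the use of mean() as the aggregation, and the names merged and merged_alt are assumptions for illustration.

```python
import pandas as pd

# Hypothetical miniature of the review data (values are invented).
df = pd.DataFrame({
    "reviewuserId": ["U1", "U1", "U2", "U3"],
    "reviewelpfulness": [1.0, 0.5, 0.0, None],
})

# Selecting one column after groupby and aggregating yields a Series
# indexed by reviewuserId, not a DataFrame. mean() is an assumed aggregation.
df_new = (
    df.groupby("reviewuserId")["reviewelpfulness"]
    .mean()
    .rename("reviewelpfulnessbyuserid")
)

# to_frame() makes a one-column DataFrame; reset_index() turns the
# reviewuserId index into an ordinary column that merge can use as a key.
df_new_2 = df_new.to_frame().reset_index()
merged = df.merge(df_new_2, on="reviewuserId", how="left")
print(merged)

# Alternative sketch: keep the index and tell merge to join against it.
merged_alt = df.merge(
    df_new.to_frame(), left_on="reviewuserId", right_index=True, how="left"
)
```

Either form should attach the per-user value to every review row; which one reads better depends on whether reviewuserId is needed as a column afterwards.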

Answer 1

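What you want is not a left merge, because your expected result omits the row for R3. You just want to perform an inner merge, which is the default for merge: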

In [120]:
df.merge(df1)

Out[120]:
  Product Region  ProductScore  RegionScore
0     AAA     R1           100            1
1     AAA     R2           100            2
2     BBB     R2           200            2
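To reproduce the output above from scratch, here is a minimal sketch. It assumes the two frames are built by hand from the sample data in the question; the names df and df1 follow this answer, and the variable inner is only illustrative.

```python
import pandas as pd

# DF1 from the question: one row per (Product, Region) pair.
df = pd.DataFrame({
    "Product": ["AAA", "AAA", "BBB", "BBB"],
    "Region": ["R1", "R2", "R2", "R3"],
    "ProductScore": [100, 100, 200, 200],
})

# DF2 from the question: one score per Region (R3 is absent).
df1 = pd.DataFrame({"Region": ["R1", "R2"], "RegionScore": [1, 2]})

# With no `on` argument, merge joins on the columns both frames share
# (here only "Region"). The default how="inner" keeps only rows whose
# key appears in both frames, so the R3 row is dropped.
inner = df.merge(df1)
print(inner)
```

Passing on="Region" explicitly is equivalent here and guards against accidentally joining on an extra shared column.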

A left merge would instead keep every row of the left frame and produce the following result:

In [121]:
df.merge(df1, how='left')

Out[121]:
  Product Region  ProductScore  RegionScore
0     AAA     R1           100            1
1     AAA     R2           100            2
2     BBB     R2           200            2
3     BBB     R3           200          NaN
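If it helps to see which left rows found no match, the sketch below repeats the left merge with the indicator option of merge. The frames are rebuilt from the question's sample data; the variable names left and unmatched are illustrative, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "Product": ["AAA", "AAA", "BBB", "BBB"],
    "Region": ["R1", "R2", "R2", "R3"],
    "ProductScore": [100, 100, 200, 200],
})
df1 = pd.DataFrame({"Region": ["R1", "R2"], "RegionScore": [1, 2]})

# how="left" keeps all rows of df; keys missing from df1 get NaN.
# indicator=True adds a "_merge" column recording where each row came from.
left = df.merge(df1, on="Region", how="left", indicator=True)
print(left)

# Rows present only in the left frame, i.e. regions with no RegionScore.
unmatched = left[left["_merge"] == "left_only"]
print(unmatched[["Product", "Region"]])
```

Because of the NaN, RegionScore is stored as a float column in the left-merged result rather than as integers.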

Solution 1
2  Accepted  2015-09-25 08:30:02
