Merge two dataframes with id

I want to merge two dataframes. But when I do the following I get KeyError: "['available'] not in index". I looked at "Python Pandas merge only certain columns", but what am I doing wrong?

import pandas as pd

d = { 'listing_id': [1,2,3,4],
      'month': [1, 2, 3, 4],
      'price': [79.00, 80.00, 90.00, 20.00]}
df = pd.DataFrame(data=d)

d2 = {'id': [1, 2, 3, 4],
     'available': [5000, 8000,5000, 7000],
     'someotherstuff': [2,3,4,5]}
df2 = pd.DataFrame(data=d2)

df = pd.merge(df,df2[['id','available']],on='listing_id', how='left')

What I want:

   listing_id  month  price  available
0           1      1   79.0       5000
1           2      2   80.0       8000
2           3      3   90.0       5000
3           4      4   20.0       7000

Your solution won't work because your ID columns have different names. Try this:

df = pd.merge(df, df2, left_on='listing_id', right_on='id')
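
To make the difference concrete, here is a minimal sketch using the question's data (the `on_failed` flag and the `merged` name are just illustrative). It shows that `on='listing_id'` fails because `df2` has no such column, while `left_on`/`right_on` succeeds. Note that this form keeps both key columns and every column of `df2`, so you may still want to trim the result.

```python
import pandas as pd

df = pd.DataFrame({'listing_id': [1, 2, 3, 4],
                   'month': [1, 2, 3, 4],
                   'price': [79.00, 80.00, 90.00, 20.00]})
df2 = pd.DataFrame({'id': [1, 2, 3, 4],
                    'available': [5000, 8000, 5000, 7000],
                    'someotherstuff': [2, 3, 4, 5]})

# on='listing_id' needs that column in BOTH frames; df2 only has 'id'.
try:
    pd.merge(df, df2, on='listing_id')
    on_failed = False
except KeyError:
    on_failed = True

# Naming the key on each side works, but both key columns are kept.
merged = pd.merge(df, df2, left_on='listing_id', right_on='id')

print(on_failed)
print(list(merged.columns))
```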

First, there is an extra space in your column name available , so strip it out.

df2.columns
Out[10]: Index(['id', 'available '], dtype='object')

df2.columns = [col.strip() for col in df2.columns]
df2.columns
Out[15]: Index(['id', 'available'], dtype='object')

Then, the column you want to merge on has a different name in each dataframe, so you need to specify left_on= and right_on= in the merge command:

pd.merge(df,df2[['id','available']],left_on='listing_id', right_on = 'id',how='left').drop('id',axis=1)

   listing_id  month  price  available
0           1      1   79.0       5000
1           2      2   80.0       8000
2           3      3   90.0       5000
3           4      4   20.0       7000
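
Putting both steps together, a self-contained sketch might look like the following. It assumes the `available ` column really does carry a trailing space, as reported above, and it uses `df2.columns.str.strip()` as a vectorised alternative to the list comprehension. The `result` name is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'listing_id': [1, 2, 3, 4],
                   'month': [1, 2, 3, 4],
                   'price': [79.00, 80.00, 90.00, 20.00]})

# Assumption: 'available ' has a trailing space, as in the answer above.
df2 = pd.DataFrame({'id': [1, 2, 3, 4],
                    'available ': [5000, 8000, 5000, 7000],
                    'someotherstuff': [2, 3, 4, 5]})

# Strip surrounding whitespace from every column name.
df2.columns = df2.columns.str.strip()

# Merge only the needed columns, then drop the duplicate key from df2.
result = (
    pd.merge(df, df2[['id', 'available']],
             left_on='listing_id', right_on='id', how='left')
    .drop(columns='id')
)
print(result)
```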

You are telling pandas to merge on='listing_id', but df2 has no listing_id column.

Rename id to listing_id and this should work. Also, when df2 contains only the columns you need, there is no need to specify which columns to merge (no need for df2[['id','available']]).

d = { 'listing_id': [1,2,3,4],
      'month': [1, 2, 3, 4],
      'price': [79.00, 80.00, 90.00, 20.00]}
df = pd.DataFrame(data=d)
print(df['listing_id'])

d2 = {'listing_id': [1, 2, 3, 4],
     'available ': [5000, 8000,5000, 7000]}
df2 = pd.DataFrame(data=d2)

df = pd.merge(df,df2,on = 'listing_id', how='left')
print(df)

The output:

0    1
1    2
2    3
3    4
Name: listing_id, dtype: int64
   listing_id  month  price  available 
0           1      1   79.0        5000
1           2      2   80.0        8000
2           3      3   90.0        5000
3           4      4   20.0        7000
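
If you cannot change how df2 is built, one possible variant is to rename the key after the fact. This is only a sketch: it assumes df2 arrives with an `id` column and a padded `available ` name, and it strips the column names first so the result has a clean `available` column. The `merged` name is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'listing_id': [1, 2, 3, 4],
                   'month': [1, 2, 3, 4],
                   'price': [79.00, 80.00, 90.00, 20.00]})

# Assumption: df2 arrives with 'id' as the key and a padded column name.
df2 = pd.DataFrame({'id': [1, 2, 3, 4],
                    'available ': [5000, 8000, 5000, 7000]})

# Strip whitespace from each column name, then rename the key to match df.
df2 = df2.rename(columns=str.strip).rename(columns={'id': 'listing_id'})

# With matching key names, a plain on= merge is enough.
merged = pd.merge(df, df2, on='listing_id', how='left')
print(merged)
```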


 