
Pandas Dataframe Merge/Link values

I have two dataframes, df1 and df2. df1 contains date, date_block_num, item_id and item_cnt_day, and has an auto-incrementing integer index. df2 has one column per date (2013-01-01, 2013-01-02, and so on) and uses item_id as its index; I have initialized it with zeros.

My problem is that I want df2 to be filled with the item_cnt_day values at the matching item_id and date. Some dates are missing from df1 because nothing was sold on those days.

print(df1)

              date  date_block_num  item_id  item_cnt_day
1       2013-01-03               0     2552           1.0
2       2013-01-05               0     2552           2.0
3       2013-01-06               0     2554           1.0
4       2013-01-15               0     2555           5.0
5       2013-01-10               0     2564           1.0
6       2013-01-02               0     2565           4.0
7       2013-01-04               0     2572           1.0

[186104 rows x 4 columns]


print(df2)

       2013-01-01  2013-01-02     ...      2015-10-30  2015-10-31
5652            0           0     ...               0           0
13071           0           0     ...               0           0
5671            0           0     ...               0           0
5672            0           0     ...               0           0
6675            0           0     ...               0           0
1514            0           0     ...               0           0
2331            0           0     ...               0           0
4271            0           0     ...               0           0

[198 rows x 1034 columns]

I believe you need pivot with reindex, provided the second DataFrame is filled with 0 only:

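# Keyword arguments: pandas 2.0+ no longer accepts positional ones in pivot.
df = (df1.pivot(index='item_id', columns='date', values='item_cnt_day')
         .reindex(index=df2.index, columns=df2.columns)
         .fillna(0))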
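
To make the reshaping concrete, here is a small self-contained sketch on made-up data. Two things in it are assumptions rather than part of the answer above: the toy frames stand in for the real df1 and df2, and both the date column and the df2 column labels are taken to be strings in the same format (if one side holds datetimes, the labels would not match and reindex would return all zeros). It also swaps pivot for pivot_table with aggfunc="sum", because plain pivot raises an error when the same item_id and date pair appears more than once.

```python
import pandas as pd

# Toy stand-in for df1 (illustrative values, not the asker's data).
# Item 2552 appears twice on 2013-01-03 to show duplicate handling.
df1 = pd.DataFrame({
    "date": ["2013-01-03", "2013-01-05", "2013-01-06", "2013-01-03"],
    "date_block_num": [0, 0, 0, 0],
    "item_id": [2552, 2552, 2554, 2552],
    "item_cnt_day": [1.0, 2.0, 1.0, 3.0],
})

# Toy stand-in for df2: item_id index, one string-labelled column per day, all zeros.
days = pd.date_range("2013-01-01", "2013-01-06").strftime("%Y-%m-%d")
df2 = pd.DataFrame(0, index=[2554, 2552, 9999], columns=days)

# Long -> wide. pivot_table sums duplicate (item_id, date) pairs,
# whereas plain pivot would raise a ValueError on them.
wide = df1.pivot_table(index="item_id", columns="date",
                       values="item_cnt_day", aggfunc="sum")

# Align to df2's rows and columns; days or items without sales
# come back as NaN and are then filled with 0.
result = wide.reindex(index=df2.index, columns=df2.columns).fillna(0)
print(result)
```

If every item_id and date pair is known to be unique, the plain pivot call from the answer gives the same result.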

