Use values from df2 where the row value matches a df1 column name

Question

I have two dataframes like these:

df1:

id  |  2019-03-01 |  2019-04-01 | 2019-05-01 | sum   
id1 |    42       |    69       | 96         | 868  
id2 |    15       |    21       | 76         | 321  
id3 |    34       |    45       | 35         | 675

df2:

id  |  month| avail   
id1 |    3  | 10  
id2 |    4  | 54  
id2 |    5  | 34  
id3 |    5  | 33

I need to add the value from the avail column to every column where df2.month == df1.columns.values[n].month, and add 0 if there is no corresponding record.

This was my attempt with np.where, but it did not work:

df1.columns.values[:-1] = pd.to_datetime(df1.columns.values[:-1])  
for c in np.arange(start = 0, stop = len(df1.columns[:-1]), step = 1):  
    df1['h'+str(c+1)] = df1.iloc[: , -1].add(np.where((df2.id.isin(df1.index))& 
                                             (df1.columns.values[c].month == df2.month), 
                                             df2.avail, 0)).sub(df1.iloc[:, c])
df1 = df1.filter(like = 'h').reset_index()

The expected output is:

id  |  h1 |  h2 | h3    
id1 |  836| 767 | 671 
id2 |  306| 339 | 297  
id3 |  641| 596 | 594
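
Reading the expected output against the two inputs, each h column looks like a running total: start from sum, then for each month in order add avail (0 when df2 has no row for that id and month) and subtract the df1 value for that month. The plain-Python sketch below checks that reading for id2 only; the variable names are illustrative and not part of the original post.

```python
# Worked check for id2, using the numbers from df1 and df2 above.
start = 321                      # df1 'sum' for id2
used = {3: 15, 4: 21, 5: 76}     # df1 monthly columns for id2, keyed by month
avail = {4: 54, 5: 34}           # df2 rows for id2; month 3 has no record

running = start
h = []
for month in sorted(used):
    # add avail (0 when df2 has no row for this month), subtract the df1 value
    running += avail.get(month, 0) - used[month]
    h.append(running)

print(h)  # [306, 339, 297]
```

The same arithmetic reproduces the id1 and id3 rows of the expected output.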

Answer 1

You can do this with set_index and unstack on df2, and set_index and drop on df1. Then take the cumsum across the columns of the difference between the two results, and add the sum column, reshaped with [:, None] so that it broadcasts across the columns. The rename turns df1's date columns into month numbers so they align with the months from df2, and rename_axis clears the columns axis name.

df_f = (df1['sum'].values[:, None] 
        + (df2.set_index(['id','month'])['avail'].unstack().fillna(0)
           - df1.set_index('id').drop('sum', axis=1)
                .rename(columns=lambda x: pd.to_datetime(x).month)).cumsum(axis=1))\
       .rename_axis(columns=None)\
       .reset_index()

print (df_f)
        id      3      4      5
0      id1  836.0  767.0  671.0
1      id2  306.0  339.0  297.0
2      id3  641.0  596.0  594.0
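
For readers who find the chained expression hard to follow, here is a sketch of the same idea split into named steps. The intermediate names (avail, used, running, result) are made up for illustration, and the last step uses DataFrame.add with axis=0 in place of values[:, None], so the sum is aligned by id instead of by row position. It assumes the sample data from the question.

```python
import pandas as pd

df1 = pd.DataFrame({
    "id": ["id1", "id2", "id3"],
    "2019-03-01": [42, 15, 34],
    "2019-04-01": [69, 21, 45],
    "2019-05-01": [96, 76, 35],
    "sum": [868, 321, 675],
})
df2 = pd.DataFrame({
    "id": ["id1", "id2", "id2", "id3"],
    "month": [3, 4, 5, 5],
    "avail": [10, 54, 34, 33],
})

# 1. One row per id, one column per month, 0 where df2 has no record.
avail = df2.set_index(["id", "month"])["avail"].unstack().fillna(0)

# 2. Relabel df1's date columns by month number so they align with `avail`.
used = (df1.set_index("id")
           .drop(columns="sum")
           .rename(columns=lambda c: pd.to_datetime(c).month))

# 3. Net change per month, accumulated left to right.
running = (avail - used).cumsum(axis=1)

# 4. Add each id's starting 'sum' to every column of its row (aligned on id).
result = running.add(df1.set_index("id")["sum"], axis=0)
print(result)
```

Aligning on id in step 4 is a small design choice: it should keep working if df1 and the pivoted df2 end up in a different row order, whereas the positional version relies on both being sorted the same way.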

You may want to rename the columns to match your exact expected output.
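
One possible way to do that rename is sketched here. The df_f below is a hand-built stand-in with the same values as the printed result, and the h1, h2, ... names are assigned purely by column order, which is an assumption about what the question intends.

```python
import pandas as pd

# Stand-in for df_f from the answer (same values, month numbers as column labels).
df_f = pd.DataFrame({
    "id": ["id1", "id2", "id3"],
    3: [836.0, 306.0, 641.0],
    4: [767.0, 339.0, 596.0],
    5: [671.0, 297.0, 594.0],
})

# Map each month column to h1, h2, ... in left-to-right order.
month_cols = [c for c in df_f.columns if c != "id"]
new_names = {c: f"h{i}" for i, c in enumerate(month_cols, start=1)}

# Rename, then cast back to int since fillna(0) produced floats.
df_h = df_f.rename(columns=new_names).astype({n: int for n in new_names.values()})
print(df_h)
```

The integer cast is optional; it is only there because the expected output in the question shows whole numbers.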

Answer 2

Here is another approach:

import pandas as pd
from io import StringIO

# sample data
s1 = """id|2019-03-01|2019-04-01|2019-05-01|sum
id1|42|69|96|868
id2|15|21|76|321
id3|34|45|35|675"""
df1 = pd.read_csv(StringIO(s1), sep='|')

s2 = """id|month|avail
id1|3|10
id2|4|54
id2|5|34
id3|5|33"""
df2 = pd.read_csv(StringIO(s2), sep='|')
# end sample data

# convert columns to datetime and get the month
new_col = [pd.to_datetime(x).month for x in df1.columns[1:-1]]
df1 = df1.rename(columns=dict(zip(df1.columns[1:-1], new_col)))
# set index
df1 = df1.set_index('id')
# drop the last column
df3 = df1[df1.columns[:-1]]
# pivot df2 so months are the columns
p = df2.pivot(index='id', columns='month', values='avail').fillna(0)
# concat and sum
con = pd.concat([-df3,p]).groupby(level=0).sum()
# add df1['sum'] to the first column
con[con.columns[0]] = con[con.columns[0]] + df1['sum']
# cumsum across columns
print(con.cumsum(axis=1))

         3      4      5
id                      
id1  836.0  767.0  671.0
id2  306.0  339.0  297.0
id3  641.0  596.0  594.0

Answer 1
4 2020-04-22 17:31:07

Answer 2
2 2020-04-22 17:40:36
