
Update a dataframe based on values from another dataframe

I have two data frames that look like this:

            A   B
date        
2017-10-5   2   3
2017-10-6   5   5
2017-11-5   7   8
2017-11-6   11  13


             W1     W2
date        
2017-09-30  -0.2    0.01
2017-10-31  -0.003  0.04

I would like to create a new data frame that contains the following:

            W1 * A       W2 * B
date        
2017-10-5   -0.2 * 2     0.01 * 3
2017-10-6   -0.2 * 5     0.01 * 5
2017-11-5   -0.003 * 7   0.04 * 8
2017-11-6   -0.003 * 11  0.04 * 13

Use np.repeat on the values of df2 and multiply. It looks like the index plays no part here: each row of df2 is simply applied to two consecutive rows of df1.

import numpy as np

# repeat each row of df2 twice, then multiply element-wise by position
df1 = df1.mul(np.repeat(df2.values, 2, axis=0))

Or, more generally,

df1 = df1.mul(np.repeat(df2.values, len(df1) // len(df2), axis=0))
print(df1)
               A     B
date                  
2017-10-5 -0.400  0.03
2017-10-6 -1.000  0.05
2017-11-5 -0.021  0.32
2017-11-6 -0.033  0.52

Here len(df1) // len(df2) is the number of df1 rows covered by each df2 row. This assumes len(df1) is an exact multiple of len(df2) and that every df2 row covers the same number of df1 rows.
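For anyone who wants to run the np.repeat approach end to end, here is a minimal self-contained sketch. The frames are rebuilt from the question's data with plain string index labels, and the names reps, weights and result are only illustrative. The divisibility check is an added safeguard, not part of the answer above.

```python
import numpy as np
import pandas as pd

# Data copied from the question; the index is kept as plain strings here.
df1 = pd.DataFrame(
    {"A": [2, 5, 7, 11], "B": [3, 5, 8, 13]},
    index=pd.Index(["2017-10-5", "2017-10-6", "2017-11-5", "2017-11-6"], name="date"),
)
df2 = pd.DataFrame(
    {"W1": [-0.2, -0.003], "W2": [0.01, 0.04]},
    index=pd.Index(["2017-09-30", "2017-10-31"], name="date"),
)

# Assumption: every row of df2 covers the same number of consecutive df1 rows.
if len(df1) % len(df2) != 0:
    raise ValueError("len(df1) must be an exact multiple of len(df2)")
reps = len(df1) // len(df2)

# np.repeat with axis=0 duplicates each row in place: [row0, row0, row1, row1].
weights = np.repeat(df2.to_numpy(), reps, axis=0)

# Multiplying by a bare ndarray is purely positional; index labels are ignored.
result = df1.mul(weights)
print(result)
```

Because the multiplication is positional, this only gives the intended answer when the rows of df1 are already ordered so that each block of reps rows belongs to the matching df2 row.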

If the index does mean something, i.e. you have a value that takes effect on a certain date and you want to keep using it until the next change, you can use reindex with the argument method='ffill' to create a dataframe that is aligned to the original dataframe. Here's what it looks like:

import pandas as pd
import dateutil.parser  # import the submodule explicitly so dateutil.parser.parse is available

df = pd.DataFrame([['2017-10-5',2,3],
                  ['2017-10-6',5,5],
                  ['2017-11-5',7,8],
                  ['2017-11-6',11,13]],
                  columns = ['date','A','B'])
df['date'] = df['date'].apply(dateutil.parser.parse)
df = df.set_index('date')

wdf = pd.DataFrame([['2017-09-30',-0.2,0.01],
                    ['2017-10-31',-0.03,0.04]],
                     columns=['date','W1','W2'])
wdf['date'] = wdf['date'].apply(dateutil.parser.parse)
wdf = wdf.set_index('date')
wdf_r = wdf.reindex(df.index,
                    method='ffill')

res = df.drop(['A','B'],axis=1).assign(W1_x_A = wdf_r.W1 * df.A,
                                       W2_x_B = wdf_r.W2 * df.B)
print(res)

which outputs:

            W1_x_A  W2_x_B
date                      
2017-10-05   -0.40    0.03
2017-10-06   -1.00    0.05
2017-11-05   -0.21    0.32
2017-11-06   -0.33    0.52
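To see what the forward-fill step does on its own, the following sketch prints the intermediate aligned frame before multiplying. It is a variation rather than the answer's exact code: it parses dates with pd.to_datetime instead of dateutil, multiplies all columns at once by position, and uses the illustrative names aligned and res. The weights are the ones used in the answer's code (-0.03 for the second W1 value, whereas the question shows -0.003).

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [2, 5, 7, 11], "B": [3, 5, 8, 13]},
    index=pd.to_datetime(
        ["2017-10-05", "2017-10-06", "2017-11-05", "2017-11-06"]
    ).rename("date"),
)
wdf = pd.DataFrame(
    {"W1": [-0.2, -0.03], "W2": [0.01, 0.04]},
    index=pd.to_datetime(["2017-09-30", "2017-10-31"]).rename("date"),
)

# For each date in df, take the most recent wdf row dated on or before it.
# wdf's index must be sorted for method="ffill" to work.
aligned = wdf.reindex(df.index, method="ffill")
print(aligned)

# aligned now has one row per row of df, in the same order, so the columns
# can be paired by position (W1 with A, W2 with B).
res = df.mul(aligned.to_numpy())
res.columns = ["W1_x_A", "W2_x_B"]
print(res)
```

A date in df that falls before the first date in wdf has nothing to fill from, so its weights (and the product) would come out as NaN.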
