How can I find values in df_s_t based on values in the rows of df and save the results in df['s_t']?

I have the following DataFrame (df):

print(df.head())
        Date        Contract_Name   Maturity  ...  Call_Put Option_Price         t
0 2016-01-04  Aalberts Industries 2017-10-20  ...         C        12.29  0.049315
1 2016-01-05  Aalberts Industries 2017-10-20  ...         P         0.01  0.049315
2 2016-01-06  Aalberts Industries 2017-10-20  ...         C        11.29  0.049315
3 2016-01-04  WOLTERS-KLUWER      2017-10-20  ...         P         0.01  0.049315
4 2016-01-05  WOLTERS-KLUWER      2017-10-20  ...         C         9.29  0.049315

I want to add a column df['s_t'] whose values come from a second DataFrame, df_s_t, which looks like this:

print(df_s_t.head())
        Date  Aalberts Industries  ...  UNILEVER WOLTERS-KLUWER
0 2016-01-04               30.125  ...    38.785         30.150
1 2016-01-05               30.095  ...    39.255         30.425
2 2016-01-06               29.405  ...    38.575         29.920
3 2016-01-07               29.005  ...    37.980         30.690
4 2016-01-08               28.930  ...    37.320         30.070

df['Date'] can be matched with df_s_t['Date'], and df['Contract_Name'] can be matched with the column names of df_s_t.

I hope someone can help me create df['s_t'] from the values in df_s_t, as described above. Below is an example of what df should look like afterwards:

print(df.head())
       Date        Contract_Name   Maturity  ...  Call_Put Option_Price         t  s_t
0 2016-01-04  Aalberts Industries 2017-10-20  ...         C        12.29  0.049315 30.125
1 2016-01-05  Aalberts Industries 2017-10-20  ...         P         0.01  0.049315 30.095
2 2016-01-06  Aalberts Industries 2017-10-20  ...         C        11.29  0.049315 29.405
3 2016-01-04  WOLTERS-KLUWER      2017-10-20  ...         P         0.01  0.049315 30.150
4 2016-01-05  WOLTERS-KLUWER      2017-10-20  ...         C         9.29  0.049315 30.425

Solution

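import pandas as pd

# Reshape df_s_t from wide (one column per contract) to long (one row per Date/contract pair).
df_s_t = pd.melt(df_s_t, id_vars=['Date'])
df_s_t = df_s_t.rename(columns={'variable': 'Contract_Name'})
print(df_s_t.head())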
        Date        Contract_Name   value
0 2016-01-04  Aalberts Industries  30.125
1 2016-01-05  Aalberts Industries  30.095
2 2016-01-06  Aalberts Industries  29.405
3 2016-01-07  Aalberts Industries  29.005
4 2016-01-08  Aalberts Industries   28.93

Now we can use merge:

# Left-join the long-format prices onto the option rows, then rename the value column.
df = pd.merge(df, df_s_t, on=['Date', 'Contract_Name'], how='left')
df = df.rename(columns={'value': 's_t'})
print(df.head())

      Date        Contract_Name   Maturity  ...  Option_Price         t  s_t
0 2017-10-02  Aalberts Industries 2017-10-20  ...         12.29  0.049315  41.29
1 2017-10-02  Aalberts Industries 2017-10-20  ...          0.01  0.049315  41.29
2 2017-10-02  Aalberts Industries 2017-10-20  ...         11.29  0.049315  41.29
3 2017-10-02  Aalberts Industries 2017-10-20  ...          0.01  0.049315  41.29
4 2017-10-02  Aalberts Industries 2017-10-20  ...          9.29  0.049315  41.29
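
For readers who want to try the melt-then-merge approach without the original files, here is a minimal self-contained sketch. It rebuilds a few rows from the question by hand (only the columns needed for the lookup, and only two of the contract columns), so the frames are illustrative stand-ins rather than the asker's real data. The name long_s_t and the validate="m:1" check are additions for illustration, not part of the answer above.

```python
import pandas as pd

# Sample rows copied from the question (only the columns needed here).
df = pd.DataFrame({
    "Date": pd.to_datetime(["2016-01-04", "2016-01-05", "2016-01-06",
                            "2016-01-04", "2016-01-05"]),
    "Contract_Name": ["Aalberts Industries"] * 3 + ["WOLTERS-KLUWER"] * 2,
    "Option_Price": [12.29, 0.01, 11.29, 0.01, 9.29],
})

df_s_t = pd.DataFrame({
    "Date": pd.to_datetime(["2016-01-04", "2016-01-05", "2016-01-06"]),
    "Aalberts Industries": [30.125, 30.095, 29.405],
    "WOLTERS-KLUWER": [30.150, 30.425, 29.920],
})

# Wide -> long, naming the new columns directly instead of renaming afterwards.
long_s_t = df_s_t.melt(id_vars="Date", var_name="Contract_Name", value_name="s_t")

# Left merge keeps every option row; validate="m:1" raises if a
# (Date, Contract_Name) pair appears more than once in long_s_t.
result = df.merge(long_s_t, on=["Date", "Contract_Name"], how="left", validate="m:1")
print(result)
```

Both Date columns are built with pd.to_datetime here. If one side holds strings and the other holds datetimes, the merge keys will not line up, so it is worth checking the dtypes before merging.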

Here is a solution for you.
1) I simplified your data: df1 has only 2 columns (Date and Contract_Name) and df2 has only 4 columns (Date, A, B, C).
2) I melt df2 (naming the variable column 'Contract_Name'), then group by Date and Contract_Name.
3) I merge both DataFrames.
4) I print the result.

import pandas as pd
df1 = pd.read_excel('Book1.xlsx', sheet_name='df1')
# Melt the wide sheet to long form, then sum values per (Date, Contract_Name).
df2 = (
    pd.melt(pd.read_excel('Book1.xlsx', sheet_name='df2'),
            id_vars=['Date'], var_name='Contract_Name', value_name='Value')
    .groupby(['Date', 'Contract_Name']).sum().reset_index()
)
df = pd.merge(df1, df2, how='left', on=['Date', 'Contract_Name'])
print(df)
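
The steps above depend on Book1.xlsx, which is not included here. The sketch below swaps the two sheets for small in-memory frames with made-up values (df2_wide and its numbers are hypothetical). One date is repeated on purpose to show what the groupby(...).sum() step does: duplicate (Date, Contract_Name) pairs are collapsed by adding their values together.

```python
import pandas as pd

# Hypothetical in-memory stand-ins for the two Excel sheets.
df1 = pd.DataFrame({
    "Date": pd.to_datetime(["2016-01-04", "2016-01-05", "2016-01-04"]),
    "Contract_Name": ["A", "A", "C"],
})
df2_wide = pd.DataFrame({
    # 2016-01-04 appears twice on purpose to show the effect of groupby().sum().
    "Date": pd.to_datetime(["2016-01-04", "2016-01-04", "2016-01-05"]),
    "A": [1.0, 2.0, 3.0],
    "B": [10.0, 20.0, 30.0],
    "C": [100.0, 200.0, 300.0],
})

# Melt to long form, then aggregate so each (Date, Contract_Name) pair is unique.
df2 = (
    df2_wide.melt(id_vars=["Date"], var_name="Contract_Name", value_name="Value")
    .groupby(["Date", "Contract_Name"], as_index=False)["Value"]
    .sum()
)

merged = df1.merge(df2, how="left", on=["Date", "Contract_Name"])
print(merged)
```

If the wide table already has one row per date, the groupby step changes nothing. For price data with genuine duplicate dates, summing is probably not the right aggregation, and something like drop_duplicates or a mean may fit better, depending on what the duplicates represent.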

