
Create a new column in a pandas dataframe using an if condition from another dataframe

I have two dataframes as follows:

transactions

    buy_date    buy_price
0   2018-04-16  33.23
1   2018-05-09  33.51
2   2018-07-03  32.74
3   2018-08-02  33.68
4   2019-04-03  33.58

and

cii

    from_fy     to_fy       score
0   2001-04-01  2002-03-31  100
1   2002-04-01  2003-03-31  105
2   2003-04-01  2004-03-31  109
3   2004-04-01  2005-03-31  113
4   2005-04-01  2006-03-31  117

In the transactions dataframe I need to create a new column, cii_score, based on the following condition:

If transactions['buy_date'] is between cii['from_fy'] and cii['to_fy'], take the cii['score'] value for transactions['cii_score'].

I have tried a list comprehension, but it did not work.

I would appreciate your input on how to tackle this.

First, we set up your dataframes. Note that I modified the dates in transactions in this short example to make it more interesting.

import numpy as np  # needed for np.array in the merge step below
import pandas as pd
from io import StringIO
trans_data = StringIO(
    """
,buy_date,buy_price
0,2001-04-16,33.23
1,2001-05-09,33.51
2,2002-07-03,32.74
3,2003-08-02,33.68
4,2003-04-03,33.58
    """
)

cii_data = StringIO(
    """
,from_fy,to_fy,score
0,2001-04-01,2002-03-31,100
1,2002-04-01,2003-03-31,105
2,2003-04-01,2004-03-31,109
3,2004-04-01,2005-03-31,113
4,2005-04-01,2006-03-31,117
    """
)
tr_df = pd.read_csv(trans_data, index_col = 0)
tr_df['buy_date'] = pd.to_datetime(tr_df['buy_date'])

cii_df = pd.read_csv(cii_data, index_col = 0)
cii_df['from_fy'] = pd.to_datetime(cii_df['from_fy'])
cii_df['to_fy'] = pd.to_datetime(cii_df['to_fy'])

The main step is the following calculation: for each row of tr_df, find the index of the row in cii_df whose period contains the buy date. The following computes this match; each element of the list is the matching row index of cii_df:

match = [
    [(f <= d) & (d <= e) for f, e in zip(cii_df['from_fy'], cii_df['to_fy'])].index(True)
    for d in tr_df['buy_date']
]
match

produces

[0, 0, 1, 2, 2]
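One caveat with the list comprehension: list.index(True) raises a ValueError when a buy date falls inside no period, and every date is compared against every period. As a hedged alternative (not part of the original answer), the same positions can be looked up with a pandas IntervalIndex, assuming the periods in cii_df do not overlap. The extra date 2010-01-01 and the names periods, pos and cii_score are made up here for illustration.

```python
import numpy as np
import pandas as pd

cii_df = pd.DataFrame({
    "from_fy": pd.to_datetime(["2001-04-01", "2002-04-01", "2003-04-01", "2004-04-01", "2005-04-01"]),
    "to_fy": pd.to_datetime(["2002-03-31", "2003-03-31", "2004-03-31", "2005-03-31", "2006-03-31"]),
    "score": [100, 105, 109, 113, 117],
})

# Same buy dates as in the answer, plus one illustrative date outside every period.
buy_dates = pd.to_datetime(
    ["2001-04-16", "2001-05-09", "2002-07-03", "2003-08-02", "2003-04-03", "2010-01-01"]
)

# One closed interval [from_fy, to_fy] per row of cii_df (assumed non-overlapping).
periods = pd.IntervalIndex.from_arrays(cii_df["from_fy"], cii_df["to_fy"], closed="both")

# Position of the interval containing each date; -1 where no interval matches.
pos = periods.get_indexer(buy_dates)

# Look up the score by position and blank out the unmatched dates with NaN.
scores = cii_df["score"].to_numpy()
cii_score = np.where(pos >= 0, scores[pos], np.nan)

print(pos)
print(cii_score)
```

For the five dates used in the answer, pos holds the same positions as match, so it can be used in the same way in the steps that follow.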

Now we can merge on this:

tr_df.merge(cii_df, left_on = np.array(match), right_index = True)

so that we get


    key_0  buy_date    buy_price  from_fy     to_fy       score
0   0      2001-04-16  33.23      2001-04-01  2002-03-31  100
1   0      2001-05-09  33.51      2001-04-01  2002-03-31  100
2   1      2002-07-03  32.74      2002-04-01  2003-03-31  105
3   2      2003-08-02  33.68      2003-04-01  2004-03-31  109
4   2      2003-04-03  33.58      2003-04-01  2004-03-31  109

and the score column is what you asked for.
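If only the cii_score column is wanted, without the extra key_0, from_fy and to_fy columns, the matched positions can be used to pick the scores directly. This is a minimal sketch that builds on the answer's match list rather than something stated in the original. It rebuilds the example data inline so that it runs on its own, and it relies on match holding positions, which is why it selects with .iloc.

```python
import pandas as pd

tr_df = pd.DataFrame({
    "buy_date": pd.to_datetime(["2001-04-16", "2001-05-09", "2002-07-03", "2003-08-02", "2003-04-03"]),
    "buy_price": [33.23, 33.51, 32.74, 33.68, 33.58],
})
cii_df = pd.DataFrame({
    "from_fy": pd.to_datetime(["2001-04-01", "2002-04-01", "2003-04-01", "2004-04-01", "2005-04-01"]),
    "to_fy": pd.to_datetime(["2002-03-31", "2003-03-31", "2004-03-31", "2005-03-31", "2006-03-31"]),
    "score": [100, 105, 109, 113, 117],
})

# Position of the first cii_df period that contains each buy date.
match = [
    [(f <= d) and (d <= e) for f, e in zip(cii_df["from_fy"], cii_df["to_fy"])].index(True)
    for d in tr_df["buy_date"]
]

# .iloc selects rows by position; .to_numpy() drops cii_df's index so the
# values are assigned to tr_df in row order instead of being aligned by label.
tr_df["cii_score"] = cii_df["score"].iloc[match].to_numpy()

print(tr_df)
```

The merge in the answer matches on index labels (right_index=True), which coincide with positions here only because cii_df has the default 0..n-1 index. Selecting with .iloc avoids that dependency.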


