
Pandas: Create a column as a function of another data frame's column

I have a data frame in pandas in which one of the variables consists of quarters and years. I want to split the quarters into months, and the years into months and quarters. The original data frame looks like this:

        volume  tenor       NewTenor
170     -3      quarter     Q1 21
516     -3      quarter     Q1 22
597     22      quarter     Q1 22
622     -3      year        Cal 21
625     22      quarter     Q2 22
657     -14     year        Cal 21
678     -16     quarter     Q1 22
704     -7      year        Cal 21
750     -16     quarter     Q1 22
934     -10     year        Cal 21

Using the following code:

def split_tenor(tenor):
    start, year = tenor.split(" ")
    if start == "Cal":
        months = ["Jan", "Feb", "Mar", "Q2", "Q3", "Q4"]
        year = int(year) + 1
    elif start == "Q1":
        months = ["Jan", "Feb", "Mar"]
    elif start == "Q2":
        months = ["Apr", "May", "Jun"]
    elif start == "Q3":
        months = ["Jul", "Aug", "Sep"]
    elif start == "Q4":
        months = ["Oct", "Nov", "Dec"]
    else:
        return tenor

    return [f"{m} {year}" for m in months]

my_data["NewTenor"] = my_data["NewTenor"].apply(split_tenor)
my_data = my_data.explode("NewTenor")

I managed to turn it into this:

        volume  tenor       NewTenor
170     -3      quarter     Jan 21
170     -3      quarter     Feb 21
170     -3      quarter     Mar 21
516     -3      quarter     Jan 22
516     -3      quarter     Feb 22
516     -3      quarter     Mar 22
597     22      quarter     Jan 22
597     22      quarter     Feb 22
597     22      quarter     Mar 22
622     -3      year        Jan 22
622     -3      year        Feb 22
622     -3      year        Mar 22
622     -3      year        Q2 22
622     -3      year        Q3 22
622     -3      year        Q4 22
625     22      quarter     Apr 22
625     22      quarter     May 22
625     22      quarter     Jun 22
657     -14     year        Jan 22
657     -14     year        Feb 22

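However, the volume in the data frame stays the same, although it should be split across the new periods accordingly (for example, when a quarter is split into months, its volume should also be split into three equal parts).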
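
The unchanged volume follows from how `explode` works: it gives each list element its own row and repeats the values of every other column as they are. A minimal sketch with a made-up one-row frame (the names `demo`, `exploded` and `shares` are only illustrative, not part of the original data):

```python
import pandas as pd

# One quarterly row with volume 9; illustrative data only.
demo = pd.DataFrame({"volume": [9],
                     "NewTenor": [["Apr 22", "May 22", "Jun 22"]]})

# explode repeats the non-exploded columns, so each month still carries 9.
exploded = demo.explode("NewTenor")
print(exploded["volume"].tolist())

# Dividing by the list length before exploding yields equal shares instead.
shares = demo.assign(volume=demo["volume"] / demo["NewTenor"].map(len))
shares = shares.explode("NewTenor")
print(shares["volume"].tolist())
```

Equal shares like this only fit the quarter rows. A `Cal` row mixes months and quarters, so its pieces would need different weights.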

Can someone help me split the volume at the same time as I split the periods? Thank you.

Edit

The correct code should behave as follows. For this input:

df_1 = pd.DataFrame({'volume':[12, 9],
               'tenor':['year', 'quarter'],
               'NewTenor':['Cal 21', 'Q2 22']})

it should return:

df_2 = pd.DataFrame({'volume':[1, 1, 1, 3, 3, 3, 3, 3, 3],
               'tenor':['year', 'year', 'year', 'year', 'year', 'year', 'quarter', 'quarter', 'quarter'],
               'NewTenor':['Jan 21', 'Feb 21', 'Mar 21', 'Q2 21', 'Q3 21', 'Q4 21', 'Apr 22', 'May 22', 'Jun 22']})
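
One property of the expected result that any solution can be checked against is that the split volumes add back up to the original volume. A small sketch of that check, reusing `df_1` and `df_2` from above; grouping by `tenor` only works here because each input row happens to have a distinct tenor, which is an assumption about this tiny example:

```python
import pandas as pd

df_1 = pd.DataFrame({'volume': [12, 9],
                     'tenor': ['year', 'quarter'],
                     'NewTenor': ['Cal 21', 'Q2 22']})

df_2 = pd.DataFrame({'volume': [1, 1, 1, 3, 3, 3, 3, 3, 3],
                     'tenor': ['year'] * 6 + ['quarter'] * 3,
                     'NewTenor': ['Jan 21', 'Feb 21', 'Mar 21', 'Q2 21', 'Q3 21',
                                  'Q4 21', 'Apr 22', 'May 22', 'Jun 22']})

# Sum the split volumes per tenor and compare with the original volumes.
totals = df_2.groupby('tenor')['volume'].sum()
original = df_1.set_index('tenor')['volume']
conserved = all(totals[name] == vol for name, vol in original.items())
print(totals.to_dict(), conserved)
```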

You can do this by modifying your existing code to create two columns at once:

import pandas as pd

df = pd.DataFrame({'volume':[12, 9],
                   'tenor':['year', 'quarter'],
                   'NewTenor':['Cal 21', 'Q2 22']})

def split_tenor(row):
    start, year = row['NewTenor'].split(" ")
    if start == "Cal":
        months = ["Jan", "Feb", "Mar", "Q2", "Q3", "Q4"]
        # carried over from the question's code; drop the + 1 if 'Cal 21' should become 'Jan 21'
        year = int(year) + 1
    elif start == "Q1":
        months = ["Jan", "Feb", "Mar"]
    elif start == "Q2":
        months = ["Apr", "May", "Jun"]
    elif start == "Q3":
        months = ["Jul", "Aug", "Sep"]
    elif start == "Q4":
        months = ["Oct", "Nov", "Dec"]
    else:
        # unknown prefix: keep tenor and volume unchanged, as one-element lists
        return [row['NewTenor']], [row['volume']]

    if start == "Cal":
        # a year becomes three months (1/12 each) plus three quarters (1/4 each)
        split_vol = [row['volume']/12] * 3 + [row['volume']/4] * 3
    else:
        # a quarter is split evenly over its months
        split_vol = [row['volume']/len(months)] * len(months)

    return [f"{m} {year}" for m in months], split_vol

df["NewTenor"], df["NewVolume"] = zip(*df[["volume", "NewTenor"]].apply(split_tenor, axis = 1))
# exploding several columns at once needs pandas 1.3 or later;
# on an earlier version, change this line to explode the two columns separately
df = df.explode(["NewTenor", "NewVolume"])
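
For the older-pandas case mentioned in the comment, one possible way to explode the two columns separately is sketched below. It assumes the frame is in the state just before the final `explode` call, with both columns holding lists of equal length per row. The sample values and the names `new_volume` and `out` are illustrative.

```python
import pandas as pd

# Illustrative frame: each row holds two lists of matching length.
df = pd.DataFrame({
    "tenor": ["year", "quarter"],
    "NewTenor": [["Jan 22", "Feb 22", "Mar 22", "Q2 22", "Q3 22", "Q4 22"],
                 ["Apr 22", "May 22", "Jun 22"]],
    "NewVolume": [[1.0, 1.0, 1.0, 3.0, 3.0, 3.0], [3.0, 3.0, 3.0]],
})

# Explode the volume lists on their own; the result has object dtype,
# so cast it back to float.
new_volume = df["NewVolume"].explode().astype(float)

# Explode the tenor lists with the single-column form, which repeats the
# remaining columns, then attach the volumes by position. Both explosions
# walk the rows in the same order, so the values line up.
out = df.drop(columns="NewVolume").explode("NewTenor")
out["NewVolume"] = new_volume.to_numpy()
print(out)
```

Assigning by position with `to_numpy()` sidesteps alignment on the duplicated index that `explode` produces.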

