I have a pandas data frame in which one variable consists of quarters and years. I want to split the quarters into months, and the years into months and quarters. The original data frame looks like this:
     volume    tenor NewTenor
170      -3  quarter    Q1 21
516      -3  quarter    Q1 22
597      22  quarter    Q1 22
622      -3     year   Cal 21
625      22  quarter    Q2 22
657     -14     year   Cal 21
678     -16  quarter    Q1 22
704      -7     year   Cal 21
750     -16  quarter    Q1 22
934     -10     year   Cal 21
Using the following code:
def split_tenor(tenor):
    start, year = tenor.split(" ")
    if start == "Cal":
        months = ["Jan", "Feb", "Mar", "Q2", "Q3", "Q4"]
        year = int(year) + 1
    elif start == "Q1":
        months = ["Jan", "Feb", "Mar"]
    elif start == "Q2":
        months = ["Apr", "May", "Jun"]
    elif start == "Q3":
        months = ["Jul", "Aug", "Sep"]
    elif start == "Q4":
        months = ["Oct", "Nov", "Dec"]
    else:
        return tenor
    return [f"{m} {year}" for m in months]

my_data["NewTenor"] = my_data["NewTenor"].apply(split_tenor)
my_data = my_data.explode("NewTenor")
I managed to turn it into this:
     volume    tenor NewTenor
170      -3  quarter   Jan 21
170      -3  quarter   Feb 21
170      -3  quarter   Mar 21
516      -3  quarter   Jan 22
516      -3  quarter   Feb 22
516      -3  quarter   Mar 22
597      22  quarter   Jan 22
597      22  quarter   Feb 22
597      22  quarter   Mar 22
622      -3     year   Jan 22
622      -3     year   Feb 22
622      -3     year   Mar 22
622      -3     year    Q2 22
622      -3     year    Q3 22
622      -3     year    Q4 22
625      22  quarter   Apr 22
625      22  quarter   May 22
625      22  quarter   Jun 22
657     -14     year   Jan 22
657     -14     year   Feb 22
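However, the values in the volume column stay unchanged, although they should be split along with the period (for example, when a quarter is split into months, its volume should also be split into three equal parts).
Could someone help me split the volume at the same time as I split the periods? Thank you.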
Edit
The correct code should take something like this:
df_1 = pd.DataFrame({'volume': [12, 9],
                     'tenor': ['year', 'quarter'],
                     'NewTenor': ['Cal 21', 'Q2 22']})
and return:
df_2 = pd.DataFrame({'volume': [1, 1, 1, 3, 3, 3, 3, 3, 3],
                     'tenor': ['year', 'year', 'year', 'year', 'year', 'year', 'quarter', 'quarter', 'quarter'],
                     'NewTenor': ['Jan 21', 'Feb 21', 'Mar 21', 'Q2 21', 'Q3 21', 'Q4 21', 'Apr 22', 'May 22', 'Jun 22']})
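The arithmetic behind this expected output can be checked with a short sketch: each month of a year receives 1/12 of the yearly volume, each remaining quarter receives 1/4, and each month of a quarter receives 1/3, so the pieces should add back up to the original volume. The names `source_row` and `totals` are illustrative only and do not appear in the question.

```python
import numpy as np
import pandas as pd

df_1 = pd.DataFrame({'volume': [12, 9],
                     'tenor': ['year', 'quarter'],
                     'NewTenor': ['Cal 21', 'Q2 22']})
df_2 = pd.DataFrame({'volume': [1, 1, 1, 3, 3, 3, 3, 3, 3],
                     'tenor': ['year'] * 6 + ['quarter'] * 3,
                     'NewTenor': ['Jan 21', 'Feb 21', 'Mar 21', 'Q2 21', 'Q3 21', 'Q4 21',
                                  'Apr 22', 'May 22', 'Jun 22']})

# Tag every output row with the input row it came from:
# the year produces 6 pieces, the quarter produces 3.
source_row = np.repeat([0, 1], [6, 3])

# Summing the pieces per input row should give back the original volumes.
totals = df_2.groupby(source_row)['volume'].sum()
print(totals.tolist() == df_1['volume'].tolist())
```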
You can do this by modifying your existing code so that it creates two columns at once:
import pandas as pd

df = pd.DataFrame({'volume': [12, 9],
                   'tenor': ['year', 'quarter'],
                   'NewTenor': ['Cal 21', 'Q2 22']})

def split_tenor(row):
    start, year = row['NewTenor'].split(" ")
    if start == "Cal":
        months = ["Jan", "Feb", "Mar", "Q2", "Q3", "Q4"]
        # Kept from the question's code; the expected output in the edit
        # maps 'Cal 21' to 'Jan 21', which would mean dropping the + 1.
        year = int(year) + 1
    elif start == "Q1":
        months = ["Jan", "Feb", "Mar"]
    elif start == "Q2":
        months = ["Apr", "May", "Jun"]
    elif start == "Q3":
        months = ["Jul", "Aug", "Sep"]
    elif start == "Q4":
        months = ["Oct", "Nov", "Dec"]
    else:
        # Unrecognised tenor: return the row's values unchanged.
        return row['NewTenor'], row['volume']
    if start == "Cal":
        # A year yields three months (1/12 each) and three quarters (1/4 each).
        split_vol = [row['volume'] / 12] * 3 + [row['volume'] / 4] * 3
    else:
        split_vol = [row['volume'] / len(months)] * len(months)
    return [f"{m} {year}" for m in months], split_vol

df["NewTenor"], df["NewVolume"] = zip(*df[["volume", "NewTenor"]].apply(split_tenor, axis=1))
# Exploding several columns at once needs pandas 1.3 or later;
# on an earlier version, explode the two columns separately instead.
df = df.explode(["NewTenor", "NewVolume"])
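For pandas versions that cannot explode several columns in one call, one possible fallback is to explode a single column and flatten the other by hand. This is only a sketch, and it assumes that both list columns have the same length in every row. The frame below is a hand-built stand-in for the state just before the final explode call, and the name `result` is illustrative.

```python
import pandas as pd

# Stand-in for the frame after split_tenor has filled both list columns.
df = pd.DataFrame({
    "volume": [12, 9],
    "tenor": ["year", "quarter"],
    "NewTenor": [["Jan 21", "Feb 21", "Mar 21", "Q2 21", "Q3 21", "Q4 21"],
                 ["Apr 22", "May 22", "Jun 22"]],
    "NewVolume": [[1.0, 1.0, 1.0, 3.0, 3.0, 3.0],
                  [3.0, 3.0, 3.0]],
})

# Explode one list column; every other column (including the NewVolume
# lists) is simply repeated for each new row.
result = df.explode("NewTenor")

# Overwrite the repeated lists with the flattened volumes. Row order is
# preserved by explode, so the values line up position by position.
result["NewVolume"] = [v for vols in df["NewVolume"] for v in vols]

print(result)
```

The original `volume` column is left in place here, so the split values in `NewVolume` can be compared against it.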