Pandas: create a data frame based on the rows of another data frame

Question

I have a problem, and it is a somewhat complicated one.

I have a data frame like this:

   commodity_name first_delivery_date last_delivery_date  last_trading_date   tenor   delivery_window new_tenor    Vol
   <chr>          <dttm>              <dttm>              <dttm>              <chr>   <chr>           <chr>      <int>
  1 oil            2021-06-01 00:00:00 2021-06-30 00:00:00 2021-04-30 00:00:00 month   Jun 21          Jun 21     29000
  2 gold           2022-03-01 00:00:00 2022-03-31 00:00:00 2022-02-28 00:00:00 month   Mar 22          Mar 22      -800
  3 oil            2021-07-01 00:00:00 2021-07-31 00:00:00 2021-05-31 00:00:00 month   Jul 21          Jul 21    -21000
  4 gold           2021-09-01 00:00:00 2021-09-30 00:00:00 2021-08-31 00:00:00 month   Sep 21          Sep 21      1100
  5 gold           2021-02-01 00:00:00 2021-02-28 00:00:00 2021-01-29 00:00:00 month   Feb 21          Feb 21     -3000
  6 depower        2021-01-01 00:00:00 2021-01-31 00:00:00 2020-12-30 00:00:00 quarter Jan 21          Q1 21         -3
  7 oil            2022-04-01 00:00:00 2022-04-30 00:00:00 2022-02-28 00:00:00 month   Apr 22          Apr 22     23000
  8 czpower        2023-02-01 00:00:00 2023-02-28 00:00:00 2023-01-30 00:00:00 quarter Feb 23          Q1 23         26
  9 oil            2021-02-01 00:00:00 2021-02-28 00:00:00 2020-12-31 00:00:00 quarter Feb 21          Q1 21     -17000
 10 gold           2021-05-01 00:00:00 2021-05-31 00:00:00 2021-04-30 00:00:00 month   May 21          May 21      2400

I want to create another data frame from it, based on the following conditions:

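For year YY, if new_tenor in the old data frame is Q1 YY: create three rows in the new data frame, with new_tenor set to Jan YY, Feb YY, and Mar YY. All other variables stay the same;
If new_tenor in the old data frame is Q2 YY: create three rows in the new data frame, with new_tenor set to Apr YY, May YY, and Jun YY. All other variables stay the same;
If new_tenor in the old data frame is Q3 YY: create three rows in the new data frame, with new_tenor set to Jul YY, Aug YY, and Sep YY. All other variables stay the same;
If new_tenor in the old data frame is Q4 YY: create three rows in the new data frame, with new_tenor set to Oct YY, Nov YY, and Dec YY. All other variables stay the same;
If new_tenor in the old data frame is Cal YY: create six rows in the new data frame, with new_tenor set to Jan YY+1, Feb YY+1, Mar YY+1, Q2 YY+1, Q3 YY+1, and Q4 YY+1, respectively. All other variables stay the same;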
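To make the target concrete, here is a small sketch of the expected result for a single row (row 6 of the table, `Q1 21`), reduced to three columns. The names `old` and `expected` are only for illustration.

```python
import pandas as pd

# Row 6 of the table above, reduced to three columns for brevity.
old = pd.DataFrame(
    {"commodity_name": ["depower"], "new_tenor": ["Q1 21"], "Vol": [-3]}
)

# What the new data frame should contain for that row: one row per month
# of the quarter, with every other column copied unchanged.
expected = pd.DataFrame(
    {
        "commodity_name": ["depower"] * 3,
        "new_tenor": ["Jan 21", "Feb 21", "Mar 21"],
        "Vol": [-3] * 3,
    }
)
print(expected)
```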

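The rule itself is simple: it mainly depends on the value of YY, and everything else in the new data frame is the same as in the old one.

I tried to solve the problem with the following code: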

my_df = []

for index, row in ss.iterrows():

    # d = row["NewTenor"].split()
    # year = d[1]

    print(year)

    if "Cal" in row["NewTenor"]:

        # Go to next year
        # Add Jan, Feb, and Mar
        temp_1 = row
        temp_1['NewTenor'] = temp_1['NewTenor'].replace({'Cal':'Jan','21':'22','22':'23','23':'24'})
        temp_2 = row
        temp_2['NewTenor'] = temp_2['NewTenor'].replace({'Cal':'Feb','21':'22','22':'23','23':'24'})
        temp_3 = row
        temp_3['NewTenor'] = temp_3['NewTenor'].replace({'Cal':'Mar','21':'22','22':'23','23':'24'})

        # Add Q2, Q3, and Q4
        temp_4 = row
        temp_4['NewTenor'] = temp_1['NewTenor'].replace({'Cal':'Q2','21':'22','22':'23','23':'24'})
        temp_5 = row
        temp_5['NewTenor'] = temp_1['NewTenor'].replace({'Cal':'Q3','21':'22','22':'23','23':'24'})
        temp_6 = row
        temp_6['NewTenor'] = temp_1['NewTenor'].replace({'Cal':'Q4','21':'22','22':'23','23':'24'})

        # Append to data frame
        my_df.append(temp_1)
        my_df.append(temp_2)
        my_df.append(temp_3)
        my_df.append(temp_4)
        my_df.append(temp_5)
        my_df.append(temp_6)

    elif "Q1" in row["NewTenor"]:

        # Add Jan, Feb, and Mar
        temp_1 = row
        temp_1['NewTenor'] = temp_1['NewTenor'].replace({'Q1':'Jan'})
        temp_2 = row
        temp_2['NewTenor'] = temp_2['NewTenor'].replace({'Q1':'Feb'})
        temp_3 = row
        temp_3['NewTenor'] = temp_3['NewTenor'].replace({'Q1':'Mar'})

        # Append to data frame
        my_df.append(temp_1)
        my_df.append(temp_2)
        my_df.append(temp_3)

    elif "Q2" in row["NewTenor"]:

        # Add Apr, May, and Jun
        temp_1 = row
        temp_1['NewTenor'] = temp_1['NewTenor'].replace({'Q2':'Apr'})
        temp_2 = row
        temp_2['NewTenor'] = temp_2['NewTenor'].replace({'Q2':'May'})
        temp_3 = row
        temp_3['NewTenor'] = temp_3['NewTenor'].replace({'Q2':'Jun'})

        # Append to data frame
        my_df.append(temp_1)
        my_df.append(temp_2)
        my_df.append(temp_3)

    elif "Q3" in row["NewTenor"]:

        # Add Jul, Aug, and Sep
        temp_1 = row
        temp_1['NewTenor'] = temp_1['NewTenor'].replace({'Q3':'Jul'})
        temp_2 = row
        temp_2['NewTenor'] = temp_2['NewTenor'].replace({'Q3':'Aug'})
        temp_3 = row
        temp_3['NewTenor'] = temp_3['NewTenor'].replace({'Q3':'Sep'})

        # Append to data frame
        my_df.append(temp_1)
        my_df.append(temp_2)
        my_df.append(temp_3)

    else:

        # Add Oct, Nov, and Dec
        temp_1 = row
        temp_1['NewTenor'] = temp_1['NewTenor'].replace({'Q4':'Oct'})
        temp_2 = row
        temp_2['NewTenor'] = temp_2['NewTenor'].replace({'Q4':'Nov'})
        temp_3 = row
        temp_3['NewTenor'] = temp_3['NewTenor'].replace({'Q4':'Dec'})

        # Append to data frame
        my_df.append(temp_1)
        my_df.append(temp_2)
        my_df.append(temp_3)

my_df = pd.DataFrame(my_df)

The logic is not complicated, yet the code always gives me errors.

Can someone help me create the new data frame? Thanks in advance.
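As far as can be told from the snippet, two things in the loop would go wrong regardless of the data: Python's `str.replace` takes an old and a new substring rather than a dict, and `temp_1 = row` binds a second name to the same Series instead of copying it, so every "temp" row ends up identical. (With the `year` lines commented out, `print(year)` would also raise a `NameError`.) A minimal sketch of both points, using a hypothetical one-row Series in place of a real `iterrows()` row:

```python
import pandas as pd

# Hypothetical stand-in for one row yielded by iterrows().
row = pd.Series({"commodity_name": "oil", "NewTenor": "Q1 21"})

# 1) str.replace expects (old, new); passing a dict raises TypeError.
try:
    row["NewTenor"].replace({"Q1": "Jan"})
    dict_replace_failed = False
except TypeError:
    dict_replace_failed = True
print("dict argument rejected:", dict_replace_failed)

# 2) Plain assignment does not copy: alias and row are the same object.
alias = row
alias["NewTenor"] = "Jan 21"
print(row["NewTenor"])  # row was modified through alias

# 3) An explicit copy keeps each generated row independent.
row["NewTenor"] = "Q1 21"  # restore the original value
feb = row.copy()
feb["NewTenor"] = row["NewTenor"].replace("Q1", "Feb")
print(row["NewTenor"], "|", feb["NewTenor"])
```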

Answer 1

If I understand correctly:

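def split_tenor(tenor):
    # Turn one tenor string into the list of tenors it expands to.
    start, year = tenor.split(" ")
    if start == "Cal":
        months = ["Jan", "Feb", "Mar", "Q2", "Q3", "Q4"]
        # Cal YY refers to the following year; keep the two-digit form.
        year = f"{int(year) + 1:02d}"
    elif start == "Q1":
        months = ["Jan", "Feb", "Mar"]
    elif start == "Q2":
        months = ["Apr", "May", "Jun"]
    elif start == "Q3":
        months = ["Jul", "Aug", "Sep"]
    elif start == "Q4":
        months = ["Oct", "Nov", "Dec"]
    else:
        # Monthly tenors are left unchanged; explode keeps scalars as-is.
        return tenor

    return [f"{m} {year}" for m in months]

# df is the data frame from the question.
df["new_tenor"] = df["new_tenor"].apply(split_tenor)
# explode returns a new data frame with one row per list element.
new_df = df.explode("new_tenor")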
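For a quick check, here is a self-contained variant of the same idea with the branches collapsed into a lookup table, run on a tiny sample. This is only a sketch: `QUARTERS`, `CAL_PARTS`, and `expand_tenor` are names made up here, the sample keeps three columns, and the `Cal 21` row (with its `Vol` of 500) is invented because the question's table contains no `Cal` tenor.

```python
import pandas as pd

# Months covered by each quarter; "Cal" is handled separately below.
QUARTERS = {
    "Q1": ["Jan", "Feb", "Mar"],
    "Q2": ["Apr", "May", "Jun"],
    "Q3": ["Jul", "Aug", "Sep"],
    "Q4": ["Oct", "Nov", "Dec"],
}
CAL_PARTS = ["Jan", "Feb", "Mar", "Q2", "Q3", "Q4"]


def expand_tenor(tenor):
    """Return the list of tenors that one input tenor expands to."""
    prefix, year = tenor.split(" ")
    if prefix == "Cal":
        # Cal YY maps to the following year, kept as two digits.
        next_year = f"{int(year) + 1:02d}"
        return [f"{part} {next_year}" for part in CAL_PARTS]
    if prefix in QUARTERS:
        return [f"{month} {year}" for month in QUARTERS[prefix]]
    return [tenor]  # monthly tenors pass through unchanged


# Rows 1 and 6 of the question's table plus an invented "Cal 21" row.
df = pd.DataFrame(
    {
        "commodity_name": ["oil", "depower", "gold"],
        "new_tenor": ["Jun 21", "Q1 21", "Cal 21"],
        "Vol": [29000, -3, 500],
    }
)

new_df = (
    df.assign(new_tenor=df["new_tenor"].map(expand_tenor))
    .explode("new_tenor")       # one row per element of each list
    .reset_index(drop=True)     # explode repeats the original index
)
print(new_df)
```

Using `assign` leaves the original `df` untouched, and `reset_index(drop=True)` replaces the repeated index labels that `explode` carries over from the source rows.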

Solution 1
1 accepted 2022-05-24 18:57:48
