Using Pandas Dataframe Append in For Loop
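I keep using pandas.DataFrame.append, but inside the for loop the appended result keeps getting overwritten and the information is duplicated. The output of the for loop should be exactly the same as the original dataframe (I have left out the functions to remove any complexity). Any help would be appreciated.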
import pandas as pd

df = pd.DataFrame({
    'date': ['2019-01-01', '2019-01-01', '2019-01-01',
             '2019-02-01', '2019-02-01', '2019-02-01',
             '2019-03-01', '2019-03-01', '2019-03-01'],
    'Asset': ['Asset A', 'Asset A', 'Asset A', 'Asset B', 'Asset B', 'Asset B',
              'Asset C', 'Asset C', 'Asset C'],
    'Monthly Value': [2100, 8100, 1400, 1400, 3100, 1600, 2400, 2100, 2100]
})
print(df.sort_values(by=['Asset']))
date Asset Monthly Value
0 2019-01-01 Asset A 2100
1 2019-01-01 Asset A 8100
2 2019-01-01 Asset A 1400
3 2019-02-01 Asset B 1400
4 2019-02-01 Asset B 3100
5 2019-02-01 Asset B 1600
6 2019-03-01 Asset C 2400
7 2019-03-01 Asset C 2100
8 2019-03-01 Asset C 2100
This for loop appends to the dataframe multiple times and duplicates rows:
assetlist = list(df['Asset'].unique())
for asset in assetlist:
    df_subset = df[df['Asset'] == asset]
    dfcopy = df_subset.copy()
    newdf = newdf.append(dfcopy)
print(newdf)
This output is incorrect; it should be exactly the same as the original dataframe.
date Asset Monthly Value
6 2019-03-01 Asset C 2400
7 2019-03-01 Asset C 2100
8 2019-03-01 Asset C 2100
6 2019-03-01 Asset C 2400
7 2019-03-01 Asset C 2100
8 2019-03-01 Asset C 2100
0 2019-01-01 Asset A 2100
1 2019-01-01 Asset A 8100
2 2019-01-01 Asset A 1400
3 2019-02-01 Asset B 1400
4 2019-02-01 Asset B 3100
5 2019-02-01 Asset B 1600
6 2019-03-01 Asset C 2400
7 2019-03-01 Asset C 2100
8 2019-03-01 Asset C 2100
0 2019-01-01 Asset A 2100
1 2019-01-01 Asset A 8100
2 2019-01-01 Asset A 1400
3 2019-02-01 Asset B 1400
4 2019-02-01 Asset B 3100
5 2019-02-01 Asset B 1600
6 2019-03-01 Asset C 2400
7 2019-03-01 Asset C 2100
8 2019-03-01 Asset C 2100
0 2019-01-01 Asset A 2100
1 2019-01-01 Asset A 8100
2 2019-01-01 Asset A 1400
3 2019-02-01 Asset B 1400
4 2019-02-01 Asset B 3100
5 2019-02-01 Asset B 1600
6 2019-03-01 Asset C 2400
7 2019-03-01 Asset C 2100
8 2019-03-01 Asset C 2100
0 2019-01-01 Asset A 2100
1 2019-01-01 Asset A 8100
2 2019-01-01 Asset A 1400
3 2019-02-01 Asset B 1400
4 2019-02-01 Asset B 3100
5 2019-02-01 Asset B 1600
6 2019-03-01 Asset C 2400
7 2019-03-01 Asset C 2100
8 2019-03-01 Asset C 2100
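I think you are missing one line. newdf is never reset before the loop, so presumably each run keeps appending to whatever newdf already held from earlier runs: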
assetlist = list(df['Asset'].unique())
newdf = pd.DataFrame() # <-- define it as a data frame
for asset in assetlist:
    df_subset = df[df['Asset'] == asset]
    dfcopy = df_subset.copy()
    newdf = newdf.append(dfcopy)  # note: DataFrame.append needs pandas < 2.0
print(newdf)
date Asset Monthly Value
0 2019-01-01 Asset A 2100
1 2019-01-01 Asset A 8100
2 2019-01-01 Asset A 1400
3 2019-02-01 Asset B 1400
4 2019-02-01 Asset B 3100
5 2019-02-01 Asset B 1600
6 2019-03-01 Asset C 2400
7 2019-03-01 Asset C 2100
8 2019-03-01 Asset C 2100
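One caveat if you are on a recent pandas: DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so the loop above raises AttributeError there. A minimal sketch of the same loop that collects the subsets in a plain list and concatenates once at the end. The name `pieces` is my own and not from the question, and the sample frame is rebuilt in shortened form so the snippet runs on its own:

```python
import pandas as pd

# Same data as in the question, written more compactly.
df = pd.DataFrame({
    'date': ['2019-01-01'] * 3 + ['2019-02-01'] * 3 + ['2019-03-01'] * 3,
    'Asset': ['Asset A'] * 3 + ['Asset B'] * 3 + ['Asset C'] * 3,
    'Monthly Value': [2100, 8100, 1400, 1400, 3100, 1600, 2400, 2100, 2100],
})

pieces = []  # hypothetical name; recreated on every run, so nothing carries over
for asset in df['Asset'].unique():
    df_subset = df[df['Asset'] == asset]
    pieces.append(df_subset.copy())  # list.append, not DataFrame.append

newdf = pd.concat(pieces)  # a single concatenation after the loop
print(newdf.equals(df))    # checks that the rebuilt frame matches the original
```

Because `pieces` starts empty each time the cell or script runs, re-running it should not pile up duplicate rows the way a leftover `newdf` can.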
However, a simpler way is:
newdf = pd.concat([df.query("Asset == @asset") for asset in assetlist])
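Since the question mentions per-asset functions that were left out, here is a hedged sketch of how the split and reassembly could also be done with groupby instead of filtering per asset. This is a swapped-in technique, not what the one-liner above does, and `process` is a hypothetical stand-in for the omitted logic that simply returns a copy:

```python
import pandas as pd

# Same data as in the question, written more compactly.
df = pd.DataFrame({
    'date': ['2019-01-01'] * 3 + ['2019-02-01'] * 3 + ['2019-03-01'] * 3,
    'Asset': ['Asset A'] * 3 + ['Asset B'] * 3 + ['Asset C'] * 3,
    'Monthly Value': [2100, 8100, 1400, 1400, 3100, 1600, 2400, 2100, 2100],
})

def process(subset):
    # Hypothetical placeholder for the per-asset functions the question omitted.
    return subset.copy()

# sort=False keeps the groups in order of first appearance in df.
newdf = pd.concat(
    [process(group) for _, group in df.groupby('Asset', sort=False)]
)
print(newdf.equals(df))  # checks that the rebuilt frame matches the original
```

Each group keeps its original index labels, so when `process` leaves the rows untouched the concatenated result lines up with the source frame.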