Sum DataFrame row values grouped by column in another DataFrame
I have two DataFrames, as shown below:
import pandas as pd

df2 = pd.DataFrame({
'Code':['ABC','DEF','GHI','JKL','MNO'],
'2000': [19647.0, 1814135.0, 1864791.0,261630.0, 20758.0],
'2001': [1762621.0,1814135.0,1864791.0,1914573.0,1965598.0],
'2002': [25998340.0,26920466.0,207633.0,28813463.0,29784193.0] })
df2.set_index('Code')
df3 = pd.DataFrame({
'Code':['ABC','DEF','GHI','JKL','MNO'],
'Groups': ['Group A', 'Group B', 'Group C','Group B', 'Group A']})
df3.set_index('Code')
I need to get the total for each year, per group. For example, the sum for Group A in 2000 is 40405.0.
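For the grouper, map the index of df2 to the 'Groups' values, then take the sum. Also, you call set_index but never assign the result back, so you should write df2 = df2.set_index('Code') (and likewise for df3), although the codes do not have to be the index to solve this problem.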
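# Assign the result of set_index back so that both frames are indexed by 'Code'
df2 = df2.set_index('Code')
df3 = df3.set_index('Code')
# Map each code in df2's index to its group, then sum within each group
df2.groupby(df2.index.map(df3['Groups'])).sum()
#              2000       2001        2002
#Code
#Group A    40405.0  3728219.0  55782533.0
#Group B  2075765.0  3728708.0  55733929.0
#Group C  1864791.0  1864791.0    207633.0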
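The remark that the codes do not need to be the index can be sketched as follows. This is only an illustration built on the question's data; the names `code_to_group`, `group_key`, `year_cols` and `totals` are chosen for this sketch and do not come from the original answer.

```python
import pandas as pd

df2 = pd.DataFrame({
    'Code': ['ABC', 'DEF', 'GHI', 'JKL', 'MNO'],
    '2000': [19647.0, 1814135.0, 1864791.0, 261630.0, 20758.0],
    '2001': [1762621.0, 1814135.0, 1864791.0, 1914573.0, 1965598.0],
    '2002': [25998340.0, 26920466.0, 207633.0, 28813463.0, 29784193.0],
})
df3 = pd.DataFrame({
    'Code': ['ABC', 'DEF', 'GHI', 'JKL', 'MNO'],
    'Groups': ['Group A', 'Group B', 'Group C', 'Group B', 'Group A'],
})

# Lookup Series from code to group (hypothetical helper name).
code_to_group = df3.set_index('Code')['Groups']

# Translate each row's code into its group; df2 itself keeps its default index.
group_key = df2['Code'].map(code_to_group).rename('Groups')

# Sum only the year columns so the string 'Code' column is left out.
year_cols = ['2000', '2001', '2002']
totals = df2.groupby(group_key)[year_cols].sum()
print(totals)
```

Selecting the year columns explicitly is a choice made here so that the text column never takes part in the sum, whichever pandas version is in use.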
I would try something like this:
df2 = pd.DataFrame({
'Code':['ABC','DEF','GHI','JKL','MNO'],
'2000': [19647.0, 1814135.0, 1864791.0,261630.0, 20758.0],
'2001': [1762621.0,1814135.0,1864791.0,1914573.0,1965598.0],
'2002': [25998340.0,26920466.0,207633.0,28813463.0,29784193.0] })
df3 = pd.DataFrame({
'Code':['ABC','DEF','GHI','JKL','MNO'],
'Groups': ['Group A', 'Group B', 'Group C','Group B', 'Group A']})
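# Attach each code's group to its yearly values
df3 = df3.merge(df2, on='Code')
# numeric_only=True keeps the string 'Code' column out of the sum
df3.groupby('Groups').sum(numeric_only=True)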
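A slightly more defensive variant of the merge approach might look like the sketch below. It assumes every code appears exactly once in each frame, which holds for the sample data but is an assumption about real data; the names `merged`, `year_cols` and `totals` are illustrative only.

```python
import pandas as pd

df2 = pd.DataFrame({
    'Code': ['ABC', 'DEF', 'GHI', 'JKL', 'MNO'],
    '2000': [19647.0, 1814135.0, 1864791.0, 261630.0, 20758.0],
    '2001': [1762621.0, 1814135.0, 1864791.0, 1914573.0, 1965598.0],
    '2002': [25998340.0, 26920466.0, 207633.0, 28813463.0, 29784193.0],
})
df3 = pd.DataFrame({
    'Code': ['ABC', 'DEF', 'GHI', 'JKL', 'MNO'],
    'Groups': ['Group A', 'Group B', 'Group C', 'Group B', 'Group A'],
})

# validate='one_to_one' makes merge raise if a code is duplicated in either
# frame, which would otherwise silently inflate the sums.
merged = df3.merge(df2, on='Code', validate='one_to_one')

# Sum only the year columns within each group.
year_cols = ['2000', '2001', '2002']
totals = merged.groupby('Groups')[year_cols].sum()
print(totals)
```

With the question's data this should reproduce the Group A total of 40405.0 for 2000 that the asker expects.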