Subtracting minimum values of a certain pandas dataframe column based on another column

I have a huge pandas DataFrame df, sorted by id and then by year:

id        gender        year
3         male          1983
3         male          1983
3         male          1985
3         male          1990
6         female        1991
6         female        1992
7         male          1980
...
592873    female        1989
592873    female        1996
593001    male          2001
593428    female        2007
593428    female        2009

My goal is to create another column, ca, calculated as:

  • year - the minimum year for that id

So for the first six rows of df, the result should be:

id        gender        year        ca
3         male          1983        0
3         male          1983        0
3         male          1985        2
3         male          1990        7
6         female        1991        0
6         female        1992        1

(In other words, I am looking for a Pythonic answer to this question.)


One solution I can think of is to build a list with a for loop:

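ca_list = []

for i in range(len(df)):
  # Start a new group on the first row or whenever the id changes
  if i == 0 or df['id'].iloc[i] != df['id'].iloc[i-1]:
    # df is sorted by id, then year, so this is the minimum year for the id
    num = df['year'].iloc[i]
    ca_list.append(0)
  else:
    ca_list.append(df['year'].iloc[i] - num)

df['ca'] = ca_list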

But I believe there is a more efficient way to do this. Any insight is much appreciated.

Try:

df["ca"] = df.groupby("id")["year"].transform(lambda x: x - x.min())
print(df)

Prints:

   id  gender  year  ca
0   3    male  1983   0
1   3    male  1983   0
2   3    male  1985   2
3   3    male  1990   7
4   6  female  1991   0
5   6  female  1992   1
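
A possible variant, offered as a sketch rather than as part of the original answer: passing the string "min" to transform lets pandas use its built-in group reduction instead of calling a Python lambda once per group, which tends to matter on a very large frame. The six-row sample frame and the name min_year below are assumptions made for illustration.

```python
import pandas as pd

# Sample data copied from the first six rows shown in the question
df = pd.DataFrame({
    "id": [3, 3, 3, 3, 6, 6],
    "gender": ["male", "male", "male", "male", "female", "female"],
    "year": [1983, 1983, 1985, 1990, 1991, 1992],
})

# Broadcast each id's minimum year back onto every row of that id
min_year = df.groupby("id")["year"].transform("min")

# Plain vectorized subtraction, aligned on the original index
df["ca"] = df["year"] - min_year
print(df)
```

Unlike the for loop in the question, this does not rely on df being sorted, because the minimum is computed per id regardless of row order.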
