
How to run a custom function with Groupby and Apply in Pandas

I am trying to run a custom function on a Pandas dataframe so that it runs on the rows for one name and gives me output, then runs on the rows for the next name, and so on. But I am stuck and can't seem to figure out how to get this done.

INPUT:
          NAME   STEPS
    0   Andrew    PASS
    1   Andrew    PASS
    2   Andrew    PASS
    3   Sam       PASS
    4   Sam       PASS

CODE:
def my_function(df):
    # consecutive passes and strikes
    consecutive_passes = 0
    consecutive_passes_list = []

    points = 0
    points_list = []

    running_count = 0
    running_count_list = []

    fails = 0

    for i in range(len(df)):
        if df.STEPS[i] == "PASS":
            consecutive_passes += 1
            if consecutive_passes >= 11:
                points = 2
                consecutive_passes_list.append(consecutive_passes)
                points_list.append(points)
    #             print("PASS", consecutive_passes, points)
            else:
                consecutive_passes_list.append(consecutive_passes)
                points_list.append(points)
    #             print("PASS", consecutive_passes, points)

        if df.STEPS[i] == "FAIL":
            consecutive_passes = 0
            fails += 1
            points -= 1
            if points == -1:
                points = 0
                consecutive_passes_list.append(consecutive_passes)
                points_list.append(points)
    #             print("FAIL", consecutive_passes, points)
            else:
                consecutive_passes_list.append(consecutive_passes)
                points_list.append(points)
    #             print("FAIL", consecutive_passes, points)
    
    df["CONSECUTIVE_PASSES"] = consecutive_passes_list
    df["POINTS"] = points_list
    
    # inspection rate (low_risk is a lookup keyed by consecutive passes; its definition is not shown here)
    inspection_rate = []
    for i in range(len(df)):
        if df.POINTS[i] == 0:
            inspection_rate.append(low_risk[df.CONSECUTIVE_PASSES[i]])
        if df.POINTS[i] == 2:
            ir = low_risk[df.CONSECUTIVE_PASSES[i]]
            inspection_rate.append(ir)
        if df.POINTS[i] == 1:
            inspection_rate.append(ir)
    
    df["INSPECTION_RATE"] = inspection_rate
    
    return df.tail(1)  # last row only; tail() with no argument returns the last five rows

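I really need help figuring out how to run the function for each name and return the last row of the dataframe. If anyone can help me get over the finish line, that would be great. Thanks!!!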

ERROR UPDATE:

<ipython-input-74-41b5a1cc100d>:36: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["CONSECUTIVE_PASSES"] = consecutive_passes_list
<ipython-input-74-41b5a1cc100d>:37: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["STRIKES"] = strikes_list
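
The SettingWithCopyWarning above typically appears when new columns are assigned on a filtered subset of another DataFrame. A minimal sketch, assuming the five-row input shown earlier (the POINTS value is a placeholder for illustration, not the real scoring): taking an explicit copy before assigning makes the subset independent of the original frame.

```python
import pandas as pd

# Sample data mirroring the INPUT table in the question.
df = pd.DataFrame({
    "NAME": ["Andrew", "Andrew", "Andrew", "Sam", "Sam"],
    "STEPS": ["PASS", "PASS", "PASS", "PASS", "PASS"],
})

# .copy() makes the subset its own DataFrame, so adding a column
# does not touch (or warn about) the original df.
subset = df[df["NAME"] == "Sam"].copy()
subset["POINTS"] = 0  # placeholder value, an assumption for this sketch
```

The original `df` keeps only its NAME and STEPS columns; the new column exists on `subset` alone.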

Assuming your function does what it is supposed to do, you can run it like this to get what you want.

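results = {}
for name in df.NAME.unique():
    # reset_index gives each subset a fresh 0..n-1 index, which the df.STEPS[i] lookups in my_function rely on
    results[name] = my_function(df[df["NAME"] == name].reset_index(drop=True))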
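
Since the question asks about groupby, the same per-name loop can also be sketched with `DataFrame.groupby`. This is only an illustration under stated assumptions: `score_group` is a hypothetical, simplified stand-in for `my_function` that computes just the consecutive-pass count, because the `low_risk` lookup is not shown in the question. The sketch iterates over the groups explicitly rather than calling `.apply`, since the handling of the grouping column inside `apply` has changed across pandas versions.

```python
import pandas as pd

def score_group(group):
    """Hypothetical stand-in for my_function: count consecutive passes."""
    # Fresh 0..n-1 index and an independent copy, so positional
    # lookups work and no SettingWithCopyWarning is raised.
    group = group.reset_index(drop=True).copy()
    consecutive = 0
    counts = []
    for step in group["STEPS"]:
        # A PASS extends the streak; anything else resets it.
        consecutive = consecutive + 1 if step == "PASS" else 0
        counts.append(consecutive)
    group["CONSECUTIVE_PASSES"] = counts
    return group.tail(1)  # last row of this name's group

# Sample data mirroring the INPUT table in the question.
df = pd.DataFrame({
    "NAME": ["Andrew", "Andrew", "Andrew", "Sam", "Sam"],
    "STEPS": ["PASS", "PASS", "PASS", "PASS", "PASS"],
})

# groupby yields (name, sub-frame) pairs; collect each group's last row.
last_rows = pd.concat(
    [score_group(group) for _, group in df.groupby("NAME")],
    ignore_index=True,
)
print(last_rows)
```

The result has one row per name, so swapping `score_group` for the real function (once it returns `df.tail(1)`) should give the last row for each name in a single frame.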
