Pandas: optimal way to subtract every nth row

Question

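I am writing a function for a special case of row-wise subtraction in pandas.

First, the user should be able to specify rows either by a regular expression (i.e. "_BL[0-9]+") or by regular index (i.e. every 6th row).
Then each matching row must be subtracted from the rows preceding it, but not past another match.
[Optional] Delete the selected rows.
The column to match on should be user-defined, by index or label.

For example, given: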

Sample	var1	var1
something	10	20
something	20	30
something	40	30
some_BL20_thing	100	100
something	50	70
something	90	100
some_BL10_thing	100	10

The expected output should be:

Sample	var1	var1
something	-90	-80
something	-80	-70
something	-60	-70
something	-50	60
something	-10	90

My current (incomplete) implementation relies heavily on loops:

    def subtract_blanks(data: pd.DataFrame, num_samples: int) -> pd.DataFrame:
        '''
        Accepts a data dataframe and a mod int and subtracts each blank
        from all mod preceding samples
        '''
        expr = compile(r'(_BL[0-9]{1})')
        output = data.copy(deep=True)
        for idx, row in output.iterrows():
            if search(expr, row['Sample']):
                for i in range(1, num_samples + 1):
                    output.iloc[idx - i, data_start:] = output.iloc[idx - i, 6:] - row.iloc[6:]
        return output

Is there a better way to do this? This implementation looks ugly. I also considered splitting the DataFrame into chunks and operating on each chunk.

Answer 1

Code

    # Create boolean mask for matching rows
    # m = np.arange(len(df)) % 6 == 5  # for index match
    m = df['Samples'].str.contains(r'_BL\d+')  # for regex match

    # mask the values and backfill to propagate the row
    # values corresponding to match in backward direction
    df['var1'] = df['var1'] - df['var1'].mask(~m).bfill()

    # Delete the matching rows
    df = df[~m].copy()

         Samples  var1  var1
    0  something -90.0 -80.0
    1  something -80.0 -70.0
    2  something -60.0 -70.0
    4  something -50.0  60.0
    5  something -10.0  90.0

Note: the core logic is given in the code above, so I will leave the implementation of the function to the OP.
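For readers who want a starting point, here is one way the mask-and-backfill idea could be wrapped into a function covering the requirements in the question. This is only a sketch under assumptions: the signature and the parameter names `value_cols`, `match_col`, `pattern`, `every_n` and `drop_blanks` are hypothetical, column labels are assumed to be unique (so the demo uses `var1` and `var2` instead of a repeated `var1`), and integers are interpreted as column positions rather than labels.

```python
from typing import Optional, Sequence, Union

import numpy as np
import pandas as pd

ColumnRef = Union[int, str]  # int = column position, str = column label (assumption)


def subtract_blanks(
    df: pd.DataFrame,
    value_cols: Sequence[ColumnRef],
    match_col: ColumnRef = 0,
    pattern: Optional[str] = None,
    every_n: Optional[int] = None,
    drop_blanks: bool = True,
) -> pd.DataFrame:
    """Subtract each blank row from the rows before it, up to the previous blank."""
    if (pattern is None) == (every_n is None):
        raise ValueError("specify exactly one of `pattern` or `every_n`")

    # Build a positional boolean array marking the blank rows.
    if pattern is not None:
        key = df.iloc[:, match_col] if isinstance(match_col, int) else df[match_col]
        is_blank = key.astype(str).str.contains(pattern, regex=True).to_numpy(dtype=bool)
    else:
        # Every n-th row by position: rows n-1, 2n-1, ...
        is_blank = np.arange(len(df)) % every_n == every_n - 1

    # Resolve positions to labels so each column can be reassigned by label.
    labels = [df.columns[c] if isinstance(c, int) else c for c in value_cols]

    out = df.copy()
    for label in labels:
        col = out[label]
        # Keep values only on blank rows, then back-fill them upward so every
        # row sees the next blank below it. Rows after the last blank get NaN.
        out[label] = col - col.where(is_blank).bfill()

    return out.loc[~is_blank].copy() if drop_blanks else out


# Demo on the data from the question (second value column renamed to var2).
df = pd.DataFrame(
    {
        "Sample": ["something", "something", "something", "some_BL20_thing",
                   "something", "something", "some_BL10_thing"],
        "var1": [10, 20, 40, 100, 50, 90, 100],
        "var2": [20, 30, 30, 100, 70, 100, 10],
    }
)
result = subtract_blanks(df, value_cols=["var1", "var2"],
                         match_col="Sample", pattern=r"_BL\d+")
print(result)
```

Passing `every_n=6` instead of `pattern` selects blanks by position, mirroring the commented-out `np.arange(len(df)) % 6 == 5` mask in the answer. The loop runs over columns, not rows, so the per-row work stays vectorized.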

Solution 1
1 accepted 2022-07-04 10:18:06
