
Rewriting column cell values in a DataFrame based on when the value changes, without using an if statement

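I have a column with wrong values. It is supposed to count cycles, but the device the data comes from resets its count after 50, so I am left with, for example, [1,1,1,1,2,2,2,3,3,3,3,...,50,50,50,1,1,1,2,2,2,2,3,3,3,...,50,50,...,50]. My attempt, which I can't even get to work, is below (for simplicity, the data here resets after 10 cycles):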

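import pandas as pd

# Asker's attempt, kept as posted: it runs but does not give the desired output.
data = {'Cyc-Count':[1,1,2,2,2,3,4,5,6,7,7,7,8,9,10,1,1,1,2,3,3,3,3,
               4,4,5,6,6,6,7,8,8,8,8,9,10]}
df = pd.DataFrame(data)
x=0
count=0
old_value=df.at[x,'Cyc-Count']
for x in range(x,len(df)-1):
    # Problem: row x was already overwritten on the previous iteration,
    # so this compares the new count with the original value of row x+1.
    if df.at[x,'Cyc-Count']==df.at[x+1,'Cyc-Count']:
        old_value=df.at[x+1,'Cyc-Count']
        df.at[x+1,'Cyc-Count']=count
    else:
        old_value=df.at[x+1,'Cyc-Count']
        count+=1
        df.at[x+1,'Cyc-Count']=count
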
I need to solve this, preferably without using an if statement at all. The desired output for the example above is:

data = {'Cyc-Count':[1,1,2,2,2,3,4,5,6,7,7,7,8,9,10,11,11,11,12,13,13,13,13,
               14,14,15,16,16,16,17,18,18,18,18,19,20]}

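Hint: one big problem with my approach is that the last index value is hard to change, because when it is compared with its index + 1, that index does not even exist.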

IIUC, you want to keep counting whenever the counter decreases.

You can use vectorized code:

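s = df['Cyc-Count'].shift()  # previous row's value (NaN on the first row)
df['Cyc-Count2'] = (df['Cyc-Count']
                   + s.where(s.gt(df['Cyc-Count']))  # keep the previous value only where the counter dropped
                      .fillna(0).astype(int)         # replaces the deprecated fillna(0, downcast='infer')
                      .cumsum()
                   )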

Or, to modify the column in place:

s = df['Cyc-Count'].shift()  # previous row's value (NaN on the first row)
df['Cyc-Count'] +=  (s.where(s.gt(df['Cyc-Count']))
                      .fillna(0).astype(int).cumsum()  # replaces the deprecated fillna(0, downcast='infer')
                     )

output:

    Cyc-Count  Cyc-Count2
0           1           1
1           1           1
2           1           1
3           1           1
4           2           2
5           2           2
6           2           2
7           3           3
8           3           3
9           3           3
10          3           3
11          4           4
12          5           5
13          5           5
14          5           5
15          1           6
16          1           6
17          1           6
18          2           7
19          2           7
20          2           7
21          2           7
22          3           8
23          3           8
24          3           8
25          4           9
26          5          10
27          5          10
28          1          11
29          2          12
30          2          12
31          3          13
32          4          14
33          5          15
34          5          15

Input used:

l = [1,1,1,1,2,2,2,3,3,3,3,4,5,5,5,1,1,1,2,2,2,2,3,3,3,4,5,5,1,2,2,3,4,5,5]
df = pd.DataFrame({'Cyc-Count': l})
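
To make the shift/where/cumsum idea easier to follow, here is a minimal sketch that splits it into named steps and runs it on the simplified data from the question (reset after 10). The intermediate names `prev`, `is_reset`, `offset` and `fixed` are only illustrative and are not part of the answer above.

import pandas as pd

# Simplified data from the question: the device counter resets after 10.
counts = [1,1,2,2,2,3,4,5,6,7,7,7,8,9,10,1,1,1,2,3,3,3,3,
          4,4,5,6,6,6,7,8,8,8,8,9,10]
df = pd.DataFrame({'Cyc-Count': counts})

# Step 1: previous row's value. The first row has no predecessor, so it is NaN.
prev = df['Cyc-Count'].shift()

# Step 2: True only where the counter dropped, i.e. where the device reset.
# Comparisons against NaN are False, so the first row is never flagged.
is_reset = prev.gt(df['Cyc-Count'])

# Step 3: at each reset keep the last value seen before it, 0 elsewhere,
# then accumulate so that every later row carries the running offset.
offset = prev.where(is_reset, 0).cumsum().astype(int)

# Step 4: add the offset back onto the raw counter.
fixed = df['Cyc-Count'] + offset
print(fixed.tolist())

Because each row is compared with the row before it rather than the row after it, the last row needs no special handling, which sidesteps the index + 1 problem mentioned in the question.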

You can use df.loc to access a group of rows and columns by label or by a boolean array.

Syntax: df.loc[df['column name'] condition, 'column name or the new one'] = 'value if condition is met'

For example:

import pandas as pd

numbers = {'set_of_numbers': [1,2,3,4,5,6,7,8,9,10,0,0]}
df = pd.DataFrame(numbers,columns=['set_of_numbers'])
print (df)

df.loc[df['set_of_numbers'] == 0, 'set_of_numbers'] = 999
df.loc[df['set_of_numbers'] == 5, 'set_of_numbers'] = 555

print (df)

Before: 'set_of_numbers': [1,2,3,4,5,6,7,8,9,10,0,0]

After: 'set_of_numbers': [1,2,3,4,555,6,7,8,9,10,999,999]
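
As a rough sketch of how the same df.loc pattern might be applied to the cycle counter from the question: the code below assumes the reset period is known and fixed (10 in the simplified data), which the question implies but this answer does not state. The names `PERIOD`, `reset_rows`, `offset` and `Cyc-Fixed` are hypothetical.

import pandas as pd

# Simplified data from the question: the device counter resets after 10.
counts = [1,1,2,2,2,3,4,5,6,7,7,7,8,9,10,1,1,1,2,3,3,3,3,
          4,4,5,6,6,6,7,8,8,8,8,9,10]
df = pd.DataFrame({'Cyc-Count': counts})

PERIOD = 10  # assumption: the counter always wraps after this many cycles

# Boolean mask: True on rows where the counter is lower than on the previous row.
# diff() is NaN on the first row, and NaN < 0 is False.
reset_rows = df['Cyc-Count'].diff() < 0

# Start with no offset, then use df.loc with the mask to mark each reset row.
df['offset'] = 0
df.loc[reset_rows, 'offset'] = PERIOD

# The running sum of the marks gives the amount to add back on every row.
df['Cyc-Fixed'] = df['Cyc-Count'] + df['offset'].cumsum()
print(df['Cyc-Fixed'].tolist())

If the period is not constant, the value before each reset would have to be used instead of a fixed `PERIOD`.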

