
How can I do column-wise counts and change values when the frequency is less than 3?

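I have a dataframe with many rows, and some of its values occur only rarely. I need to count values column by column and then replace any value whose frequency within its column is less than 3.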

DF - Input

Col1     Col2     Col3       Col4
 1        apple    tomato     apple
 1        apple    potato     nan
 1        apple    tomato     banana
 1        apple    tomato     banana
 1        apple    tomato     banana
 1        apple    tomato     banana
 1        grape    tomato     banana
 1        pear     tomato     banana
 1        lemon    tomato     burger

DF - Output

Col1     Col2     Col3       Col4
 1        apple    tomato     Other
 1        apple    Other      nan
 1        apple    tomato     banana
 1        apple    tomato     banana
 1        apple    tomato     banana
 1        apple    tomato     banana
 1        Other    tomato     banana
 1        Other    tomato     banana
 1        Other    tomato     Other

Use where together with a per-column count of each value (the output below shows Col1 as the index):

df.where(df.apply(lambda x: x.groupby(x).transform('count')>2), 'Other')

Output:

       Col2    Col3    Col4
Col1                       
1     apple  tomato   Other
1     apple   Other  banana
1     apple  tomato  banana
1     apple  tomato  banana
1     apple  tomato  banana
1     apple  tomato  banana
1     Other  tomato  banana
1     Other  tomato  banana
1     Other  tomato   Other
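
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the one-liner above. The frame is rebuilt from the question's sample data. Two things are assumptions made for this sketch: Col1 is kept as an ordinary column rather than the index, and the missing value in Col4 is filled with "banana" to match the output just shown. The names counts and result are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question. Assumption: "banana" stands in for
# the missing Col4 value, so this frame contains no NaN.
df = pd.DataFrame({
    "Col1": [1] * 9,
    "Col2": ["apple"] * 6 + ["grape", "pear", "lemon"],
    "Col3": ["tomato", "potato"] + ["tomato"] * 7,
    "Col4": ["apple", "banana"] + ["banana"] * 6 + ["burger"],
})

# For every cell, count how often its value occurs within its own column.
counts = df.apply(lambda col: col.groupby(col).transform("count"))

# Keep cells whose value occurs at least 3 times; replace the rest.
result = df.where(counts > 2, "Other")
print(result)
```

Splitting the expression into counts and a mask makes it easier to inspect the intermediate frequencies before anything is replaced.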

Update: to handle NaN in the original dataframe:

d = df.apply(lambda x: x.groupby(x).transform('count'))
df.where(d.gt(2.0).where(d.notnull()).astype(bool), 'Other')

Output:

       Col2    Col3    Col4
Col1                       
1     apple  tomato   Other
1     apple   Other     NaN
1     apple  tomato  banana
1     apple  tomato  banana
1     apple  tomato  banana
1     apple  tomato  banana
1     Other  tomato  banana
1     Other  tomato  banana
1     Other  tomato   Other
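
The same NaN-preserving result can probably be reached with a more explicit mask. The sketch below is an alternative to the update above, not the answer's own code: it swaps in value_counts plus map for the per-cell counts and keeps missing cells with isna. The helper name replace_rare and its parameters min_count and fill are hypothetical, and Col1 is again left as a regular column.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question, including the missing Col4 value.
df = pd.DataFrame({
    "Col1": [1] * 9,
    "Col2": ["apple"] * 6 + ["grape", "pear", "lemon"],
    "Col3": ["tomato", "potato"] + ["tomato"] * 7,
    "Col4": ["apple", np.nan] + ["banana"] * 6 + ["burger"],
})


def replace_rare(frame, min_count=3, fill="Other"):
    """Hypothetical helper: replace values seen fewer than min_count times per column."""
    # value_counts ignores NaN, so missing cells map to NaN counts.
    counts = frame.apply(lambda col: col.map(col.value_counts()))
    # Keep frequent values, and also keep cells that were missing to begin with.
    keep = counts.ge(min_count) | frame.isna()
    return frame.where(keep, fill)


result = replace_rare(df)
print(result)
```

Writing the mask as "frequent enough or missing" avoids relying on how NaN converts to bool, which is what the astype(bool) step in the update depends on.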

