
Calculate the most frequently repeated difference between rows for each group in a data frame in R

Given a data frame like the following:

Name No  Diff   Most repeated Diff
A   24      
A   35      
A   39      
A   41      
A   42      
A   43      
B   32      
B   35      
B   36      
B   37      
C   34      
C   40      
C   42      
D   34      
D   39      
D   44      
E   35      
E   36      

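How can I fill in the last column with the most frequently repeated difference between consecutive rows? (That is, within each Name I want to compute the row-to-row differences of No and then see which difference occurs most often; in this case A would be 1, because two of its differences equal 1.)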

Thanks in advance.

We can use diff to calculate the differences within each group and table to count how often each difference occurs:

library(dplyr)

df %>%
  group_by(Name) %>%
  mutate(diff = c(NA, diff(No)), 
         #Can also use lag to get difference with previous value
         #diff = No - lag(No),
         most_repeated_diff = names(which.max(table(diff))))

#   Name     No  diff most_repeated_diff
#   <fct> <int> <int> <chr>             
# 1 A        24    NA 1                 
# 2 A        35    11 1                 
# 3 A        39     4 1                 
# 4 A        41     2 1                 
# 5 A        42     1 1                 
# 6 A        43     1 1                 
# 7 B        32    NA 1                 
# 8 B        35     3 1                 
# 9 B        36     1 1                 
#10 B        37     1 1                 
#11 C        34    NA 2                 
#12 C        40     6 2                 
#13 C        42     2 2                 
#14 D        34    NA 5                 
#15 D        39     5 5                 
#16 D        44     5 5                 
#17 E        35    NA 1                 
#18 E        36     1 1      
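
To see what happens inside a single group, the same steps can be traced by hand for group A. This is only an illustrative sketch in base R; the variable names (no_a, d, counts, top, top_num) are made up for the example and do not appear in the answer above.

```r
# Values of No for group A, copied from the question
no_a <- c(24, 35, 39, 41, 42, 43)

# Differences between consecutive rows: 11 4 2 1 1
d <- diff(no_a)

# Frequency of each distinct difference
counts <- table(d)

# Label of the most frequent difference; names() returns it as a string
top <- names(which.max(counts))

# Convert back to a number if a numeric column is preferred
top_num <- as.numeric(top)
```

Because names() returns a string, most_repeated_diff shows up as <chr> in the output above. Note also that which.max returns the first maximum and table orders its entries by value, so a tie goes to the smallest difference: group C has the differences 6 and 2 once each, and the result is 2.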

Data

df <- structure(list(Name = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 
2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L), .Label = c("A", 
"B", "C", "D", "E"), class = "factor"), No = c(24L, 35L, 39L, 
41L, 42L, 43L, 32L, 35L, 36L, 37L, 34L, 40L, 42L, 34L, 39L, 44L, 
35L, 36L)), class = "data.frame", row.names = c(NA, -18L))
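
If a numeric column is wanted, or if some groups might contain only one row, one possible variant is to wrap the logic in a small helper. The sketch below uses base R only; most_common_diff is a hypothetical name introduced for illustration, and the data frame is rebuilt from the same values so that the snippet runs on its own.

```r
# Hypothetical helper, not part of the original answer:
# returns the most frequent consecutive difference as a number.
most_common_diff <- function(x) {
  d <- diff(x)
  # A group with fewer than two rows has no differences at all
  if (length(d) == 0) return(NA_real_)
  counts <- table(d)
  as.numeric(names(counts)[which.max(counts)])
}

# Same values as the data above, rebuilt so the snippet is self-contained
df <- data.frame(
  Name = rep(c("A", "B", "C", "D", "E"), times = c(6, 4, 3, 3, 2)),
  No = c(24, 35, 39, 41, 42, 43, 32, 35, 36, 37, 34, 40, 42, 34, 39, 44, 35, 36)
)

# ave() applies the helper per group and recycles the single result
# across all rows of that group
df$most_repeated_diff <- ave(df$No, df$Name, FUN = most_common_diff)
```

The same helper should also work inside the dplyr pipeline, for example as mutate(most_repeated_diff = most_common_diff(No)) after group_by(Name).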

