R: Recode a variable for all observations that do not occur more than once

I have a simple data frame that looks like this:

Observation X1 X2 Group
1           2   4   1
2           6   3   2
3           8   4   2
4           1   3   3
5           2   8   4
6           7   5   5
7           2   4   5

How can I recode the Group variable so that every group value that occurs only once is recoded as "Unaffiliated"?

The desired output looks like this:

Observation X1 X2 Group
1           2   4   Unaffiliated
2           6   3   2
3           8   4   2
4           1   3   Unaffiliated
5           2   8   Unaffiliated
6           7   5   5
7           2   4   5

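We can use duplicated to build a logical vector that is TRUE for values occurring only once, then assign "Unaffiliated" to Group for those rows. The value has to be checked in both directions (the default scan and fromLast = TRUE), because a single call to duplicated never flags the first occurrence of a repeated value.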

df1$Group[with(df1, !(duplicated(Group)|duplicated(Group, 
     fromLast = TRUE)))] <- "Unaffiliated"

Output:

> df1
  Observation X1 X2        Group
1           1  2  4 Unaffiliated
2           2  6  3            2
3           3  8  4            2
4           4  1  3 Unaffiliated
5           5  2  8 Unaffiliated
6           6  7  5            5
7           7  2  4            5
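
To make the mask in the one-liner above easier to follow, here is a minimal sketch that builds it step by step on a plain vector. The names grp, dup_forward, dup_backward, is_single and recoded are illustrative only and do not come from the answer; the values are the Group column from the question.

```r
# Group values from the question's data
grp <- c(1L, 2L, 2L, 3L, 4L, 5L, 5L)

# TRUE for every occurrence after the first one
dup_forward <- duplicated(grp)

# TRUE for every occurrence before the last one
dup_backward <- duplicated(grp, fromLast = TRUE)

# A value occurs exactly once only if neither scan flags it
is_single <- !(dup_forward | dup_backward)

# Work on a character copy so the integer vector is left untouched
recoded <- as.character(grp)
recoded[is_single] <- "Unaffiliated"

print(recoded)
```

Converting to character explicitly is a design choice of this sketch; the one-liner instead lets the assignment coerce the integer column to character.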

Data

df1 <- structure(list(Observation = 1:7, X1 = c(2L, 6L, 8L, 1L, 2L, 
7L, 2L), X2 = c(4L, 3L, 4L, 3L, 8L, 5L, 4L), Group = c(1L, 2L, 
2L, 3L, 4L, 5L, 5L)), class = "data.frame", row.names = c(NA, 
-7L))

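unaffil takes a vector of group numbers and returns "Unaffiliated" if the vector has exactly one element; otherwise it returns its input unchanged. We can then apply it group by group with ave. This does not overwrite the input data frame. No packages are used, but if you are using dplyr, transform can be replaced with mutate.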

unaffil <- function(x) if (length(x) == 1) "Unaffiliated" else x
transform(dat, Group = ave(Group, Group, FUN = unaffil))

giving

  Observation X1 X2        Group
1           1  2  4 Unaffiliated
2           2  6  3            2
3           3  8  4            2
4           4  1  3 Unaffiliated
5           5  2  8 Unaffiliated
6           6  7  5            5
7           7  2  4            5

Note

dat <- structure(list(Observation = 1:7, X1 = c(2L, 6L, 8L, 1L, 2L, 
7L, 2L), X2 = c(4L, 3L, 4L, 3L, 8L, 5L, 4L), Group = c(1L, 2L, 
2L, 3L, 4L, 5L, 5L)), class = "data.frame", row.names = c(NA, 
-7L))
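
As a related sketch (not the answer's own code), the same by-group idea can be written so that ave only computes group sizes and the recoding happens in a separate, explicitly typed step. The names group_size and result are assumptions introduced here for illustration; dat is rebuilt with data.frame so the block runs on its own.

```r
# Same data as in the Note above, rebuilt so this block is self-contained
dat <- data.frame(
  Observation = 1:7,
  X1 = c(2L, 6L, 8L, 1L, 2L, 7L, 2L),
  X2 = c(4L, 3L, 4L, 3L, 8L, 5L, 4L),
  Group = c(1L, 2L, 2L, 3L, 4L, 5L, 5L)
)

# Number of rows in each row's group; FUN returns an integer for every group
group_size <- ave(seq_along(dat$Group), dat$Group, FUN = length)

# transform() returns a modified copy, so dat itself is not changed
result <- transform(
  dat,
  Group = ifelse(group_size == 1, "Unaffiliated", as.character(Group))
)

print(result)
```

Keeping the count separate from the recoding means FUN returns the same type for every group, which some readers may find easier to reason about than mixing character and integer results inside ave.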

Another approach is to group by Group first, then check the maximum row number within each group and recode with ifelse:

library(dplyr)

df1 %>% 
  group_by(Group) %>% 
  mutate(Group = ifelse(max(row_number()) == 1, "Unaffiliated", as.character(Group))) %>% 
  ungroup()

Output:

  Observation    X1    X2 Group       
        <int> <int> <int> <chr>       
1           1     2     4 Unaffiliated
2           2     6     3 2           
3           3     8     4 2           
4           4     1     3 Unaffiliated
5           5     2     8 Unaffiliated
6           6     7     5 5           
7           7     2     4 5    
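
A possible variation on the grouped approach above, offered as a sketch rather than as the answer's method: count the rows per Group with add_count and recode without grouping the data at all. The helper column name n_in_group and the object name out are hypothetical, and the block assumes the dplyr package is installed.

```r
library(dplyr)

# Data from the question, rebuilt so this block is self-contained
df1 <- data.frame(
  Observation = 1:7,
  X1 = c(2L, 6L, 8L, 1L, 2L, 7L, 2L),
  X2 = c(4L, 3L, 4L, 3L, 8L, 5L, 4L),
  Group = c(1L, 2L, 2L, 3L, 4L, 5L, 5L)
)

out <- df1 %>%
  # Add a helper column holding the number of rows that share each Group value
  add_count(Group, name = "n_in_group") %>%
  # Recode singletons; both branches are character, as if_else requires
  mutate(Group = if_else(n_in_group == 1, "Unaffiliated", as.character(Group))) %>%
  # Drop the helper column again
  select(-n_in_group)

print(out)
```

Because the data is never grouped, this version avoids modifying a grouping column inside mutate and needs no ungroup step.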

