How do I assign values to a column based on the values of another column in R?

Question

I have a data frame:

 df <- data.frame(structure(list(col1= c("A", "B", "C", "D", "A"), 
         col2= c(1, 1, 1, 1, 5), col3 = c(2L, 1L, 1L, 1L, 1L)),
         .Names = c("col1", "col2", "col3"), 
         row.names = c(NA, -5L), class = "data.frame"))

I want to add an extra column, col4, whose values are based on col2. Rows that have the same value in col2 should also have the same value in col4.

As a workaround, I generate the result in the following way.

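# Keep one row per distinct col2 value and give each a label
x <- df[!duplicated(df$col2), ]
x$col4 <- paste("newValue", seq_len(nrow(x)), sep = "_")

# Join the labels back onto the full data frame
df_new <- merge(x, df, by = "col2")

# Keep the label and the original (".y") columns
df_new <- df_new[, c("col2", "col4", "col1.y", "col3.y")]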

This works, but I think there must be a better way to do it. Thanks!

Answer 1

You can try dense_rank() from dplyr:

library(dplyr)
df %>% 
    mutate(col4 = dense_rank(col2),
           col4_new = paste0("newValue_", col4))

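This gives something very close to your desired output, though I'm not sure exactly what you want. dense_rank() assigns equal col2 values the same rank regardless of row order; if you also want those rows grouped together, arrange df first and then use dense_rank: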

df %>% 
    arrange(col2) %>% 
    mutate(col4 = dense_rank(col2),
           col4_new = paste0("newValue_", col4))

This works for a data.frame of any size.
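
One detail that may matter is how the ids are numbered: dense_rank() numbers groups by the sorted value of col2, while the merge workaround in the question numbers them by order of first appearance. The sketch below uses a small made-up data frame (`df2` and the column names `by_value` and `by_appearance` are hypothetical, not from the question) in which the two orderings differ.

```r
library(dplyr)

# Hypothetical data: the larger col2 value appears first
df2 <- data.frame(col1 = c("A", "B", "C", "D"),
                  col2 = c(5, 1, 5, 1))

res <- df2 %>%
  mutate(by_value      = dense_rank(col2),           # 1 = smallest col2 value
         by_appearance = match(col2, unique(col2)))  # 1 = first col2 value seen

res$by_value       # 2 1 2 1
res$by_appearance  # 1 2 1 2
```

If the labels only need to be consistent within each group, either numbering should do; if they have to match the question's output exactly, the order-of-appearance version is probably the closer fit.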

Answer 2

This may help:

df$col4 <- paste0("newValue_", cumsum(!duplicated(df$col2)))
df$col4
#[1] "newValue_1" "newValue_1" "newValue_1" "newValue_1" "newValue_2"

Or we can use match:

with(df, paste0("newValue_", match(col2, unique(col2))))
#[1] "newValue_1" "newValue_1" "newValue_1" "newValue_1" "newValue_2"

Or it can be done with factor:

with(df, paste0("newValue_", as.integer(factor(col2, levels = unique(col2)))))
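
The three one-liners agree on the example data, where equal col2 values sit next to each other. As a rough check, the sketch below runs them on a made-up vector (`v` and the `id_*` names are hypothetical) in which a value reappears after a different one; in that case cumsum(!duplicated()) appears to give a different id to the repeated value, while match and factor do not.

```r
# Hypothetical vector: the value 1 reappears after a different value
v <- c(1, 5, 1, 7)

id_cumsum <- cumsum(!duplicated(v))                     # count of distinct values seen so far
id_match  <- match(v, unique(v))                        # position of each value among the unique values
id_factor <- as.integer(factor(v, levels = unique(v)))  # same idea via factor levels

id_cumsum  # 1 2 2 3
id_match   # 1 2 1 3
id_factor  # 1 2 1 3
```

So when equal values are not guaranteed to be adjacent, match or factor looks like the safer choice, or the data can be sorted by col2 first.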
