Replace data set values with values looked up by index in another data set (R)

Question

I have a data set, data, whose from and to columns hold numeric indices, and this second data set, names:

 > head(names)
           V1
1    Greenock
2     Glasgow
3     Preston
4  Birmingham
5 Southampton
6          Le

What I want is simple:

head(data)

         from            to
1    Greenock     Glasgow
2    Glasgow      Preston
3    Glasgow      17 (you got the point)
4    Preston      Birmingham
5    Birmingham   Southampton
6    Birmingham   855

I tried the old-fashioned way with a loop:

> for(i in 1:nrow(data)){
+ data$from[i] <- names$V1[data$from]
+ data$to[i] <- names$V1[data$to]
+ }

It does not work, and I know a loop is not a good way to do this anyway.

Any ideas?

Answer 1

R's factor is made for this kind of data. It keeps the data as numbers but adds human-readable levels.

I would simply convert the from and to columns to factors:

data$from <- factor(data$from)
data$to <- factor(data$to)

Then change the labels of the levels:

levels(data$from) <- names$V1
levels(data$to) <- names$V1

The code above works for me:

data <- data.frame(
 from = 1:10,
 to = seq(from=10, to=1, by=-1))

names <- data.frame(
  V1 = c('a','b','c','d','e', 'f','g','h','i','j'))

data$from <- factor(data$from)
data$to <- factor(data$to)

levels(data$from) <- names$V1
levels(data$to) <- names$V1

print(data)

The result is:

   from to
1     a  j
2     b  i
3     c  h
4     d  g
5     e  f
6     f  e
7     g  d
8     h  c
9     i  b
10    j  a
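One caveat with the two-step version above: factor(x) only creates levels for the values that actually occur in x, so if some index never appears in a column, the labels assigned afterwards are matched by position and can shift. A minimal sketch of a variant that avoids this by passing levels and labels to factor() in one call follows. The data frames routes and places are hypothetical stand-ins for data and names, not part of the original answer.

```r
# Hypothetical example data: index 3 never appears in `from` or `to`
routes <- data.frame(from = c(1, 2, 4), to = c(2, 4, 1))
places <- data.frame(V1 = c("a", "b", "c", "d"), stringsAsFactors = FALSE)

# Declare every possible index as a level and attach the labels in the
# same call, so label k always belongs to index k
all_idx <- seq_len(nrow(places))
routes$from <- factor(routes$from, levels = all_idx, labels = places$V1)
routes$to   <- factor(routes$to,   levels = all_idx, labels = places$V1)

print(routes)
```

With this form, an index that has no row in places becomes NA instead of silently receiving a wrong label.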

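This answer does assume that you have a label for every number. If that is not the case, it usually means something is wrong with the data, and you want an error to be raised. You should use stopifnot or (better) assert_that from Hadley's assertthat package to assert that max(data[,c('to','from')]) <= nrow(names) (untested).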
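The range check described above could be sketched with base R's stopifnot as follows. The helper indices_ok and the data frames routes and places are hypothetical names introduced for illustration; the tryCatch wrapper is only there so the example keeps running after the assertion fails.

```r
# Hypothetical data: `to` contains 17, which has no label
routes <- data.frame(from = c(1, 2, 2), to = c(2, 3, 17))
places <- data.frame(V1 = c("Greenock", "Glasgow", "Preston"))

# TRUE only if every index points at an existing row of `lookup`
indices_ok <- function(d, lookup) {
  idx <- unlist(d[, c("from", "to")])
  all(idx >= 1 & idx <= nrow(lookup))
}

# stopifnot() raises an error when the condition is FALSE;
# tryCatch() captures that error here so the script can continue
result <- tryCatch(
  {
    stopifnot(indices_ok(routes, places))
    "all indices have a label"
  },
  error = function(e) "some index has no label"
)
print(result)
```

In a real script you would normally call stopifnot directly and let the error stop execution.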

If you do not want to make this assumption, use @RichardScriven's answer instead.

Answer 2

Here is one way to do it, using some logical subsetting and replace().

dlg <- data <= nrow(names)
replace(data, dlg, as.character(names$V1)[unlist(data)][dlg])
#         from          to
# 1   Greenock     Glasgow
# 2    Glasgow     Preston
# 3    Glasgow          17
# 4    Preston  Birmingham
# 5 Birmingham Southampton
# 6 Birmingham         855

As a side note, data and names are both names of important base functions, so you may want to rename your data sets.
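For a self-contained run with renamed data sets, here is a minimal sketch of the same replace() approach. The names routes and places are hypothetical, and the input values are assumptions inferred from the printouts in the question (only the five names shown in full are used).

```r
# Assumed inputs, inferred from the question's printouts
routes <- data.frame(from = c(1, 2, 2, 3, 4, 4),
                     to   = c(2, 3, 17, 4, 5, 855))
places <- data.frame(V1 = c("Greenock", "Glasgow", "Preston",
                            "Birmingham", "Southampton"),
                     stringsAsFactors = FALSE)

# Logical matrix: TRUE where a cell is a valid row index of `places`
in_range <- routes <= nrow(places)

# Look up a name for every cell (NA when out of range), then keep only
# the in-range ones, in the same column-major order as `in_range`
labels <- as.character(places$V1)[unlist(routes)][in_range]

# Cells without a label keep their original number (as text)
result <- replace(routes, in_range, labels)
print(result)
```

Because names are written into the columns, both columns end up as character vectors, so the untouched indices 17 and 855 are stored as the strings "17" and "855".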
