
Recode a column of numerical values to a new column of text values in R

In R, I have a data frame with a column of tree species code numbers, and I want to add a new column holding the recoded text name of each species, as shown below. I can create a lookup of tree name = code number, but how do I apply it to a long, mixed column of purely numerical values?

> treeco <- c(4, 3, 4, 5, 3, 2, 2, 1, 4)
> spcode <- c("oak" = 1, "ash" = 2, "elm" = 3, "beech" = 4, "hazel" = 5)
> treesp <- data.frame(species = spcode)
> treesp
      species
oak         1
ash         2
elm         3
beech       4
hazel       5

This is the result I am looking for:

  treeco spcode
1      4  beech
2      3    elm
3      4  beech
4      5  hazel
5      3    elm
6      2    ash
7      2    ash
8      1    oak
9      4  beech

base R

data.frame(treeco, answer = names(spcode)[treeco])
#   treeco answer
# 1      4  beech
# 2      3    elm
# 3      4  beech
# 4      5  hazel
# 5      3    elm
# 6      2    ash
# 7      2    ash
# 8      1    oak
# 9      4  beech
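
Note that names(spcode)[treeco] indexes by position, so it relies on the codes being exactly 1, 2, 3, ... in the same order as the names. As a minimal sketch of a lookup by value instead, assuming a hypothetical lookup vector spcode2 whose codes are neither consecutive nor sorted (spcode2 and treeco2 are made up for illustration and are not part of the question), match() or factor() could be used:

```r
treeco <- c(4, 3, 4, 5, 3, 2, 2, 1, 4)

# Hypothetical variant of the lookup: codes are not 1..n and not in order
spcode2 <- c("oak" = 10, "hazel" = 50, "ash" = 20, "beech" = 40, "elm" = 30)
treeco2 <- treeco * 10

# match() returns, for each code, the position of that code in spcode2;
# that position is then used to pick the corresponding name
by_match <- names(spcode2)[match(treeco2, spcode2)]

# factor() maps the same codes to labels, but stores the result as a factor
by_factor <- factor(treeco2, levels = spcode2, labels = names(spcode2))

data.frame(treeco2, by_match, by_factor)
```

Codes that do not appear in the lookup come back as NA with either approach, which makes missing entries easy to spot.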

dplyr

It can be slightly confusing when a column name matches the name of an object in the environment, so for the sake of demonstration I'll rename treeco to tc in the tibble so that it is clear which one is being used.

library(dplyr)
tibble(tc = treeco) %>%
  mutate(answer = names(spcode)[tc])
# # A tibble: 9 x 2
#      tc answer
#   <dbl> <chr> 
# 1     4 beech 
# 2     3 elm   
# 3     4 beech 
# 4     5 hazel 
# 5     3 elm   
# 6     2 ash   
# 7     2 ash   
# 8     1 oak   
# 9     4 beech 
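
If renaming the column is not convenient, one option is dplyr's .data pronoun, which refers explicitly to columns of the data frame being mutated. The sketch below keeps the column name treeco (the result name res is made up for illustration) and assumes dplyr is installed:

```r
library(dplyr)

treeco <- c(4, 3, 4, 5, 3, 2, 2, 1, 4)
spcode <- c("oak" = 1, "ash" = 2, "elm" = 3, "beech" = 4, "hazel" = 5)

# The column has the same name as the vector in the environment.
res <- tibble(treeco = treeco) %>%
  # .data$treeco is explicitly the column; spcode is not a column,
  # so it is looked up in the calling environment
  mutate(answer = names(spcode)[.data$treeco])
res
```

Inside mutate(), a bare treeco would also resolve to the column first, so the pronoun mainly serves to make the intent visible to the reader.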

There's another method that lets you bring in more than one extra column at a time: the join/merge.

treecodes <- data.frame(code = spcode, tree = names(spcode))
set.seed(42)
treecodes$rand <- sample(100, size = nrow(treecodes), replace = TRUE)
treecodes
#       code  tree rand
# oak      1   oak   49
# ash      2   ash   65
# elm      3   elm   25
# beech    4 beech   74
# hazel    5 hazel  100
trees <- data.frame(code = treeco)
trees
#   code
# 1    4
# 2    3
# 3    4
# 4    5
# 5    3
# 6    2
# 7    2
# 8    1
# 9    4
trees %>%
  left_join(treecodes, by = "code")
#   code  tree rand
# 1    4 beech   74
# 2    3   elm   25
# 3    4 beech   74
# 4    5 hazel  100
# 5    3   elm   25
# 6    2   ash   65
# 7    2   ash   65
# 8    1   oak   49
# 9    4 beech   74
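
For readers without dplyr, a rough base R counterpart might look like the sketch below. It assumes the same treeco and spcode as in the question; the helper column row_id and the names merged and lookup_tree are made up for illustration. By default merge() sorts its result by the key, so the original row order is recorded first and restored afterwards.

```r
treeco <- c(4, 3, 4, 5, 3, 2, 2, 1, 4)
spcode <- c("oak" = 1, "ash" = 2, "elm" = 3, "beech" = 4, "hazel" = 5)

treecodes <- data.frame(code = spcode, tree = names(spcode),
                        row.names = NULL, stringsAsFactors = FALSE)
trees <- data.frame(code = treeco)

# Record the original row order, because merge() sorts by the key
trees$row_id <- seq_len(nrow(trees))

# all.x = TRUE keeps every row of trees, like a left join
merged <- merge(trees, treecodes, by = "code", all.x = TRUE)
merged <- merged[order(merged$row_id), c("code", "tree")]
rownames(merged) <- NULL
merged

# Alternative that never reorders rows: look up the matching row directly
lookup_tree <- treecodes$tree[match(trees$code, treecodes$code)]
```

The match() form generalises to several columns by indexing whole rows of treecodes with the same match() result.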

For more information on joins/merges, see "How to join (merge) data frames (inner, outer, left, right)" and "What's the difference between INNER JOIN, LEFT JOIN, RIGHT JOIN and FULL JOIN?".
