Rename all the values within a column in R

I have a dataset, df, with the following values:

ID                                        Duration   
abcdefghijklmnopqrstuvwxyz                1 sec
abcdefghijklmnopqrstuvwxyz1               0 sec
abcdefghijklmnopqrstuvwxyz2               0 sec                    
abcdefghijklmnopqrstuvwxyz3               1 sec
abcdefghijklmnopqrstuvwxyz4               0 sec

Goal: I am plotting a histogram, and the ID values are too long to display. I would like to convert the values in the ID column to shorter ones, such as:

ID                                        Duration   
A                                         1 sec
B                                         0 sec
C                                         0 sec                    
D                                         1 sec
E                                         0 sec

To do this, would I have to specify and write out every value in the column? (There are hundreds of them.)

rename.values(df, abcdefghijklmnopqrstuvwxyz="A")...

Without using dplyr, if you want to rename all values in your ID column to shorter IDs (and assuming that all of your IDs are different), you can write:

df$ID <- paste0("A", seq_len(nrow(df)))
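If some IDs occur more than once, numbering the rows would give each occurrence a different label. A minimal sketch of one way around that, assuming made-up data in which the first ID is repeated, is to number the distinct IDs with match() and unique():

```r
# Hypothetical data: the first long ID appears twice.
df <- data.frame(
  ID = c("abcdefghijklmnopqrstuvwxyz",
         "abcdefghijklmnopqrstuvwxyz1",
         "abcdefghijklmnopqrstuvwxyz"),
  Duration = c("1 sec", "0 sec", "1 sec"),
  stringsAsFactors = FALSE
)

# match() gives, for each ID, its position among the distinct IDs,
# so identical long IDs receive the same short label.
df$ID <- paste0("A", match(df$ID, unique(df$ID)))

print(df$ID)
```

Labels are assigned in order of first appearance, so the result depends on the row order of df.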

Alternative: Using gsub

Alternatively, if you have a very long pattern that you wish to replace (such as abcdef...), you can use gsub:

df$ID <- gsub("abcdefghijklmnopqrstuvwxyz","A",df$ID)

The advantage of gsub is that if an ID is repeated multiple times, the repetition is preserved: only the matched pattern is replaced, and the rest of the ID string (here, the trailing number) is kept.
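A small sketch of that behaviour, using a made-up vector in which one ID is repeated. The fixed = TRUE argument is an addition here, not part of the answer's code; it makes gsub treat the pattern as a literal string rather than a regular expression:

```r
# Hypothetical IDs: the first and third are identical.
ids <- c("abcdefghijklmnopqrstuvwxyz1",
         "abcdefghijklmnopqrstuvwxyz2",
         "abcdefghijklmnopqrstuvwxyz1")

# Replace the long literal prefix with "A"; the trailing digits are untouched,
# so repeated IDs map to the same short ID.
short <- gsub("abcdefghijklmnopqrstuvwxyz", "A", ids, fixed = TRUE)

print(short)  # "A1" "A2" "A1"
```

This only shortens IDs that actually contain the pattern; IDs without it are returned unchanged.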

Example

a <- paste0(letters[1:26], collapse = "")
df <- data.frame(ID = paste0(a,1:100),
                value = rnorm(100))

So, your df looks like:

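                           ID      value
1 abcdefghijklmnopqrstuvwxyz1  2.6977546
2 abcdefghijklmnopqrstuvwxyz2  1.9434639
3 abcdefghijklmnopqrstuvwxyz3  0.4191808
4 abcdefghijklmnopqrstuvwxyz4 -0.1545246
5 abcdefghijklmnopqrstuvwxyz5  2.0112518
6 abcdefghijklmnopqrstuvwxyz6  0.5877203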
...

Now, you can replace the character strings in ID with the following command:

df$ID <- paste0("A",1:100)

or with gsub:

df$ID <- gsub("abcdefghijklmnopqrstuvwxyz","A",df$ID)

Either way, you get:

  ID      value
1 A1  2.6977546
2 A2  1.9434639
3 A3  0.4191808
4 A4 -0.1545246
5 A5  2.0112518
6 A6  0.5877203
...

So all of your columns and values are preserved in the same order; only the ID column is modified.

You could simply create a new ID column, which would solve your issue and also preserve your original IDs (this assumes no duplicate IDs).

library(dplyr)

df <- df %>%
   mutate(ID2 = row_number()) %>%   # short numeric ID, one per row
   select(ID2, Duration)            # or select(-ID): drops ID, keeps everything else
                                    # (skip this step to keep the original ID next to ID2)
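If the original IDs need to be recoverable after plotting, one option is to keep a small lookup table next to the data. The following base R sketch uses made-up data, and the name id_key is hypothetical:

```r
# Hypothetical data with long IDs.
df <- data.frame(
  ID = c("abcdefghijklmnopqrstuvwxyz", "abcdefghijklmnopqrstuvwxyz1"),
  Duration = c("1 sec", "0 sec"),
  stringsAsFactors = FALSE
)

# Lookup table: one row per distinct original ID, with its short numeric label.
id_key <- data.frame(ID = unique(df$ID), stringsAsFactors = FALSE)
id_key$ID2 <- seq_len(nrow(id_key))

# Attach the short label to the data by matching on the original ID.
df$ID2 <- id_key$ID2[match(df$ID, id_key$ID)]

# Later, translate short labels back to the original IDs.
original <- id_key$ID[match(df$ID2, id_key$ID2)]
```

Because the key is built from unique(df$ID), this variant also works when IDs are duplicated.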
