Replacing numbers with characters in a text column in R

I would like to replace some numbers in the text column of my data. The numbers have either 8 or 9 digits and come in two formats: separated by hyphens or separated by spaces. This is a snapshot of the data:

df <- data.frame(
  notes = c(
    'my number is 123-41-567',
    "321 12 788 is valid",
    'why not taking 987-012-678',
    '120 967 325 is correct'
  )
)

library(dplyr)  # provides %>% and select()

df %>% select(notes)

                       notes
1    my number is 123-41-567
2        321 12 788 is valid
3 why not taking 987-012-678
4     120 967 325 is correct

I need to replace them all with a term such as aaaaa. Hence, the data should look like:

                 notes
1   my number is aaaaa
2       aaaaa is valid
3 why not taking aaaaa
4     aaaaa is correct

Thank you in advance!

Assuming the examples really do cover all possible cases (I would be careful about that), you can do this with the following regular expression:

\\d{3}( |-)\\d{2,3}( |-)\\d{3}
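To see what this pattern accepts and rejects, here is a small base R check. It is only a sketch: the first two strings come from the question, while the last two are made-up inputs added for illustration.

```r
# Pattern from the answer: 3 digits, a space or hyphen, 2-3 digits,
# a space or hyphen, then 3 digits.
pattern <- "\\d{3}( |-)\\d{2,3}( |-)\\d{3}"

samples <- c(
  "my number is 123-41-567",  # from the question (8 digits)
  "120 967 325 is correct",   # from the question (9 digits)
  "call 12-34-567",           # made-up: first group has only 2 digits
  "id 123-4567"               # made-up: only one separator
)

# TRUE where the pattern is found somewhere in the string
matched <- grepl(pattern, samples)
print(matched)
```

The two strings from the question match, and the two made-up ones do not.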

Here's the code for replacing:这是替换代码:

library(dplyr)
library(stringr)

df %>% 
    mutate(
        notes = str_replace_all(notes, '\\d{3}( |-)\\d{2,3}( |-)\\d{3}', 'XXXXXX')
    )

                  notes
1   my number is XXXXXX
2       XXXXXX is valid
3 why not taking XXXXXX
4     XXXXXX is correct
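The caution above matters because the pattern is unanchored: it can match inside a longer run of digits, and it accepts a hyphen followed by a space. If your real data can contain such cases, a stricter variant might look like the sketch below. The word boundaries and the backreference are my own additions rather than part of the answer, the last two test strings are made up, and base R's gsub() with perl = TRUE is used here only to keep the example free of package dependencies.

```r
# Stricter variant (an assumption about what the data needs):
#   \\b     the number must not be glued to other digits or letters
#   ([ -])  capture the first separator
#   \\1     require the same separator the second time
strict <- "\\b\\d{3}([ -])\\d{2,3}\\1\\d{3}\\b"

notes <- c(
  "my number is 123-41-567",      # from the question
  "321 12 788 is valid",          # from the question
  "order 9123-456-7890 shipped",  # made-up: part of a longer number
  "mixed 123-45 678"              # made-up: mixed separators
)

# Replace every strict match with the placeholder from the question
masked <- gsub(strict, "aaaaa", notes, perl = TRUE)
print(masked)
```

The first two notes are masked, and the two made-up strings are left unchanged. The same pattern should also work as the second argument of str_replace_all(), since stringr's regex engine supports word boundaries and backreferences.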
