
Finding and converting all numbers into their corresponding names in R

I have a single-column data frame where each row is a statement. The statements are mostly alphabetic characters, but a few contain numbers. I am trying to locate every number and replace it with the corresponding number word.

Basically, I want to go from this

 "I looked at the watermelons around 12 today"
 "There is a dog on the bench"
 "the year is 2017"
 "I am not hungry"
 "He turned 1 today"

into (or something similar to)

 "I looked at the watermelons around twelve today"
 "There is a dog on the bench"
 "the year is two thousand seventeen"
 "I am not hungry"
 "He turned one today"

There are functions I am familiar with that turn numbers into words, such as the numbers_to_words function from the xfun package, but I don't know how to do this systematically for the entire data frame.

Here's one approach with the stringr and english packages.

library(stringr)
library(english)
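data <- c("I looked at the watermelons around 12 today",
          "There is a dog on the bench",
          "the year is 2017",
          "I am not hungry",
          "He turned 1 today")

# For each statement, extract the digit runs and spell them out
Replacement <- lapply(str_extract_all(data, "[0-9]+"), function(x) {
  as.character(as.english(as.numeric(x)))
})

# Put the words back in place of the digits (assumes at most one number
# per statement); statements without digits are returned unchanged
sapply(seq_along(data), function(i) {
  ifelse(grepl("[0-9]+", data[i]),
         str_replace_all(data[i], "[0-9]+", Replacement[[i]]),
         data[i])
})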
[1] "I looked at the watermelons around twelve today" "There is a dog on the bench"                    
[3] "the year is two thousand seventeen"              "I am not hungry"                                
[5] "He turned one today"  
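
The approach above pairs each statement with a single replacement, so it is only safe when a statement holds at most one number. As a minimal sketch of a variant that copes with several numbers per statement and runs directly on a data frame column, str_replace_all() can be given a function as the replacement. The data frame `df`, its column name `statement`, and the helper `spell_out` are assumptions made for illustration, and passing a function as `replacement` needs a reasonably recent version of stringr.

```r
library(stringr)
library(english)

# Hypothetical single-column data frame; the column name is an assumption
df <- data.frame(
  statement = c("He turned 1 today",
                "I am not hungry",
                "I bought 3 apples and 12 pears"),
  stringsAsFactors = FALSE
)

# Spell out a character vector of digit runs, e.g. "12" becomes "twelve"
spell_out <- function(digits) as.character(as.english(as.numeric(digits)))

# The function is applied to the matched digit runs, so every number
# in every statement is replaced; rows without digits are left alone
df$statement <- str_replace_all(df$statement, "[0-9]+", spell_out)
df$statement
```

Because the replacement is computed from the matches themselves, there is no need to build a separate list of replacements or to test each row for digits first.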

I don't know of a ready-made function for this, but here is a somewhat crude solution:

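library(xfun)
a <- "I looked at the watermelons around 12 today"
y <- numeric(nchar(a))
for (i in seq_len(nchar(a))) {
  # Non-digit characters become NA; the coercion warning is expected
  y[i] <- suppressWarnings(as.numeric(substr(a, i, i)))
}
# Glue the digits back together and convert the number to words
x <- n2w(as.numeric(paste(na.omit(y), collapse = "")))
z <- which(!is.na(y))  # positions of the digits
paste0(substr(a, 1, z[1] - 1), x, substr(a, z[length(z)] + 1, nchar(a)))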

At the moment it only works for one number per sentence, because all the digits found in the sentence are merged into a single number.
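
One possible way to lift that restriction, sketched in base R: find every run of digits with gregexpr() and assign the converted words back through `regmatches<-`. The function name `replace_numbers` and its `to_words` argument are made up for this example; any converter that turns one number into one string can be passed in, for instance xfun::numbers_to_words. Non-character input such as a factor column is converted with as.character() first.

```r
# Replace every run of digits in each string with the output of `to_words`.
# `replace_numbers` and `to_words` are illustrative names, not part of any package.
replace_numbers <- function(x, to_words) {
  x <- as.character(x)
  m <- gregexpr("[0-9]+", x)  # positions of every digit run, per string
  regmatches(x, m) <- lapply(regmatches(x, m), function(hits) {
    # `hits` is character(0) for strings without digits, so nothing changes there
    vapply(hits, function(h) to_words(as.numeric(h)),
           character(1), USE.NAMES = FALSE)
  })
  x
}

statements <- c("I looked at the watermelons around 12 today",
                "There is a dog on the bench",
                "I bought 3 apples and 4 pears")

# Only run the demonstration when xfun is installed
if (requireNamespace("xfun", quietly = TRUE)) {
  print(replace_numbers(statements, xfun::numbers_to_words))
}
```

For a data frame, the same call can be applied to the column that holds the statements and the result assigned back to that column.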


 