
Clean column data in R

I am stuck on a small problem while cleaning my data. I have a property data set in which the property size is stored as character. Most values in the size column are plain numbers in text form, but some are written as expressions such as (20*40). If I convert the column to numeric, those entries are coerced to NA and I lose that data. Any guidance on how to handle this kind of issue?

Maybe this function can be of help.

evalCell <- function(x){
  f <- function(x) eval(parse(text = x))
  sapply(x, f)
}

x <- c("(20*40)", 123, "1 + 2*3", "(1 + 2)*3")
evalCell(x)
#  (20*40)       123   1 + 2*3 (1 + 2)*3 
#      800       123         7         9 

If the names on the returned vector are not wanted, have the function return unname(sapply(x, f)) instead.
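One caveat with eval(parse(text = x)) is that it runs whatever R code the string contains and stops at the first string that does not parse. The following is a minimal sketch of a more defensive variant, assuming the column should only ever contain digits, decimal points, parentheses, spaces and the operators + - * /. The name safe_eval_size is hypothetical and not part of the answer above; anything outside the whitelist, or anything that fails to evaluate to a single number, comes back as NA.

```r
# Hypothetical helper (not from the original answer): evaluate simple
# arithmetic strings and return NA for everything else.
safe_eval_size <- function(x) {
  x <- as.character(x)

  # Whitelist: only digits, ".", "(", ")", "+", "*", "/", "-" and spaces
  ok <- !is.na(x) & grepl("^[0-9.()+*/ -]+$", x)

  # Evaluate one string; any error or non-scalar result becomes NA
  eval_one <- function(s) {
    v <- tryCatch(eval(parse(text = s)), error = function(e) NA_real_)
    if (is.numeric(v) && length(v) == 1) as.numeric(v) else NA_real_
  }

  out <- rep(NA_real_, length(x))
  out[ok] <- vapply(x[ok], eval_one, numeric(1), USE.NAMES = FALSE)
  out
}

sizes <- c("(20*40)", "123", "1 + 2*3", "20x40", NA)
result <- safe_eval_size(sizes)
```

Because vapply is called with USE.NAMES = FALSE, the result is an unnamed numeric vector of the same length as the input, so it can be assigned straight back to a data frame column.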

We can use map_dbl from purrr with rlang::parse_expr

library(purrr)
map_dbl(x, ~ eval(rlang::parse_expr(.x)))
#[1] 800 123   7   9

data

x <- c("(20*40)", 123, "1 + 2*3", "(1 + 2)*3")

You can try

sapply(yourColumn, function(X) eval(parse(text=X)))

Example:

> sapply(c("5+5", "22", "10*5"), function(X) eval(parse(text=X)) )
 5+5   22 10*5 
  10   22   50 
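Since most of the column is already plain numbers, it may be enough to evaluate only the entries that as.numeric() cannot convert. This is a rough sketch under that assumption; the data frame props and its columns size and size_num are made up for illustration and stand in for the property data set from the question. As with the calls above, eval(parse()) executes the string as R code, so this is only reasonable on data you trust.

```r
# Hypothetical data frame standing in for the property data set
props <- data.frame(size = c("1200", "(20*40)", "850", "30*50"),
                    stringsAsFactors = FALSE)

# Step 1: plain conversion; expression strings become NA
# (the coercion warning is expected here, so it is suppressed)
size_num <- suppressWarnings(as.numeric(props$size))

# Step 2: evaluate only the entries that failed step 1
failed <- is.na(size_num) & !is.na(props$size)
size_num[failed] <- vapply(
  props$size[failed],
  function(s) as.numeric(eval(parse(text = s))),
  numeric(1),
  USE.NAMES = FALSE
)

props$size_num <- size_num
```

Keeping the original size column next to the new size_num column makes it easy to spot-check the rows that went through eval.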
