
Grep based on decimal places in two columns

I've got a dataset that looks like this:

x        y 
112.21   234.511
56.22    1.1111
3.456    2.31 
1.1      2.4567 
3.411    4.5

I want to keep only the rows in which both x and y have 2 or more decimal places. The end result should be this:

x        y 
112.21   234.511
56.22    1.1111
3.456    2.31 

# the last two rows are removed because each contains a value with fewer than 2 decimal places

I tried doing something of this sort, but it doesn't work properly (it keeps some 1 decimal place values):

edited_df <- df[grep("\\.[1-9][1-9]", df$x) && grep("\\.[1-9][1-9]", df$y)] 

How can I do this?

Use nchar on the "suffix" after the decimal point, which you can extract with gsub.

d[rowSums(nchar(sapply(d, gsub, pa="^.*\\.", re="")) > 1) > 1, ]
#         x        y
# 1 112.210 234.5110
# 2  56.220   1.1111
# 3   3.456   2.3100
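The one-liner above packs three steps together. As a rough sketch of what it does, here it is unrolled with the argument names spelled out; the intermediate names suffix, enough and keep are illustrative only, and the row test is written as == ncol(d) so that it would also cover more than two columns.

```r
d <- data.frame(x = c(112.21, 56.22, 3.456, 1.1, 3.411),
                y = c(234.511, 1.1111, 2.31, 2.4567, 4.5))

# Step 1: drop everything up to and including the decimal point.
# gsub() coerces each numeric column to character before matching.
suffix <- sapply(d, gsub, pattern = "^.*\\.", replacement = "")

# Step 2: count the remaining digits and flag cells that have 2 or more.
enough <- nchar(suffix) > 1

# Step 3: keep the rows in which every column is flagged.
keep <- rowSums(enough) == ncol(d)
res <- d[keep, ]
```

One caveat under this approach: a whole number such as 12 contains no decimal point, so gsub leaves it unchanged and its digits would be counted as if they were decimals. That does not occur in the sample data.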

If gsub is wrapped with Vectorize like so:

g <- Vectorize(gsub)

the approach becomes slightly more succinct:

d[rowSums(nchar(g(pa="^.*\\.", re="", d)) > 1) > 1, ]
#         x        y
# 1 112.210 234.5110
# 2  56.220   1.1111
# 3   3.456   2.3100

Data:

d <- structure(list(x = c(112.21, 56.22, 3.456, 1.1, 3.411), y = c(234.511, 
1.1111, 2.31, 2.4567, 4.5)), class = "data.frame", row.names = c(NA, 
-5L))
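The attempt in the question can also be repaired fairly directly. It appears to fail for three reasons: grep returns matching indices rather than one TRUE/FALSE per row, && looks at a single value instead of working element-wise (recent R versions reject longer inputs outright), and without a comma inside [ ] the index selects columns instead of rows. In addition, [1-9][1-9] would miss a value such as 2.05. A minimal sketch of a fix, where two_dec is a hypothetical helper name:

```r
d <- data.frame(x = c(112.21, 56.22, 3.456, 1.1, 3.411),
                y = c(234.511, 1.1111, 2.31, 2.4567, 4.5))

# Hypothetical helper: TRUE where the printed value has a point followed by
# at least two digits. \\d is used instead of [1-9] so zeros count as digits.
two_dec <- function(v) grepl("\\.\\d\\d", as.character(v))

# grepl() gives one logical per row, so the tests combine with element-wise &;
# the trailing comma makes the logical vector select rows.
edited_df <- d[two_dec(d$x) & two_dec(d$y), ]
```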

Assuming that d is as in the Note at the end, create a function decimals that returns TRUE for each element of its vector argument that has 2+ decimals and FALSE otherwise; given a data frame, it applies that test to each column. Use it to subset d. (The r"{...}" raw string syntax requires R 4.0 or later.)

decimals <- function(x) sapply(x, grepl, pattern = r"{\.\d\d}")

subset(d, decimals(x) & decimals(y))
##         x        y
## 1 112.210 234.5110
## 2  56.220   1.1111
## 3   3.456   2.3100

Or, if d can have an unknown number of numeric columns or different column names, replace the last line with:

subset(d, apply(decimals(d), 1, all))
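To illustrate the multi-column form, here is a small sketch on a made-up data frame d3 with a hypothetical third column z. The decimals function is redefined with an ordinary escaped string so the snippet also runs on R versions before 4.0.

```r
# Same test as decimals above, written without the raw string syntax.
decimals <- function(x) sapply(x, grepl, pattern = "\\.\\d\\d")

# Made-up data: z is a hypothetical extra column, not part of the question.
d3 <- data.frame(x = c(112.21, 56.22, 3.456),
                 y = c(234.511, 1.1111, 2.31),
                 z = c(7.25, 8.5, 9.125))

# decimals(d3) is a logical matrix with one column per column of d3;
# apply(..., 1, all) keeps a row only if every column passes.
res3 <- subset(d3, apply(decimals(d3), 1, all))
```

Here the second row is dropped because 8.5 has only one decimal place, even though x and y pass.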

Note

Lines <- "
x        y 
112.21   234.511
56.22    1.1111
3.456    2.31 
1.1      2.4567 
3.411    4.5"
d <- read.table(text = Lines, header = TRUE)
