how do I filter dataset based on "Version" column containing _________.000 decimal?

Question

I have a dataset where I am trying to filter based on 3 different columns.

I have the 2 columns that have character values figured out by doing: filter(TRANSACTION_TYPE,= "ABC", CUSTOMER_CODE == "123") however. I have a "VERSION" column where there will be multiple versions for each customer which will then duplicate my $ amount. I want to filter on only the VERSION that contains ".000" as decimal since the.000 represents the final and most accurate version, For example. VERSION can = 20220901.000 and 20220901.002 ( enter image description here ), 20220901.003, etc. However the numbers before the decimal will always change so I can't filter on it to equal this 20220901 as it will change by day.

I hope I was clear enough, thank you!

Answer 1

Sample data:

quux <- data.frame(VERS_chr = c("20220901.000","20220901.002","20220901.000","20220901.002"),
                   VERS_num = c(20220901.000,20220901.002,20220901.000,20220901.002))

If is.character(quux$VERSION) is true in your data, then

dplyr::filter(quux, grepl("\\.000$", VERS_chr))
#       VERS_chr VERS_num
# 1 20220901.000 20220901
# 2 20220901.000 20220901

Explanation:

"\\.000$" matches the literal period . (it needs to be escaped since it's a regex reserved symbol) followed by three literal zeroes 000 , at the end of string ( $ ). See https://stackoverflow.com/a/22944075/3358272 for more info on regex.

If it is false (and it is not a factor ), then

dplyr::filter(quux, abs(VERS_num %% 1) < 1e-3)
#       VERS_chr VERS_num
# 1 20220901.000 20220901
# 2 20220901.000 20220901

Explanation:

abs(.) < 1e-3 is defensive against high-precision tests of equality, where floating-point limitations (in computers in general) don't always see a number very-close to zero as exactly zero. See Why are these numbers not equal? , https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f .
. %% 1 . %% 1 is the modulus operator, reducing a number down to its fractional component.

how do I filter dataset based on "Version" column containing _________.000 decimal?

Question

1 answers

solution1
0 ACCPTED 2022-11-28 15:20:18

how do I filter dataset based on "Version" column containing _________.000 decimal?

Question

1 answers

solution1 0 ACCPTED 2022-11-28 15:20:18

solution1
0 ACCPTED 2022-11-28 15:20:18