How do I filter a dataset based on a "Version" column containing _________.000 decimal?

I have a dataset that I am trying to filter on 3 different columns.

I have the 2 columns that hold character values figured out by doing: filter(TRANSACTION_TYPE != "ABC", CUSTOMER_CODE == "123"). However, I also have a "VERSION" column, and each customer has multiple versions, which duplicates my $ amount. I want to keep only the VERSION values that have ".000" as the decimal part, since .000 represents the final and most accurate version. For example, VERSION can be 20220901.000, 20220901.002, 20220901.003, etc. The digits before the decimal point change by day, so I can't simply filter on VERSION being equal to 20220901.

I hope I was clear enough, thank you!

Sample data:

quux <- data.frame(VERS_chr = c("20220901.000","20220901.002","20220901.000","20220901.002"),
                   VERS_num = c(20220901.000,20220901.002,20220901.000,20220901.002))

If is.character(quux$VERSION) is true in your data (like VERS_chr in the sample), then

dplyr::filter(quux, grepl("\\.000$", VERS_chr))
#       VERS_chr VERS_num
# 1 20220901.000 20220901
# 2 20220901.000 20220901


  • "\\.000$" matches the literal period . (it needs to be escaped since it's a regex reserved symbol) followed by three literal zeroes 000, at the end of the string ($). See https://stackoverflow.com/a/22944075/3358272 for more info on regex.
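To see why the escape matters, here is a small base R sketch. The `versions` vector is made up for illustration; the last element has no decimal point at all.

```r
# Made-up values; the last one has no decimal point
versions <- c("20220901.000", "20220901.002", "20220902.000", "202209010000")

# Escaped period: only strings that end in a literal ".000" match
escaped <- grepl("\\.000$", versions)
escaped
# [1]  TRUE FALSE  TRUE FALSE

# Unescaped period: "." matches any character, so "0000" at the end also matches
unescaped <- grepl(".000$", versions)
unescaped
# [1]  TRUE FALSE  TRUE  TRUE
```

If you prefer to avoid regex escaping, `grepl(".000", versions, fixed = TRUE)` treats the pattern as a plain string, but it can no longer anchor the match to the end of the string, and `endsWith(versions, ".000")` is another base R option.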

If it is false (and the column is not a factor), then

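# tolerance is half the 0.001 version step: stored as a double,
# 20220901.001 %% 1 can come out slightly below 1e-3
dplyr::filter(quux, abs(VERS_num %% 1) < 5e-4)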
#       VERS_chr VERS_num
# 1 20220901.000 20220901
# 2 20220901.000 20220901
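If you would rather not branch on the column type, one possible approach is to format numeric values with three decimals and reuse the same regex. This is only a sketch in base R: `is_final_version` is a hypothetical helper name (not part of dplyr), and the three-decimal format assumes versions never carry more than three decimal places.

```r
# Hypothetical helper: TRUE where the version's decimal part is .000
is_final_version <- function(v) {
  if (is.factor(v)) v <- as.character(v)      # compare factor labels, not integer codes
  if (is.numeric(v)) v <- sprintf("%.3f", v)  # format numbers with exactly 3 decimals
  grepl("\\.000$", v)
}

is_final_version(c("20220901.000", "20220901.002"))            # character input
# [1]  TRUE FALSE
is_final_version(c(20220901.000, 20220901.001, 20220901.002))  # numeric input
# [1]  TRUE FALSE FALSE
is_final_version(factor(c("20220901.000", "20220901.003")))    # factor input
# [1]  TRUE FALSE
```

The helper returns a logical vector, so it can be used as a condition inside dplyr::filter(), for example dplyr::filter(quux, is_final_version(VERS_num)).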



