How to filter a dataset on a "VERSION" column whose values contain a _________.000 decimal?

Question

I have a dataset that I am trying to filter on 3 different columns.

I have the 2 character columns figured out with filter(TRANSACTION_TYPE == "ABC", CUSTOMER_CODE == "123"). However, I also have a "VERSION" column, and each customer has multiple versions, which duplicates my $ amount. I want to keep only the VERSION values whose decimal part is ".000", since .000 represents the final and most accurate version. For example, VERSION can be 20220901.000, 20220901.002, 20220901.003, etc. The digits before the decimal point change every day, so I can't simply filter for VERSION equal to 20220901.

I hope I was clear enough, thank you!

Answer 1

Sample data:

quux <- data.frame(VERS_chr = c("20220901.000","20220901.002","20220901.000","20220901.002"),
                   VERS_num = c(20220901.000,20220901.002,20220901.000,20220901.002))

If is.character() is true for your VERSION column (as it is for VERS_chr in the sample data), then

dplyr::filter(quux, grepl("\\.000$", VERS_chr))
#       VERS_chr VERS_num
# 1 20220901.000 20220901
# 2 20220901.000 20220901

Explanation:

"\\.000$" matches a literal period . (escaped because . is a reserved regex symbol), followed by three literal zeroes 000, at the end of the string ($). See https://stackoverflow.com/a/22944075/3358272 for more info on regex.
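A minimal sketch of why the escape and the end anchor matter, using a made-up vector versions that is not part of the question's data. endsWith() is shown as a base-R alternative that avoids regex entirely; it is an addition here, not part of the answer.

```r
# Hypothetical VERSION strings, invented for illustration.
versions <- c("20220901.000", "20220901.002", "20221000", "20220901.0001")

# Escaped period plus end anchor: only a literal ".000" suffix matches.
escaped <- grepl("\\.000$", versions)

# An unescaped period matches any character, so "20221000" slips through.
unescaped <- grepl(".000$", versions)

# Literal suffix test without regex (base R >= 3.3.0).
suffix <- endsWith(versions, ".000")

print(data.frame(versions, escaped, unescaped, suffix))
```

Only the first string passes the escaped pattern, while the unescaped pattern also accepts "20221000".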

If it is false (and the column is not a factor), then

dplyr::filter(quux, abs(VERS_num %% 1) < 5e-4)
#       VERS_chr VERS_num
# 1 20220901.000 20220901
# 2 20220901.000 20220901

Explanation:

abs(.) < 5e-4 is defensive against exact tests of equality, because floating-point limitations (in computers in general) mean a number very close to zero is not always seen as exactly zero. The tolerance is half the 0.001 step between versions: a value such as 20220901.001 is stored as slightly less than that, so its fractional part would pass a < 1e-3 test. See "Why are these numbers not equal?" and https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f .
. %% 1 uses the modulus operator to reduce a number to its fractional part.
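To see what the modulus returns in practice, this sketch prints the fractional parts of a few made-up versions (vers is not from the question's data). It also shows rounding the fraction to three decimals as another way to test for .000; that variant is an assumption of this sketch, not part of the answer.

```r
# Hypothetical numeric versions, invented for illustration.
vers <- c(20220901.000, 20220901.001, 20220901.002)

# Fractional part of each value.
frac <- vers %% 1

# Print extra digits: the stored fractions are close to, but not exactly,
# 0.001 and 0.002.
print(format(frac, digits = 12))

# Alternative test (an assumption, not from the answer): round the fraction
# to three decimals, then compare with zero.
is_final <- round(frac, 3) == 0
print(is_final)
```

Rounding first makes the comparison independent of the tiny representation error in each stored value.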

Answer 1: 0, accepted, 2022-11-28 15:20:18
