Find the smallest and largest date in my dataset in R

Question

I have a dataset with character and Date variables. I would like to find the smallest and largest date in my dataset.

I am trying to use the pmin function, but it does not seem to be working. Once the max and min dates have been extracted, I want to create a dataset with a sequence of dates between them. For example, if the oldest date is 2021-02-01 (from the New column) and the most recent is 2022-06-20 (from the Old column), I want to create a list of dates between the two.

Table:

ID   Old         New         Tier
001  NA          2021-02-01  A
002  NA          2021-02-01  A
003  NA          2021-02-21  A
004  NA          2021-04-21  A
005  NA          2021-04-21  A
006  NA          2021-04-21  A
006  2022-06-20  2021-04-21  B
002  2021-08-10  2021-04-21  B
003  2022-06-20  2021-05-01  B
003  2022-06-20  2021-05-01  B
003  2021-08-10  2021-05-21  B
003  2021-08-10  2021-07-21  B

Variable formats in the extended data, from str():

$ Old : Date, format: "2021-04-30"
$ Id  : chr
$ New : Date, format: "2021-02-03" "2021-02-03"
$ New1: Date, format: NA NA NA NA ...
$ New2: Date, format: "2021-01-10" "2021-01-10"
$ New3: Date, format: NA NA "2021-06-10" NA ...
$ New4: Date, format: NA NA NA NA ...
$ New5: Date, format: NA NA "2022-07-10" NA ...

Answer 1

In base R you can get the date range like this:

range(unlist(df[sapply(df, class) == "Date"]), na.rm = TRUE) |>
  as.Date(origin = "1970-01-01")
#> [1] "2021-02-01" "2022-06-20"

Explanation

To work with just the columns of class "Date" in your data frame, you can use df[sapply(df, class) == "Date"]. If you unlist these columns, they form a single vector from which you can get the range (i.e. min/max), making sure you exclude NA values.
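As a rough sketch of those steps, the snippet below rebuilds the question's table as a data frame and runs the selection, unlist, and range one at a time. The name df and the as.Date conversion are assumptions, since the original data were not posted in reproducible form. It also contrasts this with pmin, which compares its arguments element-wise and so returns one value per row rather than a single minimum.

```r
# Hypothetical reconstruction of the table from the question
df <- data.frame(
  ID   = c("001", "002", "003", "004", "005", "006",
           "006", "002", "003", "003", "003", "003"),
  Old  = as.Date(c(NA, NA, NA, NA, NA, NA,
                   "2022-06-20", "2021-08-10", "2022-06-20",
                   "2022-06-20", "2021-08-10", "2021-08-10")),
  New  = as.Date(c("2021-02-01", "2021-02-01", "2021-02-21",
                   "2021-04-21", "2021-04-21", "2021-04-21",
                   "2021-04-21", "2021-04-21", "2021-05-01",
                   "2021-05-01", "2021-05-21", "2021-07-21")),
  Tier = rep(c("A", "B"), each = 6)
)

# Step 1: logical mask marking the columns whose class is "Date"
is_date <- sapply(df, class) == "Date"

# Step 2: flatten those columns into one vector
all_dates <- unlist(df[is_date])

# Step 3: smallest and largest value, ignoring NA; the result is numeric
# (days since 1970-01-01) rather than a Date
num_range <- range(all_dates, na.rm = TRUE)

# By contrast, pmin() works row by row and returns one date per row
row_min <- pmin(df$Old, df$New, na.rm = TRUE)
length(row_min)  # same as nrow(df), not 1
```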

Unfortunately, these steps remove the class attribute from the vector, so you need to convert it back to a date.
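A small illustration of that class loss, using the two dates from the question; the variable names here are illustrative only.

```r
d <- as.Date(c("2021-02-01", "2022-06-20"))

# Flattening a list of Date vectors with unlist() keeps the underlying
# numbers (days since 1970-01-01) but drops the "Date" class
flat <- unlist(list(d))
class(flat)

# Supplying the origin turns the day counts back into dates
restored <- as.Date(flat, origin = "1970-01-01")
class(restored)
```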

Answer 2

Same basic idea as Allan's answer, but this approach preserves the Date class:

do.call(range, Filter(\(x) inherits(x, "Date"), dat))

[1] "2022-06-22" "2022-08-07"

Data:

dat <- data.frame(a = Sys.Date() + sample(50, 5),
                  b = letters[1:5],
                  c = Sys.Date() + sample(50, 5),
                  d = runif(5))
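Two things are worth sketching on top of this answer, both under assumptions. First, the question's columns contain NA, and range() returns NA unless na.rm = TRUE is passed, so the call below appends that argument to the list handed to do.call. Second, the question also asks for a dataset of dates between the two extremes; seq() with by = "day" is one way to build it, assuming daily steps are wanted. The data frame dat_na and the column name date are hypothetical.

```r
# Hypothetical data with missing dates, loosely modelled on the question
dat_na <- data.frame(
  Old = as.Date(c(NA, "2022-06-20", "2021-08-10")),
  Id  = c("001", "006", "002"),
  New = as.Date(c("2021-02-01", "2021-04-21", NA))
)

# Keep only the Date columns, as in the answer above
date_cols <- Filter(\(x) inherits(x, "Date"), dat_na)

# Pass na.rm = TRUE through do.call so missing values are ignored
rng <- do.call(range, c(as.list(date_cols), list(na.rm = TRUE)))

# One row per day from the earliest to the latest date (assumes daily steps)
all_days <- data.frame(date = seq(rng[1], rng[2], by = "day"))
```

If weekly or monthly steps are preferred, by = "week" or by = "month" can be used in the same call.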

2 answers

Answer 1
2 Accepted 2022-06-17 22:46:02

Answer 2
2 2022-06-17 23:04:00
