
Selecting the earliest date from a given dataset in R

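I have a dataset with many rows; only a few of them are shown below. I need to select just the row with the earliest SORT_DT, keeping all the other variables unchanged.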

        CUST_NO ID_NO SYMBOL  AUTO_CREATE_DT     CLASS_TYPE    SORT_DT
1         107   10120      1    2014-05-12             G/L  2015-01-09
2         107   10120      1    2014-05-12             G/L  2015-11-10
3         107   10120      1    2014-05-12             G/L  2014-06-18
4         107   10120      1    2014-05-12             G/L  2014-05-12
5         107   10120      1    2014-05-12             G/L  2015-07-10
6         107   10120      1    2014-05-12             G/L  2015-10-09
7         107   10120      1    2014-05-12             G/L  2016-04-08
8         107   10120      1    2014-05-12             G/L  2016-01-08
9         107   10120      1    2014-05-12             G/L  2016-12-22
10        107   10120      1    2014-05-12             G/L  2017-01-13
11        107   10120      1    2014-05-12             G/L  2016-07-08
12        107   10120      1    2014-05-12             G/L  2017-04-14
13        107   10120      1    2014-05-12             G/L  2017-04-17
14        107   10120      1    2014-05-12             G/L  2016-08-31
15        107   10120      1    2014-05-12             G/L  2015-04-10
16        107   10120      1    2014-05-12             G/L  2016-12-22

The output I need looks like this:

  CUST_NO ID_NO SYMBOL AUTO_CREATE_DT CLASS_TYPE    SORT_DT
1     107 10120      1     2014-05-12        G/L 2014-05-12

If anyone has a solution, please let me know.

I have also added a new dataset:

library(data.table)  # provides fread()
df <- fread("CUST_NO ID_NO SYMBOL  AUTO_CREATE_DT     CLASS_TYPE    SORT_DT
         107   10120      1    2014-05-12             G/L  2015-01-09
        107   10120      1    2014-05-12             G/L  2015-11-10
        107   10120      1    2014-05-12             G/L  2014-06-18
        107   10120      1    2014-05-12             G/L  2014-05-13
        107   10120      1    2014-05-12             G/L  2015-07-10
        107   10120      1    2014-05-12             G/L  2015-10-09
        107   10120      1    2014-05-12             G/L  2016-04-08
        107   10120      1    2014-05-12             G/L  2016-01-08
        107   10120      1    2014-05-12             G/L  2016-12-22
        107   10120      1    2014-05-12             G/L  2017-01-13
        107   10120      1    2014-05-12             G/L  2016-07-08
        108   10120      1    2014-05-12             G/L  2017-04-14
        108   10120      1    2014-05-12             G/L  2017-04-17
        108   10120      1    2014-05-12             G/L  2016-08-31
        108   10120      1    2014-05-12             G/L  2015-04-10
        108   10120      1    2014-05-12             G/L  2016-12-22")

The output should look like this:

  CUST_NO ID_NO SYMBOL AUTO_CREATE_DT CLASS_TYPE    SORT_DT
1     107 10120      1     2014-05-12        G/L 2014-05-13
2     108 10120      1     2014-05-12        G/L 2015-04-10

Try this:

aggregate(SORT_DT ~ ., data = df, FUN = min)

Output:

  CUST_NO ID_NO SYMBOL AUTO_CREATE_DT CLASS_TYPE    SORT_DT
1     107 10120      1     2014-05-12        G/L 2014-05-13
2     108 10120      1     2014-05-12        G/L 2015-04-10
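
In the formula above, the `.` on the right-hand side stands for every column other than SORT_DT, so the minimum is taken within each distinct combination of the remaining columns. If some other column varied within a customer, that call would return more than one row per customer. A minimal base R sketch of an alternative that keeps exactly one full row per CUST_NO is shown here; it assumes CUST_NO is the grouping key (the question implies this but does not say so) and uses a small made-up sample with SORT_DT stored as Date.

```r
# Small sample shaped like the question's data (values are illustrative).
df <- data.frame(
  CUST_NO = c(107, 107, 107, 108, 108),
  ID_NO = 10120,
  SYMBOL = 1,
  AUTO_CREATE_DT = as.Date("2014-05-12"),
  CLASS_TYPE = "G/L",
  SORT_DT = as.Date(c("2015-01-09", "2014-05-13", "2014-06-18",
                      "2017-04-14", "2015-04-10"))
)

# Sort by customer and then by date, so the earliest date comes first
# within each customer.
sorted <- df[order(df$CUST_NO, df$SORT_DT), ]

# duplicated() flags every row after the first for each CUST_NO;
# dropping those leaves the earliest row per customer, all columns intact.
earliest <- sorted[!duplicated(sorted$CUST_NO), ]
rownames(earliest) <- NULL
earliest
```

Because whole rows are kept rather than aggregated, this also works when columns other than SORT_DT differ between rows of the same customer.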

Try aggregate:

res <- aggregate(SORT_DT ~ CUST_NO + ID_NO + SYMBOL + AUTO_CREATE_DT + CLASS_TYPE, data = df, FUN = min)
res
  CUST_NO ID_NO SYMBOL AUTO_CREATE_DT CLASS_TYPE    SORT_DT
1     107 10120      1     2014-05-12        G/L 2014-05-13
2     108 10120      1     2014-05-12        G/L 2015-04-10
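
Depending on how the data was read, SORT_DT may be a character column rather than a date. Taking `min` of `yyyy-mm-dd` strings generally gives the right answer because their text order matches chronological order, but that would not hold for other formats such as `dd/mm/yyyy`. The sketch below, which assumes a character SORT_DT and a small made-up sample, converts the column with `as.Date` before aggregating so the comparison is done on real dates.

```r
# Illustrative sample with dates stored as character strings (an assumption).
df <- data.frame(
  CUST_NO = c(107, 107, 108, 108),
  ID_NO = 10120,
  SYMBOL = 1,
  AUTO_CREATE_DT = "2014-05-12",
  CLASS_TYPE = "G/L",
  SORT_DT = c("2015-01-09", "2014-05-13", "2017-04-14", "2015-04-10"),
  stringsAsFactors = FALSE
)

# Convert to Date so that min() compares dates chronologically.
df$SORT_DT <- as.Date(df$SORT_DT, format = "%Y-%m-%d")

# Earliest SORT_DT for each combination of the grouping columns.
res <- aggregate(SORT_DT ~ CUST_NO + ID_NO + SYMBOL + AUTO_CREATE_DT + CLASS_TYPE,
                 data = df, FUN = min)
res
```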

