
How to count sequence per group?

I have a data frame with three columns (MAT., DATA, STATUS), and I want to count the runs of consecutive STATUS values for each MAT.

My dataset

   MAT.       DATA STATUS
1  4857 2021-09-21     FO
2  4857 2021-09-22     FO
3  4857 2021-09-23     FO
4  4857 2021-09-24     FO
5  4857 2021-09-25     FO
6  4857 2021-09-26     FO
7  4857 2021-09-27      T
8  4857 2021-09-28      C
9  4857 2021-09-29      C

Example: for MAT. = 5109 I have the following sequence

    MAT.       DATA STATUS
955 5109 2021-09-21      C
956 5109 2021-09-22      C
957 5109 2021-09-23      C
958 5109 2021-09-24      C
959 5109 2021-09-25      C
960 5109 2021-09-26      C
961 5109 2021-09-27      C
962 5109 2021-09-28      C
963 5109 2021-09-29      C
964 5109 2021-09-30      C
965 5109 2021-10-01      C
966 5109 2021-10-02      C
967 5109 2021-10-03      C
968 5109 2021-10-04      C
969 5109 2021-10-05      C
970 5109 2021-10-06      C
971 5109 2021-10-07      C
972 5109 2021-10-08      T
973 5109 2021-10-09     FO
974 5109 2021-10-10     FO
975 5109 2021-10-11     FO
976 5109 2021-10-12     FO
977 5109 2021-10-13     FO
978 5109 2021-10-14     FO
979 5109 2021-10-15     FO
980 5109 2021-10-16     FO
981 5109 2021-10-17     FO
982 5109 2021-10-18      T
983 5109 2021-10-19      C
984 5109 2021-10-20      C

I used rle, which generates the counts:

> rle(df_exemplo$STATUS)
Run Length Encoding
  lengths: int [1:5] 17 1 9 1 2
  values : chr [1:5] "C" "T" "FO" "T" "C"
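
For a single MAT., the rle result can be turned into a data frame by pairing its values and lengths components. A minimal sketch, using a shortened stand-in vector (status_example and runs_df are illustrative names, not part of the question's data):

```r
# Hypothetical short STATUS vector standing in for one MAT.
status_example <- c("C", "C", "C", "T", "FO", "FO", "T", "C")

# rle() returns a list with $lengths and $values
runs <- rle(status_example)

# One row per run, in order of appearance
runs_df <- data.frame(
  STATUS = runs$values,
  N      = runs$lengths,
  stringsAsFactors = FALSE
)
print(runs_df)
```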

Now I need to do this for all the other MAT. values, and store the result in a data frame.

Does anyone know how I can do this?

Full dataset

df_group_MAT = 
structure(list(MAT. = c(4885L, 4885L, 4885L, 4885L, 4885L, 4885L, 
                        4885L, 4885L, 4885L, 4885L, 4885L, 4885L, 4885L, 4885L, 4885L, 
                        4885L, 4885L, 4885L, 4885L, 4885L, 4885L, 4885L, 4885L, 4885L, 
                        4885L, 4885L, 4885L, 4885L, 4885L, 4885L, 4885L, 5109L, 5109L, 
                        5109L, 5109L, 5109L, 5109L, 5109L, 5693L, 5693L, 5693L, 5693L, 
                        5693L, 5693L, 5693L, 5693L, 5693L, 5693L, 5693L, 5693L, 5693L, 
                        5693L, 5693L, 5693L, 5693L, 5693L, 5693L, 5693L, 5693L, 5693L, 
                        5693L, 5693L, 5693L, 5693L, 5693L, 5693L, 5693L, 5693L, 5693L, 
                        5986L, 5986L, 5986L, 5986L, 5986L, 5986L, 5986L, 5986L, 5986L, 
                        5986L, 5986L, 5986L, 5986L, 5986L, 5986L, 5986L, 5986L, 5986L, 
                        5986L, 5986L, 5986L, 5986L, 5986L, 5986L, 5986L, 5986L, 5986L, 
                        5986L, 5986L, 5986L, 5986L), DATA = structure(c(18892, 18896, 
                                                                        18900, 18904, 18908, 18912, 18916, 18920, 18924, 18928, 18932, 
                                                                        18936, 18940, 18944, 18948, 18952, 18956, 18960, 18964, 18968, 
                                                                        18972, 18976, 18980, 18984, 18988, 18992, 18996, 19000, 19004, 
                                                                        19008, 19012, 18893, 18897, 18901, 18905, 18909, 18913, 18917, 
                                                                        18891, 18895, 18899, 18903, 18907, 18911, 18915, 18919, 18923, 
                                                                        18927, 18931, 18935, 18939, 18943, 18947, 18951, 18955, 18959, 
                                                                        18963, 18967, 18971, 18975, 18979, 18983, 18987, 18991, 18995, 
                                                                        18999, 19003, 19007, 19011, 18892, 18896, 18900, 18904, 18908, 
                                                                        18912, 18916, 18920, 18924, 18928, 18932, 18936, 18940, 18944, 
                                                                        18948, 18952, 18956, 18960, 18964, 18968, 18972, 18976, 18980, 
                                                                        18984, 18988, 18992, 18996, 19000, 19004, 19008, 19012), class = "Date"), 
               STATUS = c("FO", "FO", "ANG", "ANG", "ANG", "ANG", "FO", 
                          "FO", "FO", "C", "C", "C", "C", "C", "EC", "EC", "EC", "FO", 
                          "FO", "FO", "C", "C", "C", "C", "FO", "FO", "FO", "C", "C", 
                          "C", "C", "C", "C", "C", "C", "FO", "FO", "FO", "C", "FO", 
                          "FO", "FO", "FO", "C", "C", "C", "C", "FO", "FO", "FO", "FO", 
                          "C", "AT", "C", "C", "C", "FO", "FO", "FO", "FO", "FO", "C", 
                          "C", "C", "C", "T", "FO", "FO", "FO", "C", "C", "C", "C", 
                          "T", "FO", "FO", "QR", "QR", "QR", "AG", "AG", "AG", "AG", 
                          "AG", "AG", "AG", "AG", "AG", "AG", "AG", "AG", "AG", "AG", 
                          "AG", "AG", "AG", "AG", "AG", "AG", "AG")), row.names = c(560L, 
                                                                                    564L, 568L, 572L, 576L, 580L, 584L, 588L, 592L, 596L, 600L, 604L, 
                                                                                    608L, 612L, 616L, 620L, 624L, 628L, 632L, 636L, 640L, 644L, 648L, 
                                                                                    652L, 656L, 660L, 664L, 668L, 672L, 676L, 680L, 957L, 961L, 965L, 
                                                                                    969L, 973L, 977L, 981L, 1842L, 1846L, 1850L, 1854L, 1858L, 1862L, 
                                                                                    1866L, 1870L, 1874L, 1878L, 1882L, 1886L, 1890L, 1894L, 1898L, 
                                                                                    1902L, 1906L, 1910L, 1914L, 1918L, 1922L, 1926L, 1930L, 1934L, 
                                                                                    1938L, 1942L, 1946L, 1950L, 1954L, 1958L, 1962L, 2827L, 2831L, 
                                                                                    2835L, 2839L, 2843L, 2847L, 2851L, 2855L, 2859L, 2863L, 2867L, 
                                                                                    2871L, 2875L, 2879L, 2883L, 2887L, 2891L, 2895L, 2899L, 2903L, 
                                                                                    2907L, 2911L, 2915L, 2919L, 2923L, 2927L, 2931L, 2935L, 2939L, 
                                                                                    2943L, 2947L), class = "data.frame")

You can use this, which counts the rows for each combination of MAT. and STATUS:

library(dplyr)  # provides %>%, group_by(), summarise() and n()
my_df2 <- df_group_MAT %>% group_by(MAT., STATUS) %>% summarise(Number = n())

Does this give you the expected result?
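
Grouping by MAT. and STATUS gives one total per pair, so separate runs of the same STATUS within a MAT. are merged into a single count. If per-run counts like the rle output are wanted, one possible approach is to apply rle within each MAT. using base R. This is only a sketch: it assumes the rows are already sorted by DATA within each MAT., and toy_df, count_runs and runs_by_mat are illustrative names, with the small data frame standing in for df_group_MAT.

```r
# Small stand-in for df_group_MAT (illustrative values only)
toy_df <- data.frame(
  MAT.   = c(1L, 1L, 1L, 1L, 2L, 2L, 2L),
  STATUS = c("FO", "FO", "C", "FO", "C", "C", "T"),
  stringsAsFactors = FALSE
)

# Run-length encode STATUS for one MAT. and return one row per run
count_runs <- function(d) {
  r <- rle(d$STATUS)
  data.frame(
    MAT.   = d$MAT.[1],
    SEQ    = seq_along(r$values),  # position of the run within the MAT.
    STATUS = r$values,
    N      = r$lengths,
    stringsAsFactors = FALSE
  )
}

# Split by MAT., encode each piece, and stack the results
runs_by_mat <- do.call(rbind, lapply(split(toy_df, toy_df$MAT.), count_runs))
rownames(runs_by_mat) <- NULL
print(runs_by_mat)
```

With the question's data, the same pattern would be applied to split(df_group_MAT, df_group_MAT$MAT.) after ordering the rows by MAT. and DATA.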
