Count the number of values followed by another value in R

Question

df <- data.frame(Year = c("May","June","July"), 
                 D1 = c(0,0,0), 
                 D2 = c(0,0,0), 
                 D3 = c(0,0,0), 
                 D4 = c(0,0,1), 
                 D5 = c(0,1,1),
                 D6 = c(0,1,1),
                 D7 = c(0,0,0),
                 D8 = c(0,0,0),
                 D9 = c(0,0,0),
                 D10 = c(0,0,0),
                 D11 = c(0,0,0),
                 D12 = c(0,0,0),
                 D13 = c(0,0,0), 
                 D14 = c(1,1,0), 
                 D15 = c(1,0,0),
                 D16 = c(0,1,0),
                 D17 = c(1,1,1),
                 D18 = c(0,0,0),
                 D19 = c(0,0,0),
                 D20 = c(0,0,0),
                 D21 = c(0,1,0),
                 D22 = c(0,0,0),
                 D23 = c(0,1,0), 
                 D24 = c(0,0,0), 
                 D25 = c(0,0,0),
                 D26 = c(1,0,0),
                 D27 = c(0,0,0),
                 D28 = c(1,0,1),
                 D29 = c(1,0,0),
                 D30 = c(1,1,0),
                 D31 = c(0,1,1)
                 )

I have a data frame (subset above) of months and days. I am trying to count the number of days on which a 1 is followed by a 0, a 0 is followed by a 1, and so on. For example, May would have two 1s followed by 0s and two 0s followed by 1s. I think a for loop would be the best way to go about this, but I am having trouble because the comparisons run across each row.

Answer 1

Based on the updated data, we may need a rolling paste over each pair of adjacent days:

library(zoo)
out <- table(apply(df[-1], 1, function(x) rollapply(x, 2, paste, collapse="")))
out
#   00 01 10 11 
#   56 14 12  8 

sum(out)
#[1] 90

Or it can be made a bit more compact without the anonymous function call:

table(apply(df[-1], 1, rollapply, width = 2, paste, collapse=""))
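
The tables above pool all three months. For a per-month breakdown, which is what the question asks for, a minimal base R sketch is shown here for May only. The vector may is typed in by hand from the May row of df, and count_pairs is a hypothetical helper name, not part of any package.

```r
# May's 31 daily values, copied by hand from the first row of df
may <- c(rep(0, 13), 1, 1, 0, 1, rep(0, 8), 1, 0, 1, 1, 1, 0)

# count_pairs is a hypothetical helper: it pairs each day with the next
# one and tabulates the four possible patterns
count_pairs <- function(x) {
  pairs <- paste0(head(x, -1), tail(x, -1))
  # fixed levels keep a pattern in the table even when it never occurs
  table(factor(pairs, levels = c("00", "01", "10", "11")))
}

may_counts <- count_pairs(may)
may_counts
```

For May this gives 19, 4, 4 and 3 for the patterns 00, 01, 10 and 11. Assuming the day columns are numeric, apply(df[-1], 1, count_pairs) would run the same helper on every month.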

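Or using tidyverse. The extra column named 0 in the output comes from runner, which also evaluates the incomplete first window (a single value) of each row; it is not a pair, and adorn_totals uses it as the label column.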

library(runner)
library(janitor)
library(dplyr)
library(tidyr)

df %>% 
    rowwise %>%
    summarise(out = list(table(runner(c_across(starts_with('D')),
          f = function(x) paste(x, collapse=""), k = 2))), .groups = 'drop') %>%
    unnest_wider(c(out))  %>%
    adorn_totals() 
#     0 00 01 10 11
#     1 19  4  4  3
#     1 16  6  5  3
#     1 21  4  3  2
# Total 56 14 12  8

Answer 2

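A base R option using gregexpr. Matches found by gregexpr do not overlap, so the pattern is wrapped in a lookahead (with perl = TRUE) to count every pair of adjacent days, including pairs inside longer runs:

v <- do.call(paste0, df[-1])
rev(
  stack(
    sapply(
      c("00", "01", "10", "11"),
      # zero-width lookahead: one match per starting position; -1 means no match
      function(x) sum(unlist(gregexpr(paste0("(?=", x, ")"), v, perl = TRUE)) > 0),
      USE.NAMES = TRUE
    )
  )
)

gives

  ind values
1  00     56
2  01     14
3  10     12
4  11      8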
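
One detail worth checking with gregexpr is that its matches never overlap, so a plain pattern such as 00 is counted once inside 000, while a zero-width lookahead counts every starting position. A small sketch on a made-up string (s and count_at are illustrative names, not from the answer):

```r
s <- "0001100"  # made-up example string

# count_at is a hypothetical helper: number of match positions
# (gregexpr returns -1 when nothing matches)
count_at <- function(pattern, x, ...) {
  sum(unlist(gregexpr(pattern, x, ...)) > 0)
}

plain   <- count_at("00", s)                    # non-overlapping matches
overlap <- count_at("(?=00)", s, perl = TRUE)   # every starting position

c(plain = plain, overlap = overlap)
```

Here the plain pattern finds 2 matches and the lookahead finds 3, because the pair starting at the second character is only visible to the lookahead.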

Answer 3

As you already recognized, the trouble is the row-wise comparison, so we can reshape the data from wide format to long format.

Warning: because of the wide format, June has 31 days in your data, just like May and July. The days are also treated as one continuous series, so a run can carry over from one month into the next.

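library(dplyr)
library(tidyr)

reshape_df <- df %>%
  pivot_longer(cols = D1:D31, names_to = "date", values_to = "value") %>%
  # flag each day whose value differs from the previous day (NA on the very first row)
  mutate(index = if_else(value != lag(value), 1, 0)) %>%
  replace_na(list(index = 0)) %>%
  # the running count of changes gives one id per run of equal values
  mutate(index_group = cumsum(index))

reshape_df %>%
  group_by(index_group) %>%
  summarize(first_month = first(Year),
            first_date = first(date),
            first_value = first(value),
            length = n(),
            # assumed definition, reconstructed to match the pattern column in the printed result
            pattern = if_else(first_value == 0, "0 -> 1", "1 -> 0"))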

Result for this data:

   index_group first_month first_date first_value length pattern
         <dbl> <chr>       <chr>            <dbl>  <int> <chr>  
 1           0 May         D1                   0     13 0 -> 1 
 2           1 May         D14                  1      2 1 -> 0 
 3           2 May         D16                  0      1 0 -> 1 
 4           3 May         D17                  1      1 1 -> 0 
 5           4 May         D18                  0      8 0 -> 1 
 6           5 May         D26                  1      1 1 -> 0 
 7           6 May         D27                  0      1 0 -> 1 
 8           7 May         D28                  1      3 1 -> 0 
 9           8 May         D31                  0      5 0 -> 1 
10           9 June        D5                   1      2 1 -> 0 
# … with 18 more rows
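
The index_group and length columns amount to a run-length encoding, which base R provides as rle. A minimal sketch for a single month, assuming July's values are typed in by hand from df (the names july, runs and transitions are illustrative):

```r
# July's 31 daily values, copied by hand from the third row of df
july <- c(0, 0, 0, 1, 1, 1, rep(0, 10), 1, rep(0, 10), 1, 0, 0, 1)

runs <- rle(july)  # runs$values and runs$lengths describe each run

# one row per run, comparable to first_value and length in the result above
data.frame(first_value = runs$values, length = runs$lengths)

# a transition happens between each run and the next one
n <- length(runs$values)
transitions <- table(paste(runs$values[-n], "->", runs$values[-1]))
transitions
```

For July this gives four 0 -> 1 and three 1 -> 0 transitions, in line with the 01 and 10 counts for July in Answer 1. Unlike the dplyr version, this treats one month in isolation, so runs do not carry over between months.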
