Apply rules when filtering on grouped dataframe in R?

Given the following data frame:

structure(list(press_id = c(1L, 1L, 1L, 1L, 1L), time_state = c("start_time", 
"end_time", "start_time", "end_time", "start_time"), time_state_val = c(164429106667745, 
164429180716697, 164429106667745, 164429180716697, 164429106667745
), timestamp = c(164429106667745, 164429106667745, 164429106667745, 
164429106667745, 164429108669078), acc_mag = c(10.4656808698978, 
10.4656808698978, 10.4656808698978, 10.4656808698978, 10.458666511955
)), .Names = c("press_id", "time_state", "time_state_val", "timestamp", 
"acc_mag"), row.names = c(NA, -5L), class = c("grouped_df", "tbl_df", 
"tbl", "data.frame"), vars = "press_id", drop = TRUE, indices = list(
    0:4), group_sizes = 5L, biggest_group_size = 5L, labels = structure(list(
    press_id = 1L), row.names = c(NA, -1L), class = "data.frame", vars = "press_id", drop = TRUE, .Names = "press_id"))
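
The dput() output above stores its grouping in the older vars/indices attributes, which recent dplyr releases may not read as a valid grouped data frame. A minimal sketch that rebuilds the same five rows, assuming the dplyr package is installed and that the object is named df1 as in the code further down:

```r
library(dplyr)

# Rebuild the five sample rows from the dput() output above,
# then re-apply the grouping by press_id.
df1 <- tibble(
  press_id       = rep(1L, 5),
  time_state     = c("start_time", "end_time", "start_time",
                     "end_time", "start_time"),
  time_state_val = c(164429106667745, 164429180716697, 164429106667745,
                     164429180716697, 164429106667745),
  timestamp      = c(164429106667745, 164429106667745, 164429106667745,
                     164429106667745, 164429108669078),
  acc_mag        = c(10.4656808698978, 10.4656808698978, 10.4656808698978,
                     10.4656808698978, 10.458666511955)
) %>%
  group_by(press_id)
```

Re-grouping with group_by() after building the tibble lets the installed dplyr version create whatever grouping metadata it expects.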

I want to apply "rules" when filtering: if time_state == "start_time", check that time_state_interval == min(timestamp); if it is "end_time", check whether it equals max(timestamp).

How can I perform this kind of rule-based filter? I am trying to use case_when, but it does not produce the expected result.

df1 %>%
  group_by(press_id) %>%
  mutate(row = row_number(),
         start_time = min(timestamp),
         end_time = max(timestamp)) %>%
  gather(time_state, time_state_val, -press_id, -row, -timestamp:-vel_ang_mag_avg) %>%
  arrange(press_id, row) %>%
  select(press_id, time_state, time_state_val, timestamp, acc_mag, vel_ang_mag, -row) %>%
  group_by(press_id, time_state) %>%
  filter(timestamp == case_when(time_state == "start_time" ~ min(timestamp),
                                time_state == "end_time" ~ max(timestamp)))

Is this what you have in mind?

df1 %>%
  filter((time_state == "start_time" & timestamp == min(timestamp)) | 
         (time_state == "end_time" & timestamp == max(timestamp)))
#   press_id time_state time_state_val timestamp acc_mag
#      <int> <chr>               <dbl>     <dbl>   <dbl>
# 1        1 start_time        1.64e14   1.64e14    10.5
# 2        1 start_time        1.64e14   1.64e14    10.5
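
For comparison, the same rule can also be written with case_when, the form attempted in the question. This is only a sketch under assumptions: it keeps just the three columns the rule needs, rebuilds the sample rows by hand, and groups by press_id alone; the name result is made up for illustration.

```r
library(dplyr)

# Sample rows from the question, reduced to the columns the rule uses.
df1 <- tibble(
  press_id   = rep(1L, 5),
  time_state = c("start_time", "end_time", "start_time",
                 "end_time", "start_time"),
  timestamp  = c(164429106667745, 164429106667745, 164429106667745,
                 164429106667745, 164429108669078)
)

result <- df1 %>%
  group_by(press_id) %>%
  # case_when() picks the reference value per row: the group minimum for
  # start_time rows, the group maximum for end_time rows. Rows matching
  # neither condition get NA, and filter() drops rows where the test is NA.
  filter(timestamp == case_when(
    time_state == "start_time" ~ min(timestamp),
    time_state == "end_time"   ~ max(timestamp)
  ))

result
```

On this sample it keeps the same two start_time rows as the filter above, because no end_time row carries the maximum timestamp of its press_id group.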

Try

data %>% group_by(press_id, time_state) %>% 
         mutate(start_flag=ifelse(time_state=='start_time' & timestamp==min(timestamp),1,0),
             end_flag=ifelse(time_state=='end_time' & timestamp==max(timestamp),1,0)) %>% 
         filter(start_flag==1 | end_flag==1)


# A tibble: 4 x 7
# Groups:   press_id, time_state [2]
  press_id time_state time_state_val timestamp acc_mag start_flag end_flag
     <int> <chr>               <dbl>     <dbl>   <dbl>      <dbl>    <dbl>
1        1 start_time        1.64e14   1.64e14    10.5          1        0
2        1 end_time          1.64e14   1.64e14    10.5          0        1
3        1 start_time        1.64e14   1.64e14    10.5          1        0
4        1 end_time          1.64e14   1.64e14    10.5          0        1
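
The two answers return different row counts because min() and max() are computed within whatever groups are active. A small sketch of that difference, assuming the sample rows are rebuilt by hand; keep_rule, by_press and by_press_state are hypothetical names used only here:

```r
library(dplyr)

# Sample rows from the question, reduced to the columns the rule uses.
df1 <- tibble(
  press_id   = rep(1L, 5),
  time_state = c("start_time", "end_time", "start_time",
                 "end_time", "start_time"),
  timestamp  = c(164429106667745, 164429106667745, 164429106667745,
                 164429106667745, 164429108669078)
)

# Hypothetical helper: keep start_time rows at the group minimum and
# end_time rows at the group maximum of timestamp.
keep_rule <- function(data) {
  filter(data,
         (time_state == "start_time" & timestamp == min(timestamp)) |
         (time_state == "end_time"   & timestamp == max(timestamp)))
}

# min/max taken over all rows of each press_id
by_press <- df1 %>% group_by(press_id) %>% keep_rule()

# min/max taken separately within each press_id / time_state pair
by_press_state <- df1 %>% group_by(press_id, time_state) %>% keep_rule()

nrow(by_press)        # 2
nrow(by_press_state)  # 4
```

Which grouping is right depends on whether the maximum timestamp should be taken over the whole press or only over the end_time rows.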

