
Adding multiple columns dynamically in tidyverse

I have a table with several datetime columns, and for each of them I want to extract the weekday and add it as a new column.

Sample dataset:

structure(list(mealTime = structure(c(1542492000, 1578852000, 
1604253600, 1545901200, 1549821600, 1544306400), tzone = "UTC", class = c("POSIXct", 
"POSIXt")), weight_measure_time = structure(c(1542226000, 1578812400, 
1594710000, 1545896762, 1546416823, 1544227245), tzone = "UTC", class = c("POSIXct", 
"POSIXt")), height_measure_time = structure(c(1542106434, 1543337043, 
1543337043, 1542387988, 1542366547, 1542802228), tzone = "UTC", class = c("POSIXct", 
"POSIXt")), hba1c_measure_time = structure(c(1542106860, 1573455600, 
1594625400, 1544781600, 1545920520, 1544096580), tzone = "UTC", class = c("POSIXct", 
"POSIXt")), bpMeasureTime = structure(c(1542380623, 1578812400, 
1583218800, 1545896774, 1546416837, 1544266110), tzone = "UTC", class = c("POSIXct", 
"POSIXt"))), row.names = c(NA, -6L), class = c("tbl_df", "tbl", 
"data.frame"))

It looks like this:

> smple
# A tibble: 6 x 5
  mealTime            weight_measure_time height_measure_time
  <dttm>              <dttm>              <dttm>             
1 2018-11-17 22:00:00 2018-11-14 20:06:40 2018-11-13 10:53:54
2 2020-01-12 18:00:00 2020-01-12 07:00:00 2018-11-27 16:44:03
3 2020-11-01 18:00:00 2020-07-14 07:00:00 2018-11-27 16:44:03
4 2018-12-27 09:00:00 2018-12-27 07:46:02 2018-11-16 17:06:28
5 2019-02-10 18:00:00 2019-01-02 08:13:43 2018-11-16 11:09:07
6 2018-12-08 22:00:00 2018-12-08 00:00:45 2018-11-21 12:10:28
# ... with 2 more variables: hba1c_measure_time <dttm>, bpMeasureTime <dttm>

For the dataset above, the result I expect is that the weekday is extracted from each datetime column and added as a corresponding new column:

glimpse(smple)
Rows: 6
Columns: 10
$ mealTime                <dttm> 2018-11-17 22:00:00, 2020-01-12 18:00:00, 20~
$ weight_measure_time     <dttm> 2018-11-14 20:06:40, 2020-01-12 07:00:00, 20~
$ height_measure_time     <dttm> 2018-11-13 10:53:54, 2018-11-27 16:44:03, 20~
$ hba1c_measure_time      <dttm> 2018-11-13 11:01:00, 2019-11-11 07:00:00, 20~
$ bpMeasureTime           <dttm> 2018-11-16 15:03:43, 2020-01-12 07:00:00, 20~
$ mealTime_day            <chr> "Saturday", "Sunday", "Sunday", "Thursday", "~
$ weight_measure_time_day <chr> "Wednesday", "Sunday", "Tuesday", "Thursday",~
$ height_measure_time_day <chr> "Tuesday", "Tuesday", "Tuesday", "Friday", "F~
$ hba1c_measure_time_day  <chr> "Tuesday", "Monday", "Monday", "Friday", "Thu~
$ bpMeasureTime_day       <chr> "Friday", "Sunday", "Tuesday", "Thursday", "W~

In base R, I can achieve this as follows:

smple[paste(colnames(smple), "day", sep="_")] = apply(smple, 2, lubridate::wday, label=TRUE, abbr=FALSE)

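I would like to know whether there is a similar approach in tidyverse that adds the columns dynamically by evaluating both the LHS and the RHS.

Using across and where, you can do this: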

library(dplyr)
library(lubridate)

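# Extra arguments are passed through an anonymous function, because
# supplying them via `...` in across() is deprecated since dplyr 1.1.0.
mutate(smple, across(where(is.POSIXct),
                     function(x) lubridate::wday(x, label = TRUE, abbr = FALSE),
                     .names = "{.col}_day"))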
#> # A tibble: 6 x 10
#>   mealTime            weight_measure_time height_measure_time
#>   <dttm>              <dttm>              <dttm>             
#> 1 2018-11-17 22:00:00 2018-11-14 20:06:40 2018-11-13 10:53:54
#> 2 2020-01-12 18:00:00 2020-01-12 07:00:00 2018-11-27 16:44:03
#> 3 2020-11-01 18:00:00 2020-07-14 07:00:00 2018-11-27 16:44:03
#> 4 2018-12-27 09:00:00 2018-12-27 07:46:02 2018-11-16 17:06:28
#> 5 2019-02-10 18:00:00 2019-01-02 08:13:43 2018-11-16 11:09:07
#> 6 2018-12-08 22:00:00 2018-12-08 00:00:45 2018-11-21 12:10:28
#> # … with 7 more variables: hba1c_measure_time <dttm>, bpMeasureTime <dttm>,
#> #   mealTime_day <ord>, weight_measure_time_day <ord>,
#> #   height_measure_time_day <ord>, hba1c_measure_time_day <ord>,
#> #   bpMeasureTime_day <ord>
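With label = TRUE, lubridate::wday returns an ordered factor, whereas the expected result in the question shows character columns. A minimal sketch of one way to get character columns, assuming a small stand-in tibble (the `id` column is made up here to show that where(is.POSIXct) skips non-datetime columns):

```r
library(dplyr)
library(lubridate)

# Stand-in data: one non-datetime column plus two datetime columns.
# `id` is hypothetical; the timestamps are taken from the question's dataset.
smple <- tibble(
  id            = 1:2,
  mealTime      = as.POSIXct(c(1542492000, 1578852000),
                             origin = "1970-01-01", tz = "UTC"),
  bpMeasureTime = as.POSIXct(c(1542380623, 1578812400),
                             origin = "1970-01-01", tz = "UTC")
)

result <- smple %>%
  mutate(across(
    where(is.POSIXct),                      # only the datetime columns
    function(x) as.character(wday(x, label = TRUE, abbr = FALSE)),
    .names = "{.col}_day"                   # e.g. mealTime -> mealTime_day
  ))

names(result)
```

The as.character() wrapper is a design choice: drop it if an ordered factor (which sorts Sunday through Saturday rather than alphabetically) is more useful downstream.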

Here is one way to solve your problem:

df[paste0(names(df), "_day")] <- lapply(df, weekdays)
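The one-liner above assumes every column of df is a date or datetime; weekdays() only has methods for those classes, so a numeric or character column would make it fail. A hedged sketch of the same idea restricted to POSIXct columns, using a small made-up data frame (the `id` column is hypothetical):

```r
# Stand-in data: `id` is a made-up non-datetime column; the timestamps
# come from the question's dataset.
df <- data.frame(
  id = 1:2,
  mealTime = as.POSIXct(c(1542492000, 1578852000),
                        origin = "1970-01-01", tz = "UTC"),
  weight_measure_time = as.POSIXct(c(1542226000, 1578812400),
                                   origin = "1970-01-01", tz = "UTC")
)

# Find the datetime columns by class, then add one "<name>_day" column each.
is_datetime <- vapply(df, inherits, logical(1), what = "POSIXct")
dt_cols <- names(df)[is_datetime]
df[paste0(dt_cols, "_day")] <- lapply(df[dt_cols], weekdays)

names(df)
```

Both sides of the assignment are built from the same dt_cols vector, so the new names and the computed values stay aligned.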

Base R solution:

cbind(
  df,
  setNames(
    data.frame(Map(weekdays, df)),
    paste0(
      names(df),
      ifelse(grepl("_", names(df)), "_day_of_week", "DayOfWeek")
    )
  )
)

A dplyr solution that uses only weekdays from base R:

library(dplyr)
df %>% 
    mutate(across(everything(), weekdays, .names = "{.col}_day"))

Output:

 mealTime            weight_measure_time height_measure_time hba1c_measure_time  bpMeasureTime       mealTime_day weight_measure_time_day
  <dttm>              <dttm>              <dttm>              <dttm>              <dttm>              <chr>        <chr>                  
1 2018-11-17 22:00:00 2018-11-14 20:06:40 2018-11-13 10:53:54 2018-11-13 11:01:00 2018-11-16 15:03:43 Samstag      Mittwoch               
2 2020-01-12 18:00:00 2020-01-12 07:00:00 2018-11-27 16:44:03 2019-11-11 07:00:00 2020-01-12 07:00:00 Sonntag      Sonntag                
3 2020-11-01 18:00:00 2020-07-14 07:00:00 2018-11-27 16:44:03 2020-07-13 07:30:00 2020-03-03 07:00:00 Sonntag      Dienstag               
4 2018-12-27 09:00:00 2018-12-27 07:46:02 2018-11-16 17:06:28 2018-12-14 10:00:00 2018-12-27 07:46:14 Donnerstag   Donnerstag             
5 2019-02-10 18:00:00 2019-01-02 08:13:43 2018-11-16 11:09:07 2018-12-27 14:22:00 2019-01-02 08:13:57 Sonntag      Mittwoch               
6 2018-12-08 22:00:00 2018-12-08 00:00:45 2018-11-21 12:10:28 2018-12-06 11:43:00 2018-12-08 10:48:30 Samstag      Samstag                
# ... with 3 more variables: height_measure_time_day <chr>, hba1c_measure_time_day <chr>, bpMeasureTime_day <chr>
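The weekday names in the output above are German because weekdays() follows the session's LC_TIME locale. If fixed English names are needed regardless of locale, one possible sketch is to index into a hard-coded vector using the numeric weekday from as.POSIXlt (the helper name english_weekday is made up for illustration):

```r
# Hard-coded names, ordered to match as.POSIXlt()$wday (0 = Sunday).
day_names <- c("Sunday", "Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday")

# Hypothetical helper: locale-independent weekday names for a datetime vector.
english_weekday <- function(x) {
  day_names[as.POSIXlt(x)$wday + 1L]  # shift 0..6 to 1..7 for R indexing
}

# First two mealTime values from the question's dataset.
meal_time <- as.POSIXct(c(1542492000, 1578852000),
                        origin = "1970-01-01", tz = "UTC")
english_weekday(meal_time)
#> [1] "Saturday" "Sunday"
```

The helper can be passed to across() in place of weekdays, for example mutate(across(everything(), english_weekday, .names = "{.col}_day")).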
