
dplyr mutate a sequence from two columns

I have a list of one-row data frames that I want to "expand" into sequences. The data looks like:

[[10]]
  minX maxX minY maxY
1  4.9  7.9  4.9  7.9

[[11]]
  minX maxX minY maxY
1    2  3.8    2  3.8

[[12]]
  minX maxX minY maxY
1    3  6.9    3  6.9

I would like to create something like:

x <- var_lists[[1]]
seq(x[1,1], x[1, 2], length.out= 100)

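but by name, so something like seq(x["minX"], x["maxX"], length.out = 100), because I also want to do this for the minY and maxY columns.

So I would end up with two new columns: one sequence from minX to maxX and one from minY to maxY.

I am working in a dplyr pipe, so I would like to do this with mutate or some other tidyverse function.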

Data:

var_lists <- list(structure(list(minX = 2, maxX = 3.8, minY = 2, maxY = 3.8), row.names = c(NA, 
-1L), class = "data.frame"), structure(list(minX = 3, maxX = 6.9, 
    minY = 3, maxY = 6.9), row.names = c(NA, -1L), class = "data.frame"), 
    structure(list(minX = 1, maxX = 2.5, minY = 1, maxY = 2.5), row.names = c(NA, 
    -1L), class = "data.frame"), structure(list(minX = 4.9, maxX = 7.9, 
        minY = 4.9, maxY = 7.9), row.names = c(NA, -1L), class = "data.frame"), 
    structure(list(minX = 3, maxX = 6.9, minY = 3, maxY = 6.9), row.names = c(NA, 
    -1L), class = "data.frame"), structure(list(minX = 1, maxX = 2.5, 
        minY = 1, maxY = 2.5), row.names = c(NA, -1L), class = "data.frame"), 
    structure(list(minX = 4.9, maxX = 7.9, minY = 4.9, maxY = 7.9), row.names = c(NA, 
    -1L), class = "data.frame"), structure(list(minX = 2, maxX = 3.8, 
        minY = 2, maxY = 3.8), row.names = c(NA, -1L), class = "data.frame"), 
    structure(list(minX = 1, maxX = 2.5, minY = 1, maxY = 2.5), row.names = c(NA, 
    -1L), class = "data.frame"), structure(list(minX = 4.9, maxX = 7.9, 
        minY = 4.9, maxY = 7.9), row.names = c(NA, -1L), class = "data.frame"), 
    structure(list(minX = 2, maxX = 3.8, minY = 2, maxY = 3.8), row.names = c(NA, 
    -1L), class = "data.frame"), structure(list(minX = 3, maxX = 6.9, 
        minY = 3, maxY = 6.9), row.names = c(NA, -1L), class = "data.frame"))

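We can loop over the list with map, extract the columns of each one-row data frame with $, and apply seq

library(purrr)
library(tibble)  # tibble() is not exported by purrr
# one 100-point sequence per range; .id stores the list position in 'grp'
map_dfr(var_lists, ~ tibble(x = seq(.x$minX, .x$maxX, length.out = 100),
               y = seq(.x$minY, .x$maxY, length.out = 100)), .id = 'grp')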
# A tibble: 1,200 x 3
#       x     y grp  
#   <dbl> <dbl> <chr>
# 1  2     2    1    
# 2  2.02  2.02 1    
# 3  2.04  2.04 1    
# 4  2.05  2.05 1    
# 5  2.07  2.07 1    
# 6  2.09  2.09 1    
# 7  2.11  2.11 1    
# 8  2.13  2.13 1    
# 9  2.15  2.15 1    
#10  2.16  2.16 1    
# … with 1,190 more rows
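
Since the question asks for a mutate-based approach, here is a minimal sketch of one, which is not part of the original answer. It assumes the data frames are first row-bound into a single data frame, and it uses a shortened two-element copy of var_lists; the name res_mutate is only for illustration.

```r
library(dplyr)
library(tidyr)

# Shortened copy of the question's data: two one-row data frames
var_lists <- list(
  data.frame(minX = 2, maxX = 3.8, minY = 2, maxY = 3.8),
  data.frame(minX = 3, maxX = 6.9, minY = 3, maxY = 6.9)
)

res_mutate <- bind_rows(var_lists, .id = "grp") %>%  # one row per list element
  rowwise() %>%                                      # evaluate seq() row by row
  mutate(x = list(seq(minX, maxX, length.out = 100)),
         y = list(seq(minY, maxY, length.out = 100))) %>%
  ungroup() %>%
  unnest(c(x, y))                                    # expand list-columns to rows
```

Unlike the map_dfr version, this keeps the original minX, maxX, minY and maxY columns next to the new x and y columns.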

If there are many columns ('X', 'Y', 'Z', ...), another option is to reshape to 'long' format with pivot_longer and then apply seq to all of the value columns

library(dplyr)
library(tidyr)
map_dfr(var_lists, ~ 
          .x %>%
            # split e.g. "minX" into group = "min" and a value column "X"
            pivot_longer(cols = everything(), names_to = c("group", ".value"),
                         names_sep = "(?<=[a-z])(?=[A-Z])") %>% 
            # row 1 holds the min, row 2 the max of each value column
            summarise_at(-1, ~ seq(.[1], .[2], length.out = 100)), .id = 'grp') %>%
  as_tibble
# A tibble: 1,200 x 3
#       X     Y grp  
#   <dbl> <dbl> <chr>
# 1  2     2    1    
# 2  2.02  2.02 1    
# 3  2.04  2.04 1    
# 4  2.05  2.05 1    
# 5  2.07  2.07 1    
# 6  2.09  2.09 1    
# 7  2.11  2.11 1    
# 8  2.13  2.13 1    
# 9  2.15  2.15 1    
#10  2.16  2.16 1    
# … with 1,190 more rows
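
In dplyr 1.1.0 and later, returning more than one row per group from the summarise family is deprecated in favour of reframe. The following is a rough sketch of the same long-format idea under that assumption. It is not from the original answer: it swaps names_sep for names_pattern, runs on a shortened two-element copy of var_lists, and the name res_long is only for illustration.

```r
library(dplyr)   # reframe() assumes dplyr >= 1.1.0
library(tidyr)
library(purrr)

# Shortened copy of the question's data: two one-row data frames
var_lists <- list(
  data.frame(minX = 2, maxX = 3.8, minY = 2, maxY = 3.8),
  data.frame(minX = 3, maxX = 6.9, minY = 3, maxY = 6.9)
)

res_long <- map(var_lists, function(df) {
  df %>%
    # "minX" -> group = "min", value column "X"; row 1 = min, row 2 = max
    pivot_longer(everything(), names_to = c("group", ".value"),
                 names_pattern = "(min|max)(.*)") %>%
    # build one sequence per value column (X, Y, ...)
    reframe(across(-group, function(v) seq(v[1], v[2], length.out = 100)))
}) %>%
  bind_rows(.id = "grp")   # row-bind and record the list position
```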

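Note: if we need to keep the result as a list, replace map_dfr with map. The _dfr suffix means the results are row-bound into a single data.frame; with _dfc they would be column-bound instead. In the second solution, if map_dfr is replaced with map, the %>% as_tibble step should also be removed, because it expects a single data.frame from the previous step.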
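
As a small illustration of the list-returning variant, here is a sketch using a shortened two-element copy of var_lists; the name seq_list is only for illustration.

```r
library(purrr)
library(tibble)

# Shortened copy of the question's data: two one-row data frames
var_lists <- list(
  data.frame(minX = 2, maxX = 3.8, minY = 2, maxY = 3.8),
  data.frame(minX = 3, maxX = 6.9, minY = 3, maxY = 6.9)
)

# map() keeps one tibble per input element instead of row-binding them
seq_list <- map(var_lists, ~ tibble(x = seq(.x$minX, .x$maxX, length.out = 100),
                                    y = seq(.x$minY, .x$maxY, length.out = 100)))

length(seq_list)      # number of tibbles, one per input data frame
nrow(seq_list[[1]])   # number of points in the first sequence
```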
