
How to apply a function on each row of a tibble and a nested list of data frames

I have the following tibble and a nested list of data frames:

>source

# A tibble: 6 × 2
    lon   lat
  <dbl> <dbl>
1  6.02  55.1
2  6.02  55.0
3  6.02  54.9


>dest

[[1]][[1]]
         lon      lat
1   54.98908 6.900084
2   54.92777 6.772623
3   55.09501 6.911837

[[1]][[2]]
         lon      lat
1   54.98908 6.900084
2   54.92777 6.772623
3   55.09501 6.911837

[[1]][[3]]
         lon      lat
1   54.98908 6.900084
2   54.92777 6.772623
3   55.09501 6.911837

[[2]][[1]]
         lon      lat
1   54.98908 6.900084
2   54.92777 6.772623
3   55.09501 6.911837

[[2]][[2]]
         lon      lat
1   54.98908 6.900084
2   54.92777 6.772623
3   55.09501 6.911837

[[2]][[3]]
         lon      lat
1   54.98908 6.900084
2   54.92777 6.772623
3   55.09501 6.911837

I would like to apply a function to each row of the tibble source and to the corresponding "block" of dest.

Example:

row 1 of source should be applied to each row of dest[[1]][[1]] and dest[[2]][[1]]

row 2 of source should be applied to each row of dest[[1]][[2]] and dest[[2]][[2]]

row 3 of source should be applied to each row of dest[[1]][[3]] and dest[[2]][[3]]

and so on.

How could I make this happen? I got tangled up with apply, lapply and mapply and would appreciate any help.

source<-structure(list(lon = c(6.02125801226333, 6.02125801226333, 6.02125801226333, 
6.02125801226333, 6.02125801226333, 6.02125801226333), lat = c(55.0579432585625, 
54.9681151832365, 54.8782857724705, 54.7884550247254, 54.6986229384757, 
54.6087895122085)), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame"))

dest<-list(list(structure(list(lon = c(55.0446726604773, 55.0911992769466, 
55.1399831259253), lat = c(6.11070373013145, 5.93718385855719, 
6.05909963519238)), class = "data.frame", row.names = c(NA, -3L
)), structure(list(lon = c(54.963042116042, 54.9238652445021, 
54.9948148730435), lat = c(6.11154210955708, 6.10009257140253, 
5.93487232950475)), class = "data.frame", row.names = c(NA, -3L
)), structure(list(lon = c(54.9181540526, 54.9628448755405, 54.8174082489187
), lat = c(5.94011737583315, 5.98947008604159, 6.08806491235748
)), class = "data.frame", row.names = c(NA, -3L)), structure(list(
    lon = c(54.7263291045393, 54.8728552727446, 54.8675223815364
    ), lat = c(5.95561986508533, 6.0534792303467, 5.97754320721106
    )), class = "data.frame", row.names = c(NA, -3L)), structure(list(
    lon = c(54.7185472365059, 54.7069293987346, 54.78280968399
    ), lat = c(5.93305860952388, 5.93121414118021, 5.9884946645099
    )), class = "data.frame", row.names = c(NA, -3L)), structure(list(
    lon = c(54.560413160877, 54.5853088068835, 54.5185005363673
    ), lat = c(6.0976246910947, 5.93394019791707, 6.02387338808233
    )), class = "data.frame", row.names = c(NA, -3L))), list(
    structure(list(lon = c(55.050226235055, 55.0240838617402, 
    54.9636263846607), lat = c(5.90235917535441, 5.90965086672992, 
    5.97880750058409)), class = "data.frame", row.names = c(NA, 
    -3L)), structure(list(lon = c(55.0746706563331, 55.0478637437921, 
    54.8541974469044), lat = c(5.98859383669152, 5.92618888252071, 
    6.04742105597978)), class = "data.frame", row.names = c(NA, 
    -3L)), structure(list(lon = c(54.7575000883344, 54.7676512681177, 
    54.9427732774055), lat = c(6.06061526193956, 6.09764527834345, 
    5.90903632630959)), class = "data.frame", row.names = c(NA, 
    -3L)), structure(list(lon = c(54.7776555082601, 54.8462348683655, 
    54.7620026570004), lat = c(6.1346781687426, 6.12031707754559, 
    5.91627897917598)), class = "data.frame", row.names = c(NA, 
    -3L)), structure(list(lon = c(54.6176186034159, 54.7833923796146, 
    54.6922873458308), lat = c(6.10088997672983, 6.09177636538747, 
    6.14915348430183)), class = "data.frame", row.names = c(NA, 
    -3L)), structure(list(lon = c(54.5680535136696, 54.5386600427152, 
    54.5879440622283), lat = c(6.13919150641202, 5.91144136237118, 
    5.89113937054887)), class = "data.frame", row.names = c(NA, 
    -3L))))

We could split source into a list of one-row tibbles, and then use mapply inside lapply: lapply walks over the outer elements of dest, and mapply pairs row k of source with block k of each element.

Example using dplyr::bind_cols as the function to be applied:

lapply(dest,
       \(x) mapply(dplyr::bind_cols, split(source, seq(nrow(source))), x, SIMPLIFY = FALSE))

Output:

[[1]]
[[1]]$`1`
# A tibble: 3 × 4
  lon...1 lat...2 lon...3 lat...4
    <dbl>   <dbl>   <dbl>   <dbl>
1    6.02    55.1    55.0    6.11
2    6.02    55.1    55.1    5.94
3    6.02    55.1    55.1    6.06

[[1]]$`2`
# A tibble: 3 × 4
  lon...1 lat...2 lon...3 lat...4
    <dbl>   <dbl>   <dbl>   <dbl>
1    6.02    55.0    55.0    6.11
2    6.02    55.0    54.9    6.10
3    6.02    55.0    55.0    5.93

[[1]]$`3`
# A tibble: 3 × 4
  lon...1 lat...2 lon...3 lat...4
    <dbl>   <dbl>   <dbl>   <dbl>
1    6.02    54.9    54.9    5.94
2    6.02    54.9    55.0    5.99
3    6.02    54.9    54.8    6.09

[[1]]$`4`
# A tibble: 3 × 4
  lon...1 lat...2 lon...3 lat...4
    <dbl>   <dbl>   <dbl>   <dbl>
1    6.02    54.8    54.7    5.96
2    6.02    54.8    54.9    6.05
3    6.02    54.8    54.9    5.98

[[1]]$`5`
# A tibble: 3 × 4
  lon...1 lat...2 lon...3 lat...4
    <dbl>   <dbl>   <dbl>   <dbl>
1    6.02    54.7    54.7    5.93
2    6.02    54.7    54.7    5.93
3    6.02    54.7    54.8    5.99

[[1]]$`6`
# A tibble: 3 × 4
  lon...1 lat...2 lon...3 lat...4
    <dbl>   <dbl>   <dbl>   <dbl>
1    6.02    54.6    54.6    6.10
2    6.02    54.6    54.6    5.93
3    6.02    54.6    54.5    6.02


[[2]]
[[2]]$`1`
# A tibble: 3 × 4
  lon...1 lat...2 lon...3 lat...4
    <dbl>   <dbl>   <dbl>   <dbl>
1    6.02    55.1    55.1    5.90
2    6.02    55.1    55.0    5.91
3    6.02    55.1    55.0    5.98

[[2]]$`2`
# A tibble: 3 × 4
  lon...1 lat...2 lon...3 lat...4
    <dbl>   <dbl>   <dbl>   <dbl>
1    6.02    55.0    55.1    5.99
2    6.02    55.0    55.0    5.93
3    6.02    55.0    54.9    6.05

[[2]]$`3`
# A tibble: 3 × 4
  lon...1 lat...2 lon...3 lat...4
    <dbl>   <dbl>   <dbl>   <dbl>
1    6.02    54.9    54.8    6.06
2    6.02    54.9    54.8    6.10
3    6.02    54.9    54.9    5.91

[[2]]$`4`
# A tibble: 3 × 4
  lon...1 lat...2 lon...3 lat...4
    <dbl>   <dbl>   <dbl>   <dbl>
1    6.02    54.8    54.8    6.13
2    6.02    54.8    54.8    6.12
3    6.02    54.8    54.8    5.92

[[2]]$`5`
# A tibble: 3 × 4
  lon...1 lat...2 lon...3 lat...4
    <dbl>   <dbl>   <dbl>   <dbl>
1    6.02    54.7    54.6    6.10
2    6.02    54.7    54.8    6.09
3    6.02    54.7    54.7    6.15

[[2]]$`6`
# A tibble: 3 × 4
  lon...1 lat...2 lon...3 lat...4
    <dbl>   <dbl>   <dbl>   <dbl>
1    6.02    54.6    54.6    6.14
2    6.02    54.6    54.5    5.91
3    6.02    54.6    54.6    5.89
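
The same pattern works with any function that takes one source row and one block. Here is a minimal base R sketch on made-up toy data; src, dst and offset_from() are hypothetical names used only for illustration, and Map is used as shorthand for mapply with SIMPLIFY = FALSE.

```r
# Toy data shaped like the question's objects (values are made up)
src <- data.frame(lon = c(1, 2), lat = c(10, 20))
dst <- list(
  list(data.frame(lon = c(1.5, 2.5), lat = c(11, 12)),
       data.frame(lon = c(3.5, 4.5), lat = c(21, 22))),
  list(data.frame(lon = c(5.5, 6.5), lat = c(13, 14)),
       data.frame(lon = c(7.5, 8.5), lat = c(23, 24)))
)

# Hypothetical function: offsets of each block row from one source row
offset_from <- function(src_row, block) {
  block$dlon <- block$lon - src_row$lon
  block$dlat <- block$lat - src_row$lat
  block
}

# One-row data frames, one per source row
src_rows <- split(src, seq_len(nrow(src)))

# lapply walks dst[[1]], dst[[2]], ...; Map pairs source row k with block k
res <- lapply(dst, function(blocks) Map(offset_from, src_rows, blocks))

res[[2]][[1]]
```

Swapping offset_from for another two-argument function changes what is computed without touching the iteration.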

A loop could do it (the function here is a simple addition):

for(each_row in 1:nrow(source)) {
  for(each_list in 1:length(dest)) {
    dest[[each_list]][[each_row]][["lon"]] <- dest[[each_list]][[each_row]][["lon"]]+source[[each_row, "lon"]]
    dest[[each_list]][[each_row]][["lat"]] <- dest[[each_list]][[each_row]][["lat"]]+source[[each_row, "lat"]]
  }
}

Output:

[[1]]
[[1]][[1]]
       lon      lat
1 61.06593 61.16865
2 61.11246 60.99513
3 61.16124 61.11704

[[1]][[2]]
       lon      lat
1 60.98430 61.07966
2 60.94512 61.06821
3 61.01607 60.90299

[[1]][[3]]
       lon      lat
1 60.93941 60.81840
2 60.98410 60.86776
3 60.83867 60.96635

[[1]][[4]]
       lon      lat
1 60.74759 60.74407
2 60.89411 60.84193
3 60.88878 60.76600

[[1]][[5]]
       lon      lat
1 60.73981 60.63168
2 60.72819 60.62984
3 60.80407 60.68712

[[1]][[6]]
       lon      lat
1 60.58167 60.70641
2 60.60657 60.54273
3 60.53976 60.63266


[[2]]
[[2]][[1]]
       lon      lat
1 61.07148 60.96030
2 61.04534 60.96759
3 60.98488 61.03675

[[2]][[2]]
       lon      lat
1 61.09593 60.95671
2 61.06912 60.89430
3 60.87546 61.01554

[[2]][[3]]
       lon      lat
1 60.77876 60.93890
2 60.78891 60.97593
3 60.96403 60.78732

[[2]][[4]]
       lon      lat
1 60.79891 60.92313
2 60.86749 60.90877
3 60.78326 60.70473

[[2]][[5]]
       lon      lat
1 60.63888 60.79951
2 60.80465 60.79040
3 60.71355 60.84778

[[2]][[6]]
       lon      lat
1 60.58931 60.74798
2 60.55992 60.52023
3 60.60920 60.49993
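
Note that the loop above overwrites dest, so running it a second time adds the source values again. A variant that writes into a copy avoids this. The sketch below uses made-up toy data (src, dst and shifted are hypothetical names) and seq_len()/seq_along(), which also behave sensibly for empty inputs, unlike 1:n.

```r
# Toy data shaped like the question's objects (values are made up)
src <- data.frame(lon = c(1, 2), lat = c(10, 20))
dst <- list(
  list(data.frame(lon = c(1, 2), lat = c(3, 4)),
       data.frame(lon = c(5, 6), lat = c(7, 8))),
  list(data.frame(lon = c(9, 10), lat = c(11, 12)),
       data.frame(lon = c(13, 14), lat = c(15, 16)))
)

shifted <- dst  # work on a copy so dst itself stays unchanged

for (k in seq_len(nrow(src))) {    # k: source row and block index
  for (i in seq_along(dst)) {      # i: outer list element
    shifted[[i]][[k]]$lon <- dst[[i]][[k]]$lon + src$lon[k]
    shifted[[i]][[k]]$lat <- dst[[i]][[k]]$lat + src$lat[k]
  }
}

shifted[[1]][[2]]
```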

If I follow, each destination is someplace near the equator and each source is someplace in the north. For each destination block you want to add the source lat and long so you can do something like compute the distance between the two.

So the result should look something like:

> dest2[[1]][[1]]
       lon      lat  lon_src  lat_src
1 55.04467 6.110704 6.021258 55.05794
2 55.09120 5.937184 6.021258 55.05794
3 55.13998 6.059100 6.021258 55.05794

The code below accomplishes this, one source row and block position at a time. It could be made more efficient if you are working with a large data set.

dest2 <- dest

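addStart <- function(startRow, destElements, group) {
  # Source row whose coordinates are appended to every row of the block
  start <- source[startRow, ]

  for (i in destElements) {
    rows007 <- nrow(dest[[i]][[group]])
    # Repeat the source coordinates once per destination row
    toadd <- data.frame(lon_src = rep(start$lon, rows007),
                        lat_src = rep(start$lat, rows007))
    # Updates a local copy of the global dest2, which is returned below
    dest2[[i]][[group]] <- cbind(dest[[i]][[group]], toadd)
  }
  return(dest2)
}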

dest2 <- addStart(1, 1:2, 1)
dest2[[1]][[1]]
dest2[[2]][[1]]

dest2 <- addStart(2, 1:2, 2)
dest2[[1]][[2]]
dest2[[2]][[2]]

dest2 <- addStart(3, 1:2, 3)
dest2[[1]][[3]]
dest2[[2]][[3]]
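
To cover every source row in one pass rather than one call per row, something like the following sketch might work. It runs on made-up toy data, the names src, dst and dst2 are hypothetical, and it assumes every outer element holds exactly one block per source row.

```r
# Toy data shaped like the question's objects (values are made up)
src <- data.frame(lon = c(1, 2), lat = c(10, 20))
dst <- list(
  list(data.frame(lon = c(1.5, 2.5), lat = c(11, 12)),
       data.frame(lon = c(3.5, 4.5), lat = c(21, 22))),
  list(data.frame(lon = c(5.5, 6.5), lat = c(13, 14)),
       data.frame(lon = c(7.5, 8.5), lat = c(23, 24)))
)

dst2 <- lapply(dst, function(blocks) {
  # Assumption: one block per source row in every outer element
  stopifnot(length(blocks) == nrow(src))
  lapply(seq_along(blocks), function(k) {
    # cbind recycles the scalar source coordinates down the block
    cbind(blocks[[k]], lon_src = src$lon[k], lat_src = src$lat[k])
  })
})

dst2[[1]][[1]]
```

Each block keeps its own columns and gains lon_src and lat_src, so a distance function can then be applied row by row within a block.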

