[英]How to apply a function on each row of a tibble and a nested list of data frames
I have the following tibble and a nested list of data frames:我有以下小标题和嵌套的数据框列表:
>source
# A tibble: 6 × 2
lon lat
<dbl> <dbl>
1 6.02 55.1
2 6.02 55.0
3 6.02 54.9
>dest
[[1]][[1]]
lon lat
1 54.98908 6.900084
2 54.92777 6.772623
3 55.09501 6.911837
[[1]][[2]]
lon lat
1 54.98908 6.900084
2 54.92777 6.772623
3 55.09501 6.911837
[[1]][[3]]
lon lat
1 54.98908 6.900084
2 54.92777 6.772623
3 55.09501 6.911837
[[2]][[1]]
lon lat
1 54.98908 6.900084
2 54.92777 6.772623
3 55.09501 6.911837
[[2]][[2]]
lon lat
1 54.98908 6.900084
2 54.92777 6.772623
3 55.09501 6.911837
[[2]][[3]]
lon lat
1 54.98908 6.900084
2 54.92777 6.772623
3 55.09501 6.911837
I would like to apply a function on a row from a tible source and to each "block" from dest.我想将 function 从 tible source 应用到一行,并从 dest 应用到每个“块”。
Example:例子:
row 1
from source should by applied to each row from dest[[1]][[1]]
and dest[[2]][[1]]
来自源的row 1
应该应用于来自dest[[1]][[1]]
和dest[[2]][[1]]
的每一行
row 2
from source should by applied to each row from dest[[1]][[2]]
and dest[[2]][[2]]
来自源的row 2
应该应用于来自dest[[1]][[2]]
和dest[[2]][[2]]
的每一行
row 3
from source should by applied to each row from dest[[1]][[3]]
and dest[[2]][[3]]
来自源的row 3
应该应用于来自dest[[1]][[3]]
和dest[[2]][[3]]
的每一行
and so on.等等。
How could I make this happen?我怎样才能做到这一点? I got tangled up with apply,lappl and maply and would appreciate any help.我纠结于 apply、lappl 和 maply,希望能得到任何帮助。
source<-structure(list(lon = c(6.02125801226333, 6.02125801226333, 6.02125801226333,
6.02125801226333, 6.02125801226333, 6.02125801226333), lat = c(55.0579432585625,
54.9681151832365, 54.8782857724705, 54.7884550247254, 54.6986229384757,
54.6087895122085)), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
dest<-list(list(structure(list(lon = c(55.0446726604773, 55.0911992769466,
55.1399831259253), lat = c(6.11070373013145, 5.93718385855719,
6.05909963519238)), class = "data.frame", row.names = c(NA, -3L
)), structure(list(lon = c(54.963042116042, 54.9238652445021,
54.9948148730435), lat = c(6.11154210955708, 6.10009257140253,
5.93487232950475)), class = "data.frame", row.names = c(NA, -3L
)), structure(list(lon = c(54.9181540526, 54.9628448755405, 54.8174082489187
), lat = c(5.94011737583315, 5.98947008604159, 6.08806491235748
)), class = "data.frame", row.names = c(NA, -3L)), structure(list(
lon = c(54.7263291045393, 54.8728552727446, 54.8675223815364
), lat = c(5.95561986508533, 6.0534792303467, 5.97754320721106
)), class = "data.frame", row.names = c(NA, -3L)), structure(list(
lon = c(54.7185472365059, 54.7069293987346, 54.78280968399
), lat = c(5.93305860952388, 5.93121414118021, 5.9884946645099
)), class = "data.frame", row.names = c(NA, -3L)), structure(list(
lon = c(54.560413160877, 54.5853088068835, 54.5185005363673
), lat = c(6.0976246910947, 5.93394019791707, 6.02387338808233
)), class = "data.frame", row.names = c(NA, -3L))), list(
structure(list(lon = c(55.050226235055, 55.0240838617402,
54.9636263846607), lat = c(5.90235917535441, 5.90965086672992,
5.97880750058409)), class = "data.frame", row.names = c(NA,
-3L)), structure(list(lon = c(55.0746706563331, 55.0478637437921,
54.8541974469044), lat = c(5.98859383669152, 5.92618888252071,
6.04742105597978)), class = "data.frame", row.names = c(NA,
-3L)), structure(list(lon = c(54.7575000883344, 54.7676512681177,
54.9427732774055), lat = c(6.06061526193956, 6.09764527834345,
5.90903632630959)), class = "data.frame", row.names = c(NA,
-3L)), structure(list(lon = c(54.7776555082601, 54.8462348683655,
54.7620026570004), lat = c(6.1346781687426, 6.12031707754559,
5.91627897917598)), class = "data.frame", row.names = c(NA,
-3L)), structure(list(lon = c(54.6176186034159, 54.7833923796146,
54.6922873458308), lat = c(6.10088997672983, 6.09177636538747,
6.14915348430183)), class = "data.frame", row.names = c(NA,
-3L)), structure(list(lon = c(54.5680535136696, 54.5386600427152,
54.5879440622283), lat = c(6.13919150641202, 5.91144136237118,
5.89113937054887)), class = "data.frame", row.names = c(NA,
-3L))))
We could split
the source into a list by rows, and then use mapply
with lapply
:我们可以按行将源split
为一个列表,然后将mapply
与lapply
一起使用:
Example using dplyr::bind_cols
as the function to be applied.使用dplyr::bind_cols
作为要应用的 function 的示例。
lapply(dest,
\(x) mapply(dplyr::bind_cols, split(source, seq(nrow(source))), x, SIMPLIFY = FALSE))
Output : Output :
[[1]]
[[1]]$`1`
# A tibble: 3 × 4
lon...1 lat...2 lon...3 lat...4
<dbl> <dbl> <dbl> <dbl>
1 6.02 55.1 55.0 6.11
2 6.02 55.1 55.1 5.94
3 6.02 55.1 55.1 6.06
[[1]]$`2`
# A tibble: 3 × 4
lon...1 lat...2 lon...3 lat...4
<dbl> <dbl> <dbl> <dbl>
1 6.02 55.0 55.0 6.11
2 6.02 55.0 54.9 6.10
3 6.02 55.0 55.0 5.93
[[1]]$`3`
# A tibble: 3 × 4
lon...1 lat...2 lon...3 lat...4
<dbl> <dbl> <dbl> <dbl>
1 6.02 54.9 54.9 5.94
2 6.02 54.9 55.0 5.99
3 6.02 54.9 54.8 6.09
[[1]]$`4`
# A tibble: 3 × 4
lon...1 lat...2 lon...3 lat...4
<dbl> <dbl> <dbl> <dbl>
1 6.02 54.8 54.7 5.96
2 6.02 54.8 54.9 6.05
3 6.02 54.8 54.9 5.98
[[1]]$`5`
# A tibble: 3 × 4
lon...1 lat...2 lon...3 lat...4
<dbl> <dbl> <dbl> <dbl>
1 6.02 54.7 54.7 5.93
2 6.02 54.7 54.7 5.93
3 6.02 54.7 54.8 5.99
[[1]]$`6`
# A tibble: 3 × 4
lon...1 lat...2 lon...3 lat...4
<dbl> <dbl> <dbl> <dbl>
1 6.02 54.6 54.6 6.10
2 6.02 54.6 54.6 5.93
3 6.02 54.6 54.5 6.02
[[2]]
[[2]]$`1`
# A tibble: 3 × 4
lon...1 lat...2 lon...3 lat...4
<dbl> <dbl> <dbl> <dbl>
1 6.02 55.1 55.1 5.90
2 6.02 55.1 55.0 5.91
3 6.02 55.1 55.0 5.98
[[2]]$`2`
# A tibble: 3 × 4
lon...1 lat...2 lon...3 lat...4
<dbl> <dbl> <dbl> <dbl>
1 6.02 55.0 55.1 5.99
2 6.02 55.0 55.0 5.93
3 6.02 55.0 54.9 6.05
[[2]]$`3`
# A tibble: 3 × 4
lon...1 lat...2 lon...3 lat...4
<dbl> <dbl> <dbl> <dbl>
1 6.02 54.9 54.8 6.06
2 6.02 54.9 54.8 6.10
3 6.02 54.9 54.9 5.91
[[2]]$`4`
# A tibble: 3 × 4
lon...1 lat...2 lon...3 lat...4
<dbl> <dbl> <dbl> <dbl>
1 6.02 54.8 54.8 6.13
2 6.02 54.8 54.8 6.12
3 6.02 54.8 54.8 5.92
[[2]]$`5`
# A tibble: 3 × 4
lon...1 lat...2 lon...3 lat...4
<dbl> <dbl> <dbl> <dbl>
1 6.02 54.7 54.6 6.10
2 6.02 54.7 54.8 6.09
3 6.02 54.7 54.7 6.15
[[2]]$`6`
# A tibble: 3 × 4
lon...1 lat...2 lon...3 lat...4
<dbl> <dbl> <dbl> <dbl>
1 6.02 54.6 54.6 6.14
2 6.02 54.6 54.5 5.91
3 6.02 54.6 54.6 5.89
A loop could do it (the function here is a simple addition):一个循环就可以做到(这里的 function 是一个简单的加法):
for(each_row in 1:nrow(source)) {
for(each_list in 1:length(dest)) {
dest[[each_list]][[each_row]][["lon"]] <- dest[[each_list]][[each_row]][["lon"]]+source[[each_row, "lon"]]
dest[[each_list]][[each_row]][["lat"]] <- dest[[each_list]][[each_row]][["lat"]]+source[[each_row, "lat"]]
}
}
Output: Output:
[[1]]
[[1]][[1]]
lon lat
1 61.06593 61.16865
2 61.11246 60.99513
3 61.16124 61.11704
[[1]][[2]]
lon lat
1 60.98430 61.07966
2 60.94512 61.06821
3 61.01607 60.90299
[[1]][[3]]
lon lat
1 60.93941 60.81840
2 60.98410 60.86776
3 60.83867 60.96635
[[1]][[4]]
lon lat
1 60.74759 60.74407
2 60.89411 60.84193
3 60.88878 60.76600
[[1]][[5]]
lon lat
1 60.73981 60.63168
2 60.72819 60.62984
3 60.80407 60.68712
[[1]][[6]]
lon lat
1 60.58167 60.70641
2 60.60657 60.54273
3 60.53976 60.63266
[[2]]
[[2]][[1]]
lon lat
1 61.07148 60.96030
2 61.04534 60.96759
3 60.98488 61.03675
[[2]][[2]]
lon lat
1 61.09593 60.95671
2 61.06912 60.89430
3 60.87546 61.01554
[[2]][[3]]
lon lat
1 60.77876 60.93890
2 60.78891 60.97593
3 60.96403 60.78732
[[2]][[4]]
lon lat
1 60.79891 60.92313
2 60.86749 60.90877
3 60.78326 60.70473
[[2]][[5]]
lon lat
1 60.63888 60.79951
2 60.80465 60.79040
3 60.71355 60.84778
[[2]][[6]]
lon lat
1 60.58931 60.74798
2 60.55992 60.52023
3 60.60920 60.49993
If I follow, each destination is someplace near the equator and each source is someplace in the north.如果我跟随,每个目的地都在赤道附近的某个地方,每个源头都在北方的某个地方。 For each destination block you want to add the source lat and long so you can do something like compute the distance between the two.对于每个目标块,您要添加源纬度和经度,以便您可以执行一些操作,例如计算两者之间的距离。
So the result should look something like:所以结果应该类似于:
> dest2[[1]][[1]]
lon lat lon_src lat_src
1 55.04467 6.110704 6.021258 55.05794
2 55.09120 5.937184 6.021258 55.05794
3 55.13998 6.059100 6.021258 55.05794
This code will accomplish this.此代码将完成此操作。 The code could be more efficient if you are working with a large data set.如果您处理的是大型数据集,代码可能会更有效率。
dest2 <- dest
addStart <- function(startRow, destElements, group) {
start <- source[startRow, ]
for (i in destElements) {
rows007 <- nrow(dest[[i]][[group]])
toadd = data.frame( matrix(rep(start, each = rows007), ncol = 2) )
names(toadd) = c("lon_src","lat_src")
dest2[[i]][[group]] <- cbind(dest[[i]][[group]],toadd)
}
return(dest2)
}
dest2 <- addStart(1, 1:2, 1)
dest2[[1]][[1]]
dest2[[2]][[1]]
dest2 <- addStart(2, 1:2, 2)
dest2[[1]][[2]]
dest2[[2]][[2]]
dest2 <- addStart(3, 1:2, 3)
dest2[[1]][[3]]
dest2[[2]][[3]]
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.