How to take a value from the (i+1)th column of one df and calculate its distance to every row in the (i+1)th column of a second df

Suppose I have the following two dataframes (with uneven rows):

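library(tidyverse)  # map2_df(), str_c(), rename_all(), %>%

set.seed(1999)
dfA <- data.frame(x = rpois(10,2), y = rpois(10,2), z = rpois(10,2), q = rpois(10,2), t = rpois(10,2))

set.seed(24)
dfB <- data.frame(a = rpois(10,2), b = rpois(10,2), c = rpois(10,2), d = rpois(10,2), e = rpois(10,2))

set.seed(10)
Dx <- sample.int(5)
set.seed(6)
Dy <- sample.int(5)

# Turn each length-5 vector into a 1-row, 5-column data frame.
# Assumption: transpose() is data.table's; the post does not say which package is meant.
Dx <- as.data.frame(Dx)
Dx <- as.data.frame(data.table::transpose(Dx))
Dy <- as.data.frame(Dy)
Dy <- as.data.frame(data.table::transpose(Dy))

# Paste matching columns into "x,y" strings and rename the columns C1..C5
dfAB <- map2_df(dfA, dfB, str_c, sep=",") %>%
  rename_all(~ str_c('C', seq_along(.)))
dfXY <- map2_df(Dx, Dy, str_c, sep=",") %>%
  rename_all(~ str_c('C', seq_along(.)))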

Now I have 2 datasets of coordinates (dfAB: 5 variables, each with 10 observations; dfXY: 5 variables with 1 observation).

What I would like to do is to find the distance between the observation of variable 1 of dfXY and every individual observation in variable 1 of dfAB, the distance between observation 1 of variable 2 of dfXY and every individual observation in variable 2 of dfAB, etc.

dfAB                          dfXY
3,1   3,2  ...       3,5  1,2  2,1  5,4  4,3   
2,1   3,1                  
2,3   1,2               
...   ...            

i.e. the distance between: a) 3,5 & 3,1 b) 3,5 & 2,1 c) 3,5 & 2,3 etc.
and the distance between: a) 1,2 & 3,2 b) 1,2 & 3,1 c) 1,2 & 1,2 etc.
and so on.

If the datasets had an equal number of observations I could use:

distances <- map2_df(
  dfAB,
  dfXY,
  ~ sqrt((.x$x - .y$x)^2 + (.x$y - .y$y)^2)
)

But since dfXY only has 1 observation (to be compared with repeatedly), this does not work. I think I need to use something like a for(i in seq_along()) loop, but I do not know how to incorporate the ~ sqrt((.x$x - .y$x)^2 + (.x$y - .y$y)^2) part:

distance <- for(i in seq_along(dfXY)){
  dfAB[,i] <- dfAB[,i] [WHAT TO PUT HERE]

Any help is much appreciated.

I'm having a bit of a hard time following what you're trying to do here, but I think you may be making things needlessly complicated for yourself.

For example, instead of nesting a map2() call inside an lapply() call, I think you can achieve pretty much the same result without iteration using bind_cols():

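library(tidyverse)

dfA <- tibble(x = rpois(10,2), y = rpois(10,2), z = rpois(10,2), q = rpois(10,2), t = rpois(10,2))
dfB <- tibble(x = rpois(10,2), y = rpois(10,2), z = rpois(10,2), q = rpois(10,2), t = rpois(10,2))

# Suffix dfB's names before binding, so the result does not depend on how
# bind_cols() repairs duplicate names (dplyr >= 1.0 gives x...1, x...6 rather than x, x1)
df_abt <- dfA %>%
  bind_cols(rename_with(dfB, ~ paste0(.x, "1"))) %>%
  select(x, x1, y, y1, z, z1, q, q1, t, t1)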

For dataframes C and D, you can use iteration with map() to avoid having to transpose them:

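# Name the list elements explicitly: newer dplyr versions no longer
# auto-name unnamed columns V1, V2, ... in bind_cols()
dfC <- map(1:5, ~ .x) %>% set_names(paste0("V", 1:5)) %>% bind_cols()
dfD <- map(11:15, ~ .x) %>% set_names(paste0("V", 1:5, "1")) %>% bind_cols()

df_cdt <- dfC %>%
  bind_cols(dfD) %>%
  select(V1, V11, V2, V21, V3, V31, V4, V41, V5, V51)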

(Actually, why not just store df_cdt as a vector? Is there a reason it needs to be a data frame?)
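
On the vector idea, here is a minimal base-R sketch, assuming the reference values sit in a plain numeric vector with one entry per column. The names `obs` and `ref` and the toy values are hypothetical, not taken from the question. `sweep()` subtracts `ref[j]` from every row of column `j`, so no one-row data frame is needed:

```r
# Hypothetical toy data: 3 observations of 2 numeric variables
obs <- data.frame(x = c(3, 2, 2), y = c(1, 1, 3))

# One reference value per column, stored as a named numeric vector
ref <- c(x = 3, y = 5)

# sweep() applies "-" column-wise (MARGIN = 2): column j minus ref[j]
offsets <- sweep(as.matrix(obs), 2, ref)

# Absolute per-column differences
abs_diff <- abs(offsets)
```

The result is a matrix with the same shape and column names as `obs`; wrap it in `as.data.frame()` if a data frame is needed downstream.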

As for distances, I reckon this should work:

df_dist <- map2_df(df_abt, df_cdt, ~ sqrt((.x - .y)^2))
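
Note that `sqrt((.x - .y)^2)` is just `abs(.x - .y)`, i.e. a difference per column. If what you want is the straight-line distance between coordinate pairs, something like the following may be closer. It is only a sketch under a few assumptions: the coordinates stay numeric instead of being pasted into "x,y" strings, x and y values live in two separate data frames with matching column names, and the reference points are plain vectors. The names `ax`, `ay`, `ref_x` and `ref_y` are hypothetical; the toy values mirror the example in the question.

```r
# Hypothetical toy data: x coordinates in ax, y coordinates in ay
ax <- data.frame(C1 = c(3, 2, 2), C2 = c(3, 3, 1))
ay <- data.frame(C1 = c(1, 1, 3), C2 = c(2, 1, 2))

# One reference point per variable: (3, 5) for C1 and (1, 2) for C2
ref_x <- c(3, 1)
ref_y <- c(5, 2)

# Map() walks the columns and the reference values in parallel;
# each length-1 reference value is recycled against the whole column.
distances <- as.data.frame(Map(
  function(x, y, rx, ry) sqrt((x - rx)^2 + (y - ry)^2),
  ax, ay, ref_x, ref_y
))
```

Because the single reference value is recycled, the number of rows in `ax` and `ay` does not have to match the number of reference points; only the number of columns does.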

If you have an unequal number of rows in df_abt, why not just pad out the missing rows with NAs? I mean, R won't let you build a data frame with columns of different lengths anyway.
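
For the padding, a small base-R sketch could look like this. The helper `pad_to()` is a hypothetical name, not a function from any package; it relies on the fact that assigning a larger `length()` to a vector fills the new positions with NA.

```r
# Hypothetical helper: extend a vector to length n, filling with NA
pad_to <- function(v, n) {
  length(v) <- n
  v
}

long  <- 1:5
short <- c(3, 1, 4)

# Both columns now have 5 rows, so data.frame() accepts them
padded <- data.frame(long = long, short = pad_to(short, length(long)))
```

Any arithmetic on the padded rows then yields NA, which keeps the missing observations visible instead of silently recycling values.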
