How to find the distance between two data frames

I have two data frames, the first one:

first <- rnorm(3)
second <- rnorm(3)
third <- rnorm(3)

MainNumbers = data.frame(first,second,third)

And the second one:

Operator1= rnorm(50)
Operator2= rnorm(50)

MainOperator= data.frame(Operator1,Operator2)

And by using the Euclidean distance:

euc.dist <- function(x1, x2) sqrt(sum((x1 - x2) ^ 2))
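
As a quick sanity check on the function above, here is a minimal sketch with two hand-picked points (the points and the names d_manual and d_builtin are illustrative assumptions, not part of the original data). It compares euc.dist with base R's dist() on the same pair:

```r
euc.dist <- function(x1, x2) sqrt(sum((x1 - x2) ^ 2))

# 3-4-5 right triangle: the distance between (0, 0) and (3, 4) is 5
d_manual <- euc.dist(c(0, 0), c(3, 4))

# dist() on the two points stacked as rows should give the same number
d_builtin <- as.numeric(dist(rbind(c(0, 0), c(3, 4))))

print(c(manual = d_manual, builtin = d_builtin))
```

Note that x1 and x2 are expected to have the same length (or one of them length 1); otherwise R recycles the shorter vector.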

I would like to see if it's possible to do this:

  1. Every value in a row of MainNumbers should be the x1 value in euc.dist.

  2. Every value in a row of MainOperator should be the x2 value in euc.dist.

  3. For each x1, calculate euc.dist against every x2. MainNumbers has fewer values than MainOperator, so I should take the first row of MainNumbers as x1 and every value in MainOperator as x2. The result should be a column made up only of the distances between the first row of MainNumbers and every number in MainOperator. The second column should hold the result for the second row of MainNumbers as x1, again with every value in MainOperator as x2. The same goes for the third row and every value in MainOperator.

  4. The final result should be a data frame with 3 columns filled with euc.dist values computed from the two data frames, where each column is the result of one row of MainNumbers against every value of MainOperator.
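
One possible reading of the four steps, sketched with sapply. This assumes "every value in MainOperator" means each of the 100 individual numbers, so x2 is a single number that R recycles against the 3-element x1. The seed, the name operator_values, and the column names row1..row3 are assumptions made only for illustration:

```r
set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible

MainNumbers <- data.frame(first = rnorm(3), second = rnorm(3), third = rnorm(3))
MainOperator <- data.frame(Operator1 = rnorm(50), Operator2 = rnorm(50))

euc.dist <- function(x1, x2) sqrt(sum((x1 - x2) ^ 2))

# Flatten MainOperator column by column into 100 single values
operator_values <- unlist(MainOperator, use.names = FALSE)

# Outer sapply: one column per row of MainNumbers
# Inner sapply: one distance per single value of MainOperator
result <- as.data.frame(
  sapply(seq_len(nrow(MainNumbers)), function(i) {
    x1 <- as.numeric(MainNumbers[i, ])
    sapply(operator_values, function(x2) euc.dist(x1, x2))
  })
)
names(result) <- paste0("row", seq_len(nrow(MainNumbers)))

dim(result)
```

If x2 is instead meant to be a whole row of MainOperator, the lengths differ (3 vs 2) and the Euclidean distance is not well defined without first deciding how to align the two vectors.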

Thanks a lot!

EDIT: Here are my attempts to solve this:

I have a vague idea of how to solve this, and this is the pseudocode I had in mind:

define an empty data frame
for each row i of MainNumbers, use it as x1
  for each row j of MainOperator, use it as x2
    calculate euc.dist(x1, x2)
  save the results in the data frame as a column
repeat for each row i of MainNumbers

Thus far I am stuck on the for loops: I don't know how to put each value into x1 and calculate euc.dist against every value in the second data frame. I use this code:

for (i in 1:nrow(MainNumbers)) {

  x=as.numeric(as.vector(MainNumbers[i,]))

}

With this I can get the first row as a numeric vector, which I can then use as a variable in the euc.dist formula. But I do not know how to take every row and use it as a separate numeric vector. I am not really good with complicated for loops, to be honest.
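
For reference, the loop can be completed along the lines of the pseudocode by storing each result as it is computed instead of overwriting x. This is only a sketch: it assumes x2 is each single number in MainOperator (taken column by column), and the seed and the names x2_values, distances and loop_result are made up for the example:

```r
set.seed(42)  # assumption: fixed seed, only so the sketch is reproducible

MainNumbers <- data.frame(first = rnorm(3), second = rnorm(3), third = rnorm(3))
MainOperator <- data.frame(Operator1 = rnorm(50), Operator2 = rnorm(50))

euc.dist <- function(x1, x2) sqrt(sum((x1 - x2) ^ 2))

# Assumption: x2 is each single number in MainOperator
x2_values <- unlist(MainOperator, use.names = FALSE)

# Preallocate: one row per MainOperator value, one column per MainNumbers row
distances <- matrix(NA_real_, nrow = length(x2_values), ncol = nrow(MainNumbers))

for (i in seq_len(nrow(MainNumbers))) {
  x1 <- as.numeric(MainNumbers[i, ])  # row i as a plain numeric vector
  for (j in seq_along(x2_values)) {
    distances[j, i] <- euc.dist(x1, x2_values[j])
  }
}

loop_result <- as.data.frame(distances)
head(loop_result)
```

Preallocating the matrix and filling it by index avoids growing a data frame inside the loop, which tends to be slow and error-prone in R.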

I found it myself:

# For each row of MainOperator, compute its distance to every row of MainNumbers
apply(MainOperator, 1, function(op_row)
  apply(MainNumbers, 1, function(num_row, op_row)
    dist(rbind(num_row, op_row)), op_row))
