
How to combine two columns with an array

DF_1 lists origin and destination cities, and I want to know the distance (miles / km) between each pair. DF_2 holds the distances between cities. How can I look up the distance for each row of DF_1 using these two data frames?

DF_1 :

origin <- c('LONDON','NEW YORK','TOKIO','LONDON','RIO DE JANEIRO')
destination <- c('NEW YORK','NEW YORK','RIO DE JANEIRO','LISBON','MADRID')
DF_1 <- data.frame(origin,destination)

DF_2 :

CITY <- c('NEW YORK', 'LONDON', 'SAN FRANCISCO', 'MADRID',  'LOS ANGELES', 'LISBON', 'RIO DE JANEIRO', 'MOSCOW',  'SAO PAULO', 'TOKIO')
NEW_YORK <- c(0, 700, 250, 1000, 400, 800, 430, 900, 500, 30) 
LONDON <- c(700, 0, 350, 1200, 50, 110, 780, 984, 1150, 5)
SAN_FRANCISCO <- c(250, 350, 0, 200, 15, 260, 305, 412, 29, 102)
MADRID <- c(1000, 1200, 200, 0, 77, 115, 225, 318, 412, 511)
LOS_ANGELES <- c(400, 50, 15, 77, 0, 88, 819, 733, 978, 1001)
LISBON <- c(800, 110, 260, 115, 88, 0, 17, 3000, 1418, 735)
RIO_DE_JANEIRO <- c(430, 780, 305, 225, 819, 17, 0, 513, 701, 56) 
MOSCOW <- c(900, 984, 412, 318, 733, 3000, 513, 0, 389, 499)
SAO_PAULO <- c(500, 1150, 29, 412, 978, 1418, 701, 389, 0, 1113)
TOKIO <- c(30, 5, 102, 511, 1001, 735, 56, 499, 1113, 0)
DF_2 <- data.frame(CITY, `NEW YORK` = NEW_YORK, LONDON, `SAN FRANCISCO` = SAN_FRANCISCO, MADRID,  `LOS ANGELES` = LOS_ANGELES, LISBON, `RIO DE JANEIRO` = RIO_DE_JANEIRO, MOSCOW,  `SAO PAULO` = SAO_PAULO, TOKIO, check.names = FALSE)

The result I want is this:

origin <- c('LONDON','NEW YORK','TOKIO','LONDON','RIO DE JANEIRO')
destination <- c('NEW YORK','NEW YORK','RIO DE JANEIRO','LISBON','MADRID')
distance <- c(700,0,56,110,225)
DF_FINAL <- data.frame(origin,destination,distance)

Using base R, you could use:

transform(DF_1,distance = `rownames<-`(DF_2[,-1],DF_2[,1])[as.matrix(DF_1)])

          origin    destination distance
1         LONDON       NEW YORK      700
2       NEW YORK       NEW YORK        0
3          TOKIO RIO DE JANEIRO       56
4         LONDON         LISBON      110
5 RIO DE JANEIRO         MADRID      225

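That is, create a new data frame that uses the city names as row names, then index it with the two columns of DF_1 as (row, column) pairs:

DF_3 <- DF_2[,-1]  # drop the CITY column
rownames(DF_3) <- DF_2$CITY  # use the city names as row names
DF_1$distance <- DF_3[as.matrix(DF_1)]  # one lookup per (origin, destination) row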
DF_1
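To see why this works, it may help to look at matrix indexing in isolation. The sketch below is only an illustration: the three-city matrix `d`, its values, and the `pairs` matrix are made up and are not part of the question's data. It shows that indexing a matrix with a two-column character matrix returns one element per row, matched against the row and column names.

```r
# Toy distance matrix with city names as both row and column names
# (made-up values, for illustration only).
cities <- c("A", "B", "C")
d <- matrix(c( 0, 10, 20,
              10,  0, 30,
              20, 30,  0),
            nrow = 3, byrow = TRUE,
            dimnames = list(cities, cities))

# Each row of a two-column character matrix is one (row name, column name) pair.
pairs <- cbind(origin      = c("A", "C", "B"),
               destination = c("B", "C", "C"))

# Indexing with that matrix returns one element per row of `pairs`.
res <- d[pairs]
print(res)
```

One caveat with this style of lookup: a name in `pairs` that is missing from the dimnames raises a "subscript out of bounds" error rather than returning NA.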

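Here is an option with row/column indexing from base R:

i1 <- match(DF_1$origin, DF_2$CITY)  # row position of each origin
j1 <- match(DF_1$destination, names(DF_2)[-1])  # column position of each destination (CITY column excluded)
DF_1$distance <- DF_2[-1][cbind(i1, j1)]  # pick one cell per (row, column) pair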
DF_1
#          origin    destination distance
#1         LONDON       NEW YORK      700
#2       NEW YORK       NEW YORK        0
#3          TOKIO RIO DE JANEIRO       56
#4         LONDON         LISBON      110
#5 RIO DE JANEIRO         MADRID      225
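A side effect of the `match()` approach is that a city missing from the distance table yields NA instead of an error, because an NA in an index matrix produces an NA in the result. The sketch below illustrates this under assumptions: `lookup_distance()`, `tbl`, and `trips` are hypothetical names with made-up values, not part of the question or the answer.

```r
# Hypothetical helper (not from the answer): look up distances, returning NA
# for any city that is absent from the distance table.
lookup_distance <- function(pairs, dist_table, city_col = "CITY") {
  # Numeric matrix of distances, without the city-name column.
  m <- as.matrix(dist_table[setdiff(names(dist_table), city_col)])
  i <- match(pairs$origin, dist_table[[city_col]])  # row positions (NA if unknown)
  j <- match(pairs$destination, colnames(m))        # column positions (NA if unknown)
  m[cbind(i, j)]
}

# Made-up two-city table and three trips; "Z" is deliberately not in the table.
tbl <- data.frame(CITY = c("A", "B"), A = c(0, 5), B = c(5, 0),
                  stringsAsFactors = FALSE)
trips <- data.frame(origin = c("A", "B", "Z"),
                    destination = c("B", "B", "A"),
                    stringsAsFactors = FALSE)

out <- lookup_distance(trips, tbl)
print(out)
```

The last trip starts from a city that is not in `tbl`, so its distance comes back as NA while the other two are looked up normally.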

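This should reproduce the result you're looking for, although the rows come out in a different order (using the tidyverse):

library(dplyr)
library(tidyr)

DF_FINAL <- DF_1 %>%
  inner_join(DF_2, by = c("origin" = "CITY")) %>%  # attach every distance column to each trip
  gather(key = "city", value = "distance", -origin, -destination) %>%  # wide to long
  filter(destination == city) %>%  # keep only the column that matches the destination
  select(-c(city))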

DF_FINAL
|origin         |destination    | distance|
|:--------------|:--------------|--------:|
|LONDON         |NEW YORK       |      700|
|NEW YORK       |NEW YORK       |        0|
|RIO DE JANEIRO |MADRID         |      225|
|LONDON         |LISBON         |      110|
|TOKIO          |RIO DE JANEIRO |       56|

I try to do this kind of thing in the tidyverse framework. The first step is to turn the matrix of distances into "long" format. Then just join that to the original data.frame.

I suggest adding stringsAsFactors = FALSE to your data.frame() definitions to avoid warning messages when joining (from R 4.0.0 onward this is already the default).

library(tidyr)
library(dplyr)

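pivot_longer(DF_2, -CITY) %>%  # wide to long: columns CITY, name, value
  rename(origin = CITY, destination = name, distance = value) %>%
  right_join(DF_1, by = c("origin", "destination"))  # keep every row of DF_1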

# A tibble: 5 x 3
  origin         destination    distance
  <chr>          <chr>             <dbl>
1 LONDON         NEW YORK            700
2 NEW YORK       NEW YORK              0
3 TOKIO          RIO DE JANEIRO       56
4 LONDON         LISBON              110
5 RIO DE JANEIRO MADRID              225
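The same reshape-then-join idea can also be sketched without extra packages. This is a swapped-in base R variant, not the answer's method: it uses `as.table()` and `as.data.frame()` for the wide-to-long step and `merge()` for the join. The names `tbl`, `trips`, and `long` and their values are made up for illustration.

```r
# Made-up two-city distance table and two trips.
tbl <- data.frame(CITY = c("A", "B"), A = c(0, 5), B = c(5, 0),
                  stringsAsFactors = FALSE)
trips <- data.frame(origin = c("B", "A"), destination = c("A", "A"),
                    stringsAsFactors = FALSE)

# Wide -> long: one row per (origin, destination) pair.
m <- as.matrix(tbl[-1])
rownames(m) <- tbl$CITY
long <- as.data.frame(as.table(m), stringsAsFactors = FALSE)
names(long) <- c("origin", "destination", "distance")

# Join the long table onto the trips; all.x = TRUE keeps unmatched trips as NA.
# merge() sorts by the key columns, so the row order may differ from `trips`.
res <- merge(trips, long, by = c("origin", "destination"), all.x = TRUE)
print(res)
```

Because `merge()` reorders rows, add an index column before merging and sort by it afterwards if the original trip order matters.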

