How to split a data frame column that holds a list into multiple columns in R?

I have a data frame with a column called coordinates that holds the latitude and longitude coordinates for each address. I want to split that column into two columns called lat and long rather than keep one column called coordinates.

I have the following data:

vsn                             address         coordinates
53 079 Ashland Ave & Elston Ave Chicago IL -87.66826, 41.91873
76 097     Pulaski Rd & 71st St Chicago IL -87.72242, 41.76412
84 0A3  Long Ave & Lawrence Ave Chicago IL -87.76257, 41.96759

The coordinates column contains a list. I need to transform the data into the following:

vsn                             address        Lat       Lon  
53 079 Ashland Ave & Elston Ave Chicago IL -87.66826 41.91873
76 097     Pulaski Rd & 71st St Chicago IL -87.72242 41.76412
84 0A3  Long Ave & Lawrence Ave Chicago IL -87.76257 41.96759

I don't know how to extract the data because the column is itself a data frame; its structure is shown below.

Output of dput(data$coordinates):

structure(list(coordinates = list(c(-87.668257, 41.918733), c(-87.72242, 
41.764122), c(-87.76257, 41.96759))), row.names = c(53L, 76L, 
84L), class = "data.frame")

Try this:

   > library(splitstackshape)
   > cSplit(dt, "coordinates", sep = ",")

Note: cSplit() can also trim whitespace while splitting.

Check ?cSplit for more help.

We can use separate() from tidyr:

> library(tidyverse)
> dat %>% 
    separate(coordinates, c("Lat", "Lon"), sep = ",") %>% 
    mutate(Lat = as.numeric(Lat),
           Lon = as.numeric(Lon))
# A tibble: 3 x 4
  vsn    address                               Lat   Lon
  <chr>  <chr>                               <dbl> <dbl>
1 53 079 Ashland Ave & Elston Ave Chicago IL -87.7  41.9
2 76 097 Pulaski Rd & 71st St Chicago IL     -87.7  41.8
3 84 0A3 Long Ave & Lawrence Ave Chicago IL  -87.8  42.0

Update

Given the updated version of your question, here's a base R solution:

> out <- as.data.frame(do.call(rbind, dat$coordinates))
> names(out) <- c("Lat", "Lon")
> out
        Lat      Lon
1 -87.66826 41.91873
2 -87.72242 41.76412
3 -87.76257 41.96759
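
To keep the other columns alongside the new ones, the same do.call(rbind, ...) idea can be extended as in the sketch below. The data frame dat is rebuilt here from the values in the question so that the snippet is self-contained; the name result is only illustrative, and the column names Lat and Lon follow the desired output shown in the question.

```r
# Example data mirroring the question: `coordinates` is a list column
dat <- data.frame(
  vsn = c("079", "097", "0A3"),
  address = c("Ashland Ave & Elston Ave Chicago IL",
              "Pulaski Rd & 71st St Chicago IL",
              "Long Ave & Lawrence Ave Chicago IL"),
  stringsAsFactors = FALSE
)
dat$coordinates <- list(c(-87.668257, 41.918733),
                        c(-87.72242, 41.764122),
                        c(-87.76257, 41.96759))

# Stack the list elements into a 3 x 2 numeric matrix
coords <- do.call(rbind, dat$coordinates)
colnames(coords) <- c("Lat", "Lon")

# Drop the list column and attach the two new columns
result <- cbind(dat[setdiff(names(dat), "coordinates")], coords)
result
```

This assumes every list element has exactly two values; elements of differing length would make rbind() recycle values with a warning.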

Because the input data was not shown reproducibly in the question, it is unclear whether the coordinates column is actually a list, as stated, or a column of comma-separated character strings. The Note at the end constructs both cases reproducibly, and here we show how to handle each:

coordinates is a column of character strings

library(dplyr)
library(tidyr)

DFstring %>%
  separate(coordinates, c("Lat", "Lon"), sep = ", ", convert = TRUE)

giving:

  vsn                             address       Lat      Lon
1 079 Ashland Ave & Elston Ave Chicago IL -87.66826 41.91873
2 097     Pulaski Rd & 71st St Chicago IL -87.72242 41.76412
3 0A3  Long Ave & Lawrence Ave Chicago IL -87.76257 41.96759
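
If tidyr is not available, the character-string case can also be handled in base R. The following is a minimal sketch, not part of the tidyr solution above; DFstring is rebuilt inline (without the address column, for brevity) and the names parts, coords and result are illustrative.

```r
# Example data: `coordinates` is a column of comma-separated strings
DFstring <- data.frame(
  vsn = c("079", "097", "0A3"),
  coordinates = c("-87.66826, 41.91873",
                  "-87.72242, 41.76412",
                  "-87.76257, 41.96759"),
  stringsAsFactors = FALSE
)

# Split on the comma plus any following whitespace
parts <- strsplit(DFstring$coordinates, ",\\s*")

# Convert each pair to numeric and stack the pairs into a matrix
coords <- do.call(rbind, lapply(parts, as.numeric))
colnames(coords) <- c("Lat", "Lon")

# Replace the string column with the two numeric columns
result <- cbind(DFstring[setdiff(names(DFstring), "coordinates")], coords)
result
```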

coordinates column is a list

library(dplyr)

DFlist %>%
  rowwise %>%
  mutate(Lat = as.numeric(coordinates[1]), Lon = as.numeric(coordinates[2])) %>%
  ungroup %>%
  select(-coordinates)

giving:

# A tibble: 3 x 4
  vsn   address                               Lat   Lon
  <chr> <chr>                               <dbl> <dbl>
1 079   Ashland Ave & Elston Ave Chicago IL -87.7  41.9
2 097   Pulaski Rd & 71st St Chicago IL     -87.7  41.8
3 0A3   Long Ave & Lawrence Ave Chicago IL  -87.8  42.0
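
A base R equivalent of the rowwise approach might look like the sketch below, which pulls the first and second element out of each list entry with vapply(). DFlist is rebuilt inline in the same way as in the Note (again without the address column), so its elements are character vectors and as.numeric() is still needed.

```r
# Example data: `coordinates` is a list column of character vectors
DFlist <- data.frame(vsn = c("079", "097", "0A3"), stringsAsFactors = FALSE)
DFlist$coordinates <- strsplit(
  c("-87.66826, 41.91873", "-87.72242, 41.76412", "-87.76257, 41.96759"),
  ", "
)

# Extract one element per row; numeric(1) makes vapply() check the result type
DFlist$Lat <- vapply(DFlist$coordinates, function(x) as.numeric(x[1]), numeric(1))
DFlist$Lon <- vapply(DFlist$coordinates, function(x) as.numeric(x[2]), numeric(1))

# Remove the original list column
DFlist$coordinates <- NULL
DFlist
```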

Note

Lines <- "vsn;address;coordinates
079;Ashland Ave & Elston Ave Chicago IL;-87.66826, 41.91873
097;Pulaski Rd & 71st St Chicago IL;-87.72242, 41.76412
0A3;Long Ave & Lawrence Ave Chicago IL;-87.76257, 41.96759"

DFstring <- read.table(text = Lines, header = TRUE, sep = ";", as.is = TRUE,
  strip.white = TRUE)

DFlist <- DFstring
DFlist$coordinates <- strsplit(DFstring$coordinates, ", ")
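
To find out which of the two cases applies to your own data, a small check along these lines may help. The helper describe_column() is hypothetical (it is not from any package), and the two example data frames are rebuilt inline. The data frame test comes first because a data frame is itself a list.

```r
# Hypothetical helper: report how a column is stored
describe_column <- function(x) {
  if (is.data.frame(x)) {
    "data frame"
  } else if (is.list(x)) {
    "list"
  } else if (is.character(x)) {
    "character"
  } else {
    class(x)[1]
  }
}

DFstring <- data.frame(
  coordinates = c("-87.66826, 41.91873", "-87.72242, 41.76412"),
  stringsAsFactors = FALSE
)
DFlist <- DFstring
DFlist$coordinates <- strsplit(DFstring$coordinates, ", ")

describe_column(DFstring$coordinates)  # "character"
describe_column(DFlist$coordinates)    # "list"
```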

Update

Note that the code already posted above works with the dput output that was added to the question.

DF <-
structure(list(coordinates = list(c(-87.668257, 41.918733), c(-87.72242, 
41.764122), c(-87.76257, 41.96759))), row.names = c(53L, 76L, 
84L), class = "data.frame")

# same as code above except we use DF as the input
DF %>%
  rowwise %>%
  mutate(Lat = as.numeric(coordinates[1]), Lon = as.numeric(coordinates[2])) %>%
  ungroup %>%
  select(-coordinates)

giving:

# A tibble: 3 x 2
    Lat   Lon
  <dbl> <dbl>
1 -87.7  41.9
2 -87.7  41.8
3 -87.8  42.0

One possibility: use map_df() to split the list column into separate columns, then cbind() the result to the original data frame.

library(dplyr)
library(purrr)

# Example data (tibble() replaces the deprecated data_frame())
X <- tibble(
    vsn = c(53, 76, 84),
    coordinates = map(1:3, ~ as.list(rnorm(2)))
)

# Create a new data frame from the list column
purrr::map_df(X$coordinates, ~ tibble(Lat = .x[[1]], Lon = .x[[2]]))
# A tibble: 3 x 2
    Lat   Lon
  <dbl> <dbl>
1 -1.03 1.45 
2 -1.17 0.794
3  2.06 0.646

Then use cbind() to combine it with the original data frame:

cbind(X, purrr::map_df(X$coordinates, ~ tibble(Lat = .x[[1]], Lon = .x[[2]])))
  vsn           coordinates       Lat       Lon
1  53   -1.034076, 1.451652 -1.034076 1.4516519
2  76 -1.1738099, 0.7943916 -1.173810 0.7943916
3  84  2.0586963, 0.6462277  2.058696 0.6462277
