
Extract (long, lat) from a long address string in R

4517 bessie dr dallas, tx 75211 (32.728761, -96.895678)
3700 ross ave dallas, tx 75204 (32.797677, -96.786384)

I have a column in a data frame with values like those listed above. I want to create two new fields, long and lat, that hold the values between the parentheses.

This is what I have so far:

data$longlat<-str_split(data$geocoded_column,sub("\\(.*", "", data$geocoded_column))
data$longlat<-str_sub(data$longlat,start=9)
which gives me:
32.728761, -96.895678)"
32.797677, -96.786384)"

You can extract the values using stringr and lookaround assertions:

library(stringr)
str_extract_all(x, "(?<=\\()[^(]+(?=\\))")
[[1]]
[1] "32.728761, -96.895678" "32.797677, -96.786384"
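
The question stores one address per row rather than a single long string. A minimal sketch of the same lookaround applied to a character vector follows. Here `geocoded_column` is a hypothetical stand-in for the asker's `data$geocoded_column`, and `str_extract()` is used instead of `str_extract_all()` on the assumption that each element contains exactly one parenthesised pair.

```r
library(stringr)

# Hypothetical stand-in for data$geocoded_column from the question
geocoded_column <- c(
  "4517 bessie dr dallas, tx 75211 (32.728761, -96.895678)",
  "3700 ross ave dallas, tx 75204 (32.797677, -96.786384)"
)

# str_extract() returns the first match per element, so the result has
# one entry per row and can be assigned straight back to a new column.
# The pattern matches the text between "(" and ")" without the parentheses.
coords <- str_extract(geocoded_column, "(?<=\\()[^()]+(?=\\))")
coords
```

Rows without a parenthesised pair would come back as `NA`, which may be worth checking before splitting further.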

To get the values into a data frame:

# The first number in each pair is the latitude, the second the longitude
df <- data.frame(
  lat = unlist(str_extract_all(x, "(?<=\\()[^(,]+(?=,.*\\))")),
  long = unlist(str_extract_all(x, "(?<=, )[^(]+(?=\\))"))
)
df
        lat       long
1 32.728761 -96.895678
2 32.797677 -96.786384
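
The columns built this way are character strings. If numeric coordinates are needed, one possible variation is to use capture groups with `str_match_all()` and convert the result. This is only a sketch: the names `m` and `coords` are illustrative, and the first number is treated as the latitude on the assumption that the pairs are (lat, long), which fits Dallas at roughly 32.8 degrees north.

```r
library(stringr)

x <- "4517 bessie dr dallas, tx 75211 (32.728761, -96.895678) 3700 ross ave dallas, tx 75204 (32.797677, -96.786384)"

# str_match_all() returns one matrix per input string: column 1 holds the
# full match, columns 2 and 3 hold the two capture groups
m <- str_match_all(x, "\\((-?\\d+\\.\\d+),\\s*(-?\\d+\\.\\d+)\\)")[[1]]

# Assumption: first number is latitude, second is longitude
coords <- data.frame(
  lat = as.numeric(m[, 2]),
  lon = as.numeric(m[, 3])
)
str(coords)
```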

Data:

x <- "4517 bessie dr dallas, tx 75211 (32.728761, -96.895678) 3700 ross ave dallas, tx 75204 (32.797677, -96.786384)"

Does this work?

library(dplyr)
library(tidyr)
df <- data.frame(c1 = c('4517 bessie dr dallas, tx 75211 (32.728761, -96.895678)','3700 ross ave dallas, tx 75204 (32.797677, -96.786384)'))
df
                                                       c1
1 4517 bessie dr dallas, tx 75211 (32.728761, -96.895678)
2  3700 ross ave dallas, tx 75204 (32.797677, -96.786384)
df %>% extract(col = c1, into = c('lat', 'lon'), regex = '(-?\\d+\\.\\d+), (-?\\d+\\.\\d+)', remove = FALSE)
                                                       c1       lat        lon
1 4517 bessie dr dallas, tx 75211 (32.728761, -96.895678) 32.728761 -96.895678
2  3700 ross ave dallas, tx 75204 (32.797677, -96.786384) 32.797677 -96.786384
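
As with any regex extraction, the new columns are character by default. A hedged variation, assuming numeric columns are wanted, is to pass `convert = TRUE` so that `tidyr::extract()` runs type conversion on the results. The regex here is also anchored to the parentheses, which is an extra assumption about the input format and not part of the original call; `out` is an illustrative name.

```r
library(tidyr)

df <- data.frame(c1 = c(
  "4517 bessie dr dallas, tx 75211 (32.728761, -96.895678)",
  "3700 ross ave dallas, tx 75204 (32.797677, -96.786384)"
))

# Calling tidyr::extract() with its namespace avoids clashes with other
# packages that also export a function named extract()
out <- tidyr::extract(
  df,
  col = c1,
  into = c("lat", "lon"),
  regex = "\\((-?\\d+\\.\\d+), (-?\\d+\\.\\d+)\\)",
  remove = FALSE,   # keep the original address column
  convert = TRUE    # convert the extracted strings to numeric
)
str(out)
```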
 
