Extract (long, lat) from a long address string in R
4517 bessie dr dallas, tx 75211 (32.728761, -96.895678)
3700 ross ave dallas, tx 75204 (32.797677, -96.786384)
I have a column in a data frame with values like those listed above. I want to create two new fields, long and lat, holding the two values inside the parentheses.
This is what I have so far:
data$longlat<-str_split(data$geocoded_column,sub("\\(.*", "", data$geocoded_column))
data$longlat<-str_sub(data$longlat,start=9)
which gives me
32.728761, -96.895678)"
32.797677, -96.786384)"
You can extract the values using stringr and lookaround:
library(stringr)
str_extract_all(x, "(?<=\\()[^(]+(?=\\))")
[[1]]
[1] "32.728761, -96.895678" "32.797677, -96.786384"
To get the values into a data frame:
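# Column names follow the question; note that the first number of each pair
# (32.7...) is geographically the latitude and the second the longitude.
df <- data.frame(
  long = unlist(str_extract_all(x, "(?<=\\()[^(,]+(?=,.*\\))")),  # number right after "("
  lat = unlist(str_extract_all(x, "(?<=, )[^(]+(?=\\))"))         # number right before ")"
)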
df
long lat
1 32.728761 -96.895678
2 32.797677 -96.786384
Data:
x <- "4517 bessie dr dallas, tx 75211 (32.728761, -96.895678) 3700 ross ave dallas, tx 75204 (32.797677, -96.786384)"
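The snippet above works on a single string and returns character values. When the addresses sit in a column, one per row, capture groups may be easier to apply than lookaround. The following is only a sketch under that assumption; the names `addr`, `m`, `coords`, `first` and `second` are made up for illustration and do not come from the question.

```r
library(stringr)

# Hypothetical vector of address strings, one per row (values from the question)
addr <- c(
  "4517 bessie dr dallas, tx 75211 (32.728761, -96.895678)",
  "3700 ross ave dallas, tx 75204 (32.797677, -96.786384)"
)

# Two capture groups: the number before the comma and the number after it.
# str_match() returns a character matrix: column 1 is the full match,
# columns 2 and 3 are the capture groups.
m <- str_match(addr, "\\((-?[0-9.]+),\\s*(-?[0-9.]+)\\)")

coords <- data.frame(
  first  = as.numeric(m[, 2]),  # number before the comma
  second = as.numeric(m[, 3])   # number after the comma
)
```

Rows that do not contain a parenthesised pair should come back as NA rather than raising an error, which keeps the result aligned with the input rows.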
Does this work?
library(dplyr)
library(tidyr)
df <- data.frame(c1 = c('4517 bessie dr dallas, tx 75211 (32.728761, -96.895678)','3700 ross ave dallas, tx 75204 (32.797677, -96.786384)'))
df
c1
1 4517 bessie dr dallas, tx 75211 (32.728761, -96.895678)
2 3700 ross ave dallas, tx 75204 (32.797677, -96.786384)
df %>% extract(col = c1, into = c('lat','lon'), regex = '(-?\\d+\\.\\d+), (-?\\d+\\.\\d+)', remove = FALSE)
c1 lat lon
1 4517 bessie dr dallas, tx 75211 (32.728761, -96.895678) 32.728761 -96.895678
2 3700 ross ave dallas, tx 75204 (32.797677, -96.786384) 32.797677 -96.786384
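If adding packages is not an option, roughly the same result can be had in base R with `regexec()` and `regmatches()`. This is a minimal sketch rather than a drop-in replacement: it reuses the `df` and `c1` names from the example above, assumes both numbers always contain a decimal point, and converts the results to numeric.

```r
# Example data copied from the question; `df` and `c1` mirror the example above
df <- data.frame(
  c1 = c(
    "4517 bessie dr dallas, tx 75211 (32.728761, -96.895678)",
    "3700 ross ave dallas, tx 75204 (32.797677, -96.786384)"
  ),
  stringsAsFactors = FALSE
)

# regexec() records the full match plus each capture group, per row
pattern <- "\\((-?[0-9]+\\.[0-9]+), (-?[0-9]+\\.[0-9]+)\\)"
parts <- regmatches(df$c1, regexec(pattern, df$c1))

# Element 1 of each result is the full match; 2 and 3 are the two numbers.
# Rows without a match give character(0), so indexing yields NA there.
df$lat <- as.numeric(vapply(parts, function(p) p[2], character(1)))
df$lon <- as.numeric(vapply(parts, function(p) p[3], character(1)))
```

Anchoring the pattern on the parentheses avoids accidentally picking up other numbers in the address, such as the street number or ZIP code.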