Separating address column into multiple columns in R
I have a dataset with an address column containing 400 records. I want to split this column into multiple columns.
Sample data
Full_Address = c("1111 Harding St Hollywood, FL 33024",
"2222 W Broward Blvd Plantation, 33317",
"3333 SW 74 Ave Davie, 33314",
"4444 Thomas Street Hollywood, FL 33024",
"11111 Lake Road (SW 12 Street) Davie, 33325",
"555 Bryan Blvd Plantation, 33317",
"5555 NW 71 Ter Parkland, 33067",
"7777 N Oakland Forest Dr Oakland Park, 33309",
"888 Some Ave Pines Pembroke Pines, 33346",
"9999 Some Blvd Hallandale Beach, 33365",
"4440 Some 123 Ave Pompano Beach, 33389")
Desired columns
ID = c("1111",
"2222",
"3333",
"4444",
"11111",
"555",
"5555",
"7777",
"888",
"9999",
"4440")
Street_Address = c("Harding St",
"W Broward Blvd",
"SW 74 Ave",
"Thomas Street",
"Lake Road (SW 12 Street)",
"Bryan Blvd",
"NW 71 Ter",
"N Oakland Forest Dr",
"Some Ave Pines",
"Some Blvd",
"Some 123 Ave")
City = c("Hollywood",
"Plantation",
"Davie",
"Hollywood",
"Davie",
"Plantation",
"Parkland",
"Oakland Park",
"Pembroke Pines",
"Hallandale Beach",
"Pompano Beach")
Zipcode = c("33024",
"33317",
"33314",
"33024",
"33325",
"33317",
"33067",
"33309",
"33346",
"33365",
"33389")
How can I do this in R with tidyr?
Code
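library(tidyverse)  # attaches tidyr and the %>% pipe
df = data.frame(Full_Address)  # separate() needs a data frame, not a bare character vector
df = df %>% tidyr::separate(Full_Address, into = c("ID", "Street_Address", "City", "Zipcode"),
                            sep = " ", extra = "merge") # stuck at this step: a plain space (placeholder) splits the street name in the wrong places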
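Note that this allows a city to have only a one-word name: it will not correctly match cities such as New York or Los Angeles. For a multi-word city, only the last word is captured as City and the rest is absorbed into Street_Address.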
library(tidyr)
# Groups: ID digits, street (no commas), one-word city, then the zip code digits
data.frame(Full_Address) %>%
  extract(Full_Address, c("ID", "Street_Address", "City", "Zipcode"),
          '(\\d+) ([^,]+) (\\w+),\\D+(\\d+)')
     ID           Street_Address       City Zipcode
1  1111               Harding St  Hollywood   33024
2  2222           W Broward Blvd Plantation   33317
3  3333                SW 74 Ave      Davie   33314
4  4444            Thomas Street  Hollywood   33024
5 11111 Lake Road (SW 12 Street)      Davie   33325
6   555               Bryan Blvd Plantation   33317
7  5555                NW 71 Ter   Parkland   33067
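One way around the one-word-city limitation, when the set of possible city names is known in advance, is to put those names into the regex as an alternation. The following is only a sketch and not part of the original answer: the `cities` vector is a hypothetical lookup list, `pattern` and `result` are illustrative names, and only a subset of the sample addresses is used.

```r
library(tidyr)

# Subset of the sample data, including multi-word cities
Full_Address <- c("1111 Harding St Hollywood, FL 33024",
                  "7777 N Oakland Forest Dr Oakland Park, 33309",
                  "888 Some Ave Pines Pembroke Pines, 33346",
                  "4440 Some 123 Ave Pompano Beach, 33389")

# Assumption: every city that can occur is listed here (hypothetical lookup vector)
cities <- c("Hollywood", "Plantation", "Davie", "Parkland", "Oakland Park",
            "Pembroke Pines", "Hallandale Beach", "Pompano Beach")

# ID digits, street, one of the known cities, a comma,
# optional non-digits (such as " FL "), then the zip code digits
pattern <- paste0("^(\\d+) (.+) (", paste(cities, collapse = "|"), "),\\D*(\\d+)$")

result <- tidyr::extract(data.frame(Full_Address), Full_Address,
                         into = c("ID", "Street_Address", "City", "Zipcode"),
                         regex = pattern)
print(result)
```

Because the street group is greedy, the regex backs off word by word until what follows is a listed city, so "Some Ave Pines Pembroke Pines" splits into the street "Some Ave Pines" and the city "Pembroke Pines". An address whose city is not in `cities` does not match and yields NA in all four columns, which makes gaps in the list easy to spot.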