
Separating address column into multiple columns in R

I have a dataset with an address column containing 400 records. I want to split this column into multiple columns.

Sample data

Full_Address = c("1111 Harding St Hollywood, FL 33024",
                 "2222 W Broward Blvd Plantation, 33317",
                 "3333 SW 74 Ave Davie, 33314",
                 "4444 Thomas Street Hollywood, FL 33024",
                 "11111 Lake Road (SW 12 Street) Davie, 33325",
                 "555 Bryan Blvd Plantation, 33317",
                 "5555 NW 71 Ter Parkland, 33067",
                 "7777 N Oakland Forest Dr Oakland Park, 33309",
                 "888 Some Ave Pines Pembroke Pines, 33346",
                 "9999 Some Blvd Hallandale Beach, 33365",
                 "4440 Some 123 Ave Pompano Beach, 33389")

Desired columns

ID = c("1111",
       "2222",
       "3333",
       "4444",
       "11111",
       "555",
       "5555",
       "7777",
       "888",
       "9999",
       "4440")
        
Street_Address = c("Harding St",
                   "W Broward Blvd",
                   "SW 74 Ave",
                   "Thomas Street",
                   "Lake Road (SW 12 Street)",
                   "Bryan Blvd",
                   "NW 71 Ter",
                   "N Oakland Forest Dr",
                   "Some Ave Pines",
                   "Some Blvd",
                   "Some 123 Ave")

City = c("Hollywood",
         "Plantation",
         "Davie",
         "Hollywood",
         "Davie",
         "Plantation",
         "Parkland",
         "Oakland Park",
         "Pembroke Pines",
         "Hallandale Beach",
         "Pompano Beach")
        
Zipcode = c("33024",
            "33317",
            "33314",
            "33024",
            "33325",
            "33317",
            "33067",
            "33309",
            "33346",
            "33365",
            "33389")

How can I do this with tidyr in R?

Code

library(tidyverse)
library(tidyr)

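# separate() needs a data frame, so wrap the character vector first
df = data.frame(Full_Address)

# The column to split must be named before the new column names
df = df %>% tidyr::separate(Full_Address, c("ID", "Street_Address", "City", "Zipcode"),
                            sep =  , extra = "merge") # stuck at this step: what should sep be?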

Note that this assumes each city name is a single word: it will not correctly match multi-word cities such as New York or Los Angeles.

data.frame(Full_Address) %>% 
  extract(Full_Address, c("ID", "Street_Address", "City", "Zipcode"), 
          '(\\d+) ([^,]+) (\\w+),\\D+(\\d+)')

     ID           Street_Address       City Zipcode
1  1111               Harding St  Hollywood   33024
2  2222           W Broward Blvd Plantation   33317
3  3333                SW 74 Ave      Davie   33314
4  4444            Thomas Street  Hollywood   33024
5 11111 Lake Road (SW 12 Street)      Davie   33325
6   555               Bryan Blvd Plantation   33317
7  5555                NW 71 Ter   Parkland   33067
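
For the multi-word cities in the sample data (Oakland Park, Pembroke Pines, and so on), one possible workaround is to build the city part of the pattern from a list of known city names. The following is only a sketch: the `cities` vector is a hypothetical lookup that you would have to supply yourself, and it assumes the names contain no regex metacharacters.

```r
library(tidyr)

Full_Address <- c("1111 Harding St Hollywood, FL 33024",
                  "7777 N Oakland Forest Dr Oakland Park, 33309",
                  "888 Some Ave Pines Pembroke Pines, 33346")

# Hypothetical lookup of valid city names (assumption: such a list is available)
cities <- c("Hollywood", "Oakland Park", "Pembroke Pines")

# Turn the lookup into a single alternation: "Hollywood|Oakland Park|..."
city_pattern <- paste(cities, collapse = "|")

# Four capture groups: ID, street (lazy), known city, zipcode.
# The lazy (.+?) stops at the earliest point where a known city followed
# by a comma can match, so the street does not swallow part of the city.
pattern <- paste0("^(\\d+) (.+?) (", city_pattern, "),\\D+(\\d+)$")

res <- tidyr::extract(data.frame(Full_Address), Full_Address,
                      into = c("ID", "Street_Address", "City", "Zipcode"),
                      regex = pattern)
res
```

Addresses whose city is missing from the lookup do not match the pattern and come back as `NA` in all four columns, which makes them easy to find and fix by extending `cities`.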

