Create a solution to automatically split addresses into their separate components using Python
I am trying to find a way to automatically split addresses into their separate components using Python. Below is some sample data:
Full Address | Street Number | Street | City | State | Zip Code |
---|---|---|---|---|---|
661 Camel Back Road Tulsa Oklahoma 74120 | 661 | Camel Back Road | Tulsa | Oklahoma | |
68 Gnatty Creek Road Roslyn New York 11576 | 68 | Gnatty Creek Road | Roslyn | New York | |
1 Raccoon Run Seattle Washington 98119 | 1 | Raccoon Run | Seattle | Washington | |
616 Friendship Lane Santa Clara California 95054 | 616 | Friendship Lane | Santa Clara | California | 95054 |
3878 Grand Avenue Maitland Florida 32751 | 3878 | Grand Avenue | Maitland | Florida | 32751 |
The data above represents what I am trying to achieve. On the left is my input address, and on the right is the result after it has been split automatically.
The problem, which this oversimplified example does not show, is that the input addresses do not always come in the same order and can include extra components such as building names.
My options so far are the following:
The REGEX option is familiar, but it will still be largely inaccurate. I need this solution to be as accurate as possible.
The MACHINE LEARNING MODEL option is more difficult, in that I am not aware of any model or framework capable of classifying multiple categories at once. Can anyone help?
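One possible way to think about "classifying multiple categories at once" is as sequence labelling, the framing used for named-entity recognition: rather than classifying the whole address, each token gets its own label. The sketch below only shows how a row of the sample table could be turned into per-token training pairs. The label names, the B-/I- prefix scheme, and the helper `to_token_labels` are assumptions made for illustration, not part of any particular framework.

```python
def to_token_labels(components):
    """Turn an ordered mapping of label -> text into (token, tag) pairs.

    The first token of each component is tagged "B-<label>" and any
    following tokens "I-<label>", so multi-word values stay grouped.
    """
    pairs = []
    for label, text in components.items():
        for position, token in enumerate(text.split()):
            prefix = "B" if position == 0 else "I"
            pairs.append((token, f"{prefix}-{label}"))
    return pairs


# One row of the sample table, with hypothetical label names.
example = {
    "STREET_NUMBER": "616",
    "STREET": "Friendship Lane",
    "CITY": "Santa Clara",
    "STATE": "California",
    "ZIP": "95054",
}

for token, tag in to_token_labels(example):
    print(f"{token:<12}{tag}")
```

Pairs like these are the kind of input a sequence model (for example a conditional random field) is typically trained on. Because every token is labelled independently of its position in the string, this framing does not require the components to arrive in a fixed order.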
So far I haven't really started on the REGEX, because I anticipate major gaps in the capturing groups.
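To make the expected gaps concrete, here is a minimal regex sketch that assumes the simplified layout of the sample data (house number first, 5-digit zip code last). The pattern and the group names are illustrative assumptions. It can peel off the two numeric ends, but everything in between stays in one group, because a pattern alone cannot tell where the street ends and the city or state begins.

```python
import re

# Assumes "<number> <street ...> <city ...> <state ...> <zip>" as in the sample table.
ADDRESS_RE = re.compile(
    r"^\s*(?P<street_number>\d+)\s+"  # leading house number
    r"(?P<middle>.+?)\s+"             # street, city and state, still unsplit
    r"(?P<zip_code>\d{5})\s*$"        # trailing 5-digit zip code
)


def rough_split(address):
    """Return the named groups as a dict, or None if the layout differs."""
    match = ADDRESS_RE.match(address)
    return match.groupdict() if match else None


print(rough_split("616 Friendship Lane Santa Clara California 95054"))
print(rough_split("Empire Building, Santa Clara"))  # no number or zip: None
```

For the first address the `middle` group comes back as "Friendship Lane Santa Clara California", which is exactly the part that needs more than a regex to split reliably.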
I think the only way to do this and get a fairly accurate result is to get a list of zip codes, for instance from here: https://www.zipcode.com.ng/2022/06/list-of-5-digit-zip-codes-united-states.html?m=1 and a list of US cities.
Then you can match the zip code, state and city against those lists.
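A minimal sketch of that matching idea, under some loud assumptions: the `STATES` and `CITIES` sets below are tiny stand-ins containing only the sample rows (real lists would be loaded from a file), the zip code is only checked for its 5-digit shape rather than looked up, and the components are assumed to sit at the end of the string as in the sample data. The helper names are hypothetical.

```python
import re

# Tiny illustrative lookup sets; real ones would come from full lists.
STATES = {"Oklahoma", "New York", "Washington", "California", "Florida"}
CITIES = {"Tulsa", "Roslyn", "Seattle", "Santa Clara", "Maitland"}


def pop_longest_suffix(tokens, vocabulary, max_words=3):
    """Remove and return the longest trailing phrase found in vocabulary."""
    for size in range(min(max_words, len(tokens)), 0, -1):
        phrase = " ".join(tokens[-size:])
        if phrase in vocabulary:
            del tokens[-size:]
            return phrase
    return None


def split_address(address):
    """Split an address by peeling known components off the right-hand side."""
    tokens = address.split()
    result = dict.fromkeys(
        ["street_number", "street", "city", "state", "zip_code"]
    )
    # Zip code: last token, if it looks like five digits.
    if tokens and re.fullmatch(r"\d{5}", tokens[-1]):
        result["zip_code"] = tokens.pop()
    # State, then city: longest trailing phrase present in each lookup set.
    result["state"] = pop_longest_suffix(tokens, STATES)
    result["city"] = pop_longest_suffix(tokens, CITIES)
    # Whatever is left is treated as house number plus street name.
    if tokens and tokens[0].isdigit():
        result["street_number"] = tokens.pop(0)
    result["street"] = " ".join(tokens) or None
    return result


print(split_address("68 Gnatty Creek Road Roslyn New York 11576"))
```

Trying the longest phrase first is what lets multi-word names such as "New York" or "Santa Clara" match as a unit. Addresses whose components arrive in a different order, or that include building names, would need the lookups applied anywhere in the token list rather than only at the end.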