
R geographic address validation

I am trying to calculate physical distances between geographic locations (addresses) with the mapdist function from the ggmap package in R. Apart from the inconvenient fact that Google Maps allows only 2500 queries/session, I have to cope with misspelled or otherwise imperfect "addresses". The most typical problem is that the address strings are padded with extra information (floor, door, etc.), and it is hard to detect any pattern in these additions that would allow applying a regular expression.

My goal is:

  1. Check if the address string is recognizable to Google Maps;
  2. If not, find a way to truncate to an acceptable form, perhaps by parsing words step by step from the string.

Has anybody coped with this kind of problem?

Thanks.

There are two separate problems running into each other here. One is the misspellings and other irregularities in the addresses themselves; the other is pinpointing (geocoding) a given address. Although they are related, each must be handled to accomplish your objectives.

There are numerous service providers out there that can do either or both at minimal cost. They can be found with a simple Google search. You can then investigate each to see whether it matches your use case and licensing requirements.

All of that considered, you'll want to get your address list cleaned up at a minimum. Doing that will enable you to use any number of geocoding providers.
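If you want to try the step-by-step truncation idea from the question before turning to a cleanup service, a minimal sketch in R might look like the following. The helper name truncate_until_geocoded and its arguments are hypothetical, and the geocoder is passed in as a function, so the stub used here can be swapped for a real one. With ggmap, something like function(x) ggmap::geocode(x) could play that role, assuming it returns lon and lat columns that are NA when the lookup fails and that your API access is already set up.

```r
# Hypothetical helper (not part of ggmap): drop trailing words one at a time
# until the supplied geocoder recognizes the string.
# `geocode_fn` is assumed to take a single address string and return a
# list or data frame with `lon` and `lat`, both NA when nothing is found.
truncate_until_geocoded <- function(address, geocode_fn, min_words = 2) {
  words <- strsplit(trimws(address), "[[:space:]]+")[[1]]
  words <- words[nzchar(words)]
  n <- length(words)
  while (n >= min_words) {
    # Rebuild the candidate from the first n words and strip a dangling comma.
    candidate <- sub(",+$", "", paste(words[seq_len(n)], collapse = " "))
    hit <- geocode_fn(candidate)
    if (!is.na(hit$lon[1]) && !is.na(hit$lat[1])) {
      return(list(address = candidate, lon = hit$lon[1], lat = hit$lat[1]))
    }
    n <- n - 1
  }
  # Nothing matched: signal failure with NA values.
  list(address = NA_character_, lon = NA_real_, lat = NA_real_)
}

# Stub geocoder for illustration only; it "knows" a single made-up address.
fake_geocode <- function(x) {
  if (x %in% c("12 Main St Springfield")) {
    data.frame(lon = -89.65, lat = 39.78)
  } else {
    data.frame(lon = NA_real_, lat = NA_real_)
  }
}

res <- truncate_until_geocoded("12 Main St Springfield floor 3 door 4",
                               fake_geocode)
print(res$address)
```

With the stub above, the helper stops once the trailing "floor 3 door 4" has been removed. Keep in mind that every attempt costs one query against your quota, and that this only trims from the end of the string, so extra information in the middle of an address would still need separate cleanup.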

Depending upon the size of your list, you can get your list cleaned up and geocoded for perhaps $20.

In the interest of full disclosure, I'm the founder of SmartyStreets. We provide a web interface (to help clean up the address list) as well as an API (which can be used on a continual basis to keep addresses clean). We also geocode your list at no extra charge. Further, we don't have any licensing restrictions on the number of lookups that can be performed during a given timeframe. (We have customers that hit us hundreds of millions of times per day.) The entire process of signing up and cleaning up your list takes just a few minutes.
