
Google Geocode: PHP Implementation - character encoding issues

I'm working with UK address data and also international address data.

I need to geocode the address data for use on a Google map. I'm doing this using the HTTP service, i.e. constructing a query string and passing it to file_get_contents($THEURL).
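For illustration, here is a minimal sketch of building such a query string so that non-ASCII characters are percent-encoded as UTF-8 bytes. The endpoint URL, the `q` parameter name, and the `build_geocode_url` helper are placeholders invented for this example, not the actual geocoding API; it also assumes the address is already valid UTF-8 and that the mbstring extension is available.

```php
<?php
// Hypothetical endpoint: substitute the geocoding URL you actually call.
const GEOCODE_ENDPOINT = 'https://example.com/geocode';

/**
 * Build a request URL with the address percent-encoded byte by byte.
 * Assumes $address is already valid UTF-8; "q" is a placeholder parameter name.
 */
function build_geocode_url(string $address, string $endpoint = GEOCODE_ENDPOINT): string
{
    if (!mb_check_encoding($address, 'UTF-8')) {
        throw new InvalidArgumentException('Address is not valid UTF-8');
    }
    // PHP_QUERY_RFC3986 encodes spaces as %20 and every non-ASCII byte as %XX.
    $query = http_build_query(['q' => $address], '', '&', PHP_QUERY_RFC3986);
    return $endpoint . '?' . $query;
}

// "\xC3\xB8" is the UTF-8 byte sequence for "ø".
$url = build_geocode_url("S\xC3\xB8ndre gate 1, Oslo");
echo $url, PHP_EOL;
// prints https://example.com/geocode?q=S%C3%B8ndre%20gate%201%2C%20Oslo

// $response = file_get_contents($url); // network call, left commented out
```

If the encoded URL shows a single `%F8` where `%C3%B8` is expected, the input was probably ISO-8859-1 rather than UTF-8.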

I've managed to geocode 80% of the address data perfectly. However, addresses in countries like Norway and Sweden that contain special characters will not return a geocode. The code returned is 602 (cannot find an address).

Looking into the documentation, I can see that the string sent to Google must be UTF-8 encoded.

I've tried the following to ensure the string is UTF-8 encoded or to remove the special characters.

1) Using UTF8 encode on the query string - this often results in malformed characters being displayed on the screen.

2) mb_check_encoding reports the string is correctly encoded.

3) Using a function to substitute special characters with their European equivalents (in the hope that the Google API will compensate).
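One possible explanation for attempts 1 and 2, assuming the string was already UTF-8 before it was encoded again: converting it a second time double-encodes each special character, and the result is still well-formed UTF-8, so a validity check passes either way. The sketch below uses `mb_convert_encoding` from ISO-8859-1 to stand in for the "UTF8 encode" step; that substitution is an assumption, since the question does not show the actual call.

```php
<?php
// "ø" encoded once as UTF-8 is the two bytes C3 B8.
$utf8 = "\xC3\xB8";

// Treating those bytes as ISO-8859-1 and converting again double-encodes them.
$double = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($utf8), PHP_EOL;   // c3b8
echo bin2hex($double), PHP_EOL; // c383c2b8, which renders as "Ã¸"

// Both strings are well-formed UTF-8, so a validity check cannot tell them apart.
var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)
var_dump(mb_check_encoding($double, 'UTF-8')); // bool(true)

// A raw ISO-8859-1 "ø" (single byte F8) is what fails the check.
var_dump(mb_check_encoding("\xF8", 'UTF-8'));  // bool(false)
```

Dumping the bytes with `bin2hex` at each step is usually more informative than looking at how the characters render on screen.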

Can anyone suggest a reason why my method isn't working (whether to do with encoding or not)?

You need to systematically go through every place in your system where encoding matters and establish which encoding it uses. mb_detect_encoding and guesswork are not a good approach here.

You need to check the encoding of:

  • incoming data
    • pages
    • GET parameters
    • database connection
    • database table collations
  • the script files you work with
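A rough sketch of how some of these checks might look in code. The `require_utf8` helper is hypothetical, and the commented-out header and PDO lines assume an HTML page and a MySQL database, neither of which the question confirms.

```php
<?php
/**
 * Hypothetical guard: return $value unchanged if it is valid UTF-8,
 * otherwise throw and name the boundary where the bad data entered.
 */
function require_utf8(string $value, string $boundary): string
{
    if (!mb_check_encoding($value, 'UTF-8')) {
        throw new UnexpectedValueException("Non-UTF-8 data at boundary: $boundary");
    }
    return $value;
}

// Pages: declare the encoding of the pages (and therefore the forms) you serve.
// header('Content-Type: text/html; charset=utf-8');

// Database connection (assumes MySQL via PDO; adjust for your driver):
// $pdo = new PDO('mysql:host=localhost;dbname=app;charset=utf8mb4', $user, $pass);

// GET parameters: validate before use instead of guessing the encoding later.
$address = require_utf8($_GET['address'] ?? '', 'GET address');
```

Failing loudly at the boundary makes it clear which input is not UTF-8, rather than discovering the problem only when the geocoder rejects the address. Table collations and the encoding of the script files themselves still have to be checked in the database and the editor.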

If malformed characters occur, chances are you are using ISO-8859-1 or some other non-UTF-8 encoding somewhere. When everything is clean UTF-8, the request should go through.
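Once a non-UTF-8 source has been identified, its data can be converted explicitly. A minimal sketch, assuming the source encoding has been established rather than guessed; `to_utf8` is a hypothetical helper name.

```php
<?php
/**
 * Convert $value to UTF-8 from an encoding you have established, not guessed.
 */
function to_utf8(string $value, string $knownEncoding): string
{
    if (strcasecmp($knownEncoding, 'UTF-8') === 0) {
        return $value; // already in the target encoding, nothing to do
    }
    return mb_convert_encoding($value, 'UTF-8', $knownEncoding);
}

$latin1 = "Malm\xF6"; // "Malmö" as ISO-8859-1 bytes
echo bin2hex(to_utf8($latin1, 'ISO-8859-1')), PHP_EOL; // 4d616c6dc3b6
```

Converting exactly once, at the point where the data enters the system, avoids the double-encoding that produces malformed characters.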

A very good article on the basics is The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!).
