
How can I encode and decode IDN URLs in PHP?

I'm building a site to check, register, etc. domains, and I have to make it IDN compliant. Right now I have something like this:

echo $domain;       
$domain = idn_to_ascii($domain);
echo $domain;
$domain = idn_to_utf8($domain);
echo $domain;

and I'm getting this:

testing123ásd123
xn--testing123sd123-wjb
testing123ĂĄsd123

As you can see, the decoded string isn't the same as the original. I also tried using a class from http://phlymail.com/en/downloads/idna/download/ to do it, and I'm getting the same results.

I have tried using:

$charset="UTF-8";
echo $domain;       
$domain = idn_to_ascii($domain, $charset);
echo $domain;
$domain = idn_to_utf8($domain);
echo $domain;

and I got exactly the same result (except that the encoded string is slightly different).

Any ideas?

EDIT: Problem solved, thanks to the question "Problem in converting string to puny code (in PHP, using phlyLabs's punycode string converter)". The original string was in ISO-8859-2 and the decoded one is in UTF-8. Now I need to find out how to convert it back to ISO-8859-2, but Google can help me with that. Any mods? What should I do with the question? Close it, erase it, or leave it this way?

As you already point out, ĂĄ appears to be the UTF-8 representation of the á character as displayed in a non-UTF-8 document.
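To make the mismatch concrete, here is a minimal sketch (not taken from the question's code) that reproduces it with iconv(). It assumes the page is rendered as ISO-8859-2, as the question's edit suggests; the variable names are only illustrative.

```php
<?php

// "á" (U+00E1) encoded as UTF-8 is the two bytes 0xC3 0xA1.
$utf8 = "\u{00E1}";

// Assumption: the page is displayed as ISO-8859-2. Reinterpreting the two
// UTF-8 bytes in that charset yields two separate characters,
// U+0102 followed by U+0104, which is the garbled pair seen in the output.
$misread = iconv('ISO-8859-2', 'UTF-8', $utf8);

echo bin2hex($utf8), "\n";    // c3a1
echo $misread, "\n";

// Converting to the page's charset before output gives the single byte 0xE1.
$latin2 = iconv('UTF-8', 'ISO-8859-2', $utf8);
echo bin2hex($latin2), "\n";  // e1
```

So the decoded string is most likely correct; it is only being shown in a charset other than the one it is encoded in.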

You can use iconv() to convert between charsets. However, be aware that non-Unicode charsets cannot represent the full set of international characters, so you must convert the missing characters to HTML entities. E.g.:

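<?php

// Decode the Punycode form into a UTF-8 string.
$domain = idn_to_utf8($domain);
// Escape for HTML output; characters that have a named HTML entity
// (such as á) are emitted as entities, so the page charset no longer matters.
echo htmlentities($domain, ENT_COMPAT, 'UTF-8');

?>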
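If the rest of the site has to stay in ISO-8859-2, the iconv() conversion mentioned above could be wrapped around the IDN calls roughly as follows. This is only a sketch under assumptions: the helper names domainToAscii() and domainFromAscii() are hypothetical, the site charset is assumed to be ISO-8859-2, and the intl and iconv extensions are assumed to be loaded.

```php
<?php

// Hypothetical helpers, not part of the original post. They assume the site
// works in ISO-8859-2 and that the intl and iconv extensions are available.

function domainToAscii(string $domain, string $siteCharset = 'ISO-8859-2'): string
{
    // The intl IDN functions expect UTF-8 input, so convert first.
    $utf8 = iconv($siteCharset, 'UTF-8', $domain);
    if ($utf8 === false) {
        throw new InvalidArgumentException("Input is not valid $siteCharset");
    }

    $ascii = idn_to_ascii($utf8, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
    if ($ascii === false) {
        throw new InvalidArgumentException('Domain cannot be converted to ASCII');
    }
    return $ascii;
}

function domainFromAscii(string $ascii, string $siteCharset = 'ISO-8859-2'): string
{
    // idn_to_utf8() always returns UTF-8, whatever charset the site uses.
    $utf8 = idn_to_utf8($ascii, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
    if ($utf8 === false) {
        throw new InvalidArgumentException('Domain cannot be decoded');
    }

    // iconv() returns false when a character has no equivalent in the
    // target charset; the @ suppresses the accompanying notice.
    $local = @iconv('UTF-8', $siteCharset, $utf8);
    if ($local === false) {
        throw new RuntimeException("Domain cannot be represented in $siteCharset");
    }
    return $local;
}
```

With these helpers, domainToAscii("testing123\xE1sd123") (the question's domain as ISO-8859-2 bytes) should give the xn--testing123sd123-wjb form shown in the question, and domainFromAscii() should return the original bytes. Domains containing characters outside ISO-8859-2 make domainFromAscii() throw, which is where the HTML entity approach comes in.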

In any case, it'd probably be easier to just use UTF-8 for the whole project.


 