
Has PHP's behaviour changed?

While studying for the ZEND-CE exam, I came across this question:

Given a php.ini setting of:
default_charset = utf-8
What will the following code print in the browser?

    <?php
    header('Content-Type: text/html; charset=iso-8859-1');
    echo '&#9986;&#10004;&#10013;';
    ?>

A. Garbled data
B. &#9986;&#10004;&#10013;
C. A blank line due to charset mismatch

The expected answer is C. I expected it to be A, and when I ran the code, I got garbled data (answer A)! So I wonder whether PHP's behaviour has changed recently, or whether this is an error in the test?

I am not aware that PHP's behaviour has changed in that respect. However, the HTML standard has changed.

Prior to HTML 4, numeric character references such as &#9986; were interpreted with respect to the document character set (which is specified in the Content-Type header field). It is plausible that, because code point 9986 does not exist in ISO 8859-1, nothing would be printed.

Since HTML 4, numeric character references are interpreted as Unicode code points. So echo '&#9986;&#10004;&#10013;'; should print ✂✔✝ regardless of what the Content-Type header field says about the character set. It is reasonable to call ✂✔✝ "garbled data" if one is not familiar with the Unicode Dingbats block.
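To check the code points without involving a browser, here is a minimal sketch that decodes the same three references in PHP itself. It assumes PHP 7.4 or later with the mbstring extension loaded; the variable names are mine and not part of the exam question. It only shows which Unicode code points the references name and that none of them falls in the ISO 8859-1 range. It does not reproduce how any particular browser renders the page.

```php
<?php
// The three numeric character references from the question.
$refs = '&#9986;&#10004;&#10013;';

// Decode them into real characters, using UTF-8 as the target encoding.
$decoded = html_entity_decode($refs, ENT_QUOTES | ENT_HTML401, 'UTF-8');

// Recover the Unicode code point of each decoded character.
$codePoints = array_map(
    fn(string $ch): int => mb_ord($ch, 'UTF-8'),
    mb_str_split($decoded, 1, 'UTF-8')
);

// ISO 8859-1 only covers code points 0x00 through 0xFF.
$fitsLatin1 = max($codePoints) <= 0xFF;

foreach ($codePoints as $cp) {
    printf("U+%04X\n", $cp); // U+2702, U+2714, U+271D (Dingbats block)
}
var_dump($fitsLatin1); // bool(false)
```

Since all three code points lie far above 0xFF, the charset named in the header cannot be what gives them meaning. A browser that follows HTML 4 resolves them as Unicode regardless.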
