
PHP: obfuscating mailto in source with htmlentities()

I am trying to display email addresses on a page so that they work normally in a browser but are obfuscated in the source, in the hope that at least some spam bots will ignore them.

I have this test code:

<?php
$email = "fake@test.com";
$mailto = "mailto:" . $email;
?>
<html>
<head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /></head>
<body>
<p>PHP: <a href="<?php echo htmlentities($mailto); ?>"><?php echo htmlentities($email); ?></a></p>
<p>&nbsp;</p>
<p>MANUAL: <a href="&#109;&#x61;&#105;&#108;&#116;&#x6f;&#58;&#102;&#x61;&#x6b;&#101;&#x40;&#x74;&#101;&#x73;&#x74;&#46;&#x63;&#111;&#x6d;">&#x66;&#97;&#107;&#x65;&#64;&#116;&#x65;&#x73;&#116;&#46;&#99;&#x6f;&#x6d;</a></p>
</body>
</html>

Both links look and work fine on the page, but only the 'manual' one is encoded.

I'm getting conflicting information from php.net on how htmlentities() works.

http://php.net/manual/en/function.htmlentities.php

The documentation states that "all characters which have HTML character entity equivalents are translated into these entities." Since all letters in the alphabet have equivalents, I expected every single character to be converted. But the examples on that page show that basic letters do not get converted.

Further, when I view the source of my page, it does not appear that the PHP code has done anything at all. My expectation was that both links would look the same in the source. Here are the results of 'view source':

<html>
<head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /></head>
<body>
<p>PHP: <a href="mailto:fake@test.com">fake@test.com</a></p>
<p>&nbsp;</p>
<p>MANUAL: <a href="&#109;&#x61;&#105;&#108;&#116;&#x6f;&#58;&#102;&#x61;&#x6b;&#101;&#x40;&#x74;&#101;&#x73;&#x74;&#46;&#x63;&#111;&#x6d;">&#x66;&#97;&#107;&#x65;&#64;&#116;&#x65;&#x73;&#116;&#46;&#99;&#x6f;&#x6d;</a></p>
</body>
</html>

So it looks like htmlentities() isn't doing anything at all, not even encoding the '@'.

Should I be adding some flags? Is there a better way to do this? And if I succeed, will this even work against the bots, or am I wasting my time?

The misunderstanding may come from http://php.net/manual/en/function.htmlentities.php:

This function is identical to htmlspecialchars() in all ways, except with htmlentities(), all characters which have HTML character entity equivalents are translated into these entities.

What it really means is explained at http://php.net/manual/en/function.htmlspecialchars.php:

Certain characters have special significance in HTML, and should be represented by HTML entities if they are to preserve their meanings.

htmlspecialchars() encodes &, ", ', < and > (the single quote only when ENT_QUOTES is set, which is the default from PHP 8.1 on). Check:

print_r(get_html_translation_table(HTML_SPECIALCHARS));
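As a rough illustration of what that table means in practice, the sketch below runs htmlspecialchars() on a made-up string. The sample text is an assumption for demonstration only, and ENT_QUOTES is passed explicitly so the result does not depend on the PHP version's default flags.

```php
<?php
// Hypothetical sample text containing all five special characters.
$raw = 'Tom & "Jerry" <tom\'s@test.com>';

// ENT_QUOTES is given explicitly so that both quote styles are encoded
// regardless of which PHP version runs this.
$escaped = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

echo $escaped, "\n";
// prints: Tom &amp; &quot;Jerry&quot; &lt;tom&#039;s@test.com&gt;
```

Only the five special characters change; the letters, the dot and the '@' pass through untouched.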

htmlentities() encodes more characters, but only those that have a named HTML entity, such as é or ©. Under the default HTML 4.01 table, plain letters, digits and '@' have no named entity, so they are left unchanged. Check:

print_r(get_html_translation_table(HTML_ENTITIES));
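Instead of reading the whole table, it can be queried directly. The sketch below assumes the default flags and UTF-8 encoding (no extra arguments are passed), and the variable names are only illustrative.

```php
<?php
// Default translation table used by htmlentities() when no flags are given.
$entities = get_html_translation_table(HTML_ENTITIES);

// Plain ASCII letters and '@' are not keys in this table...
$hasLetter = isset($entities['a']);
$hasAt     = isset($entities['@']);
// ...whereas an accented letter is ("\u{e9}" is é, written as an escape
// so the example does not depend on the file's encoding).
$hasEacute = isset($entities["\u{e9}"]);

var_dump($hasLetter, $hasAt, $hasEacute);

// Consequently the address from the question comes back unchanged.
$email     = 'fake@test.com';
$unchanged = (htmlentities($email) === $email);
var_dump($unchanged);

// A string with an accented letter and markup does change.
$encoded = htmlentities("caf\u{e9} <b>");
echo $encoded, "\n";
```

This matches what 'view source' showed in the question: nothing in mailto:fake@test.com has a named entity in the default table, so htmlentities() returns it as is.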

You might try something like this. I checked it in a link and it worked as expected:

$result = preg_replace_callback('/./', function($m) {
                                           return '&#'.ord($m[0]).';';
                                       },
                                       'mailto:fake@test.com');

This replaces each character in the string with &#, followed by the ASCII value of the character, followed by ;.
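To use this in the test page from the question, the callback could be wrapped in a small helper. This is only a sketch: obfuscate_ascii() is a hypothetical name, and it assumes ASCII-only input, since ord() works on single bytes.

```php
<?php
// Hypothetical helper: turn every byte of an ASCII string into a
// decimal numeric character reference such as &#109;.
function obfuscate_ascii(string $text): string
{
    // The 's' modifier lets '.' match newlines too, so no byte is skipped.
    return preg_replace_callback('/./s', function (array $m): string {
        return '&#' . ord($m[0]) . ';';
    }, $text);
}

$email = 'fake@test.com';
$href  = obfuscate_ascii('mailto:' . $email);
$label = obfuscate_ascii($email);

// Same markup as the PHP link in the test page, now fully encoded.
echo '<a href="' . $href . '">' . $label . '</a>', "\n";

// Browsers decode the references when parsing the page;
// html_entity_decode() performs the same reversal here as a sanity check.
$roundTrip = html_entity_decode($href);
var_dump($roundTrip === 'mailto:' . $email);
```

Because html_entity_decode() reverses the encoding in one call, a bot that decodes entities before searching for addresses would still find this one, so this will probably only stop the simpler harvesters.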
