PHP & RSS Feeds & Special Characters validation Problem

I keep getting the validation warning below. Some of my articles contain special characters, and I was wondering how I should go about rendering (or not rendering) them in my RSS feeds. Should I use htmlentities() or not? If so, how?

In addition, interoperability with the widest range of feed readers could be improved by implementing the following recommendations.

line 22, column 35: title should not contain HTML: &

PHP code.

<title>' . htmlentities(strip_tags($title), ENT_QUOTES, "UTF-8") . '</title>

You should use CDATA to escape characters in your XML feeds; this lets you use your raw data without disrupting the XML layout.

Try this:

<title><![CDATA[ YOUR RAW CONTENT]]></title>

Note: do not use htmlentities() and strip_tags(), as these escape the content for the browser; any other reader should read the raw content correctly.
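One caveat with the CDATA approach is that the raw content must not contain the sequence "]]>", since that would end the section early. Below is a minimal sketch of a wrapper that guards against this; `cdata_wrap()` is a hypothetical helper name, not something from the answer above.

```php
<?php
// Hypothetical helper (not from the original answer): wraps raw text in a
// CDATA section so that "&" and "<" need no escaping.
function cdata_wrap(string $raw): string
{
    // "]]>" would terminate the section early, so split it across two
    // adjacent CDATA sections: "]]" closes the first, ">" opens the second.
    $safe = str_replace(']]>', ']]]]><![CDATA[>', $raw);
    return '<![CDATA[' . $safe . ']]>';
}

echo '<title>' . cdata_wrap('Tom & Jerry <3') . '</title>', "\n";
// prints: <title><![CDATA[Tom & Jerry <3]]></title>
```

An XML parser concatenates adjacent CDATA sections, so the reader still sees the original text.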

Quote from w3schools:

The term CDATA is used about text data that should not be parsed by the XML parser. Characters like "<" and "&" are illegal in XML elements. "<" will generate an error because the parser interprets it as the start of a new element. "&" will generate an error because the parser interprets it as the start of an character entity. Some text, like JavaScript code, contains a lot of "<" or "&" characters. To avoid errors script code can be defined as CDATA. Everything inside a CDATA section is ignored by the parser. A CDATA section starts with "<![CDATA[" and ends with "]]>".

http://www.w3schools.com/xml/xml_cdata.asp

/* feedvalidator.org (Feedburner recommends this site to validate your feeds) says: "For the widest interop, the RSS Profile recommends the use of the hexadecimal character reference "&#x26;" to represent "&" and "&#x3C;" to represent "<"." */

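        // find title problems
        $find[] = '<';
        // Double quotes are required so PHP reads \x92 and \x84 as single bytes;
        // in single quotes they are literal backslash sequences and never match.
        // Assumption: the input is Windows-1252, where 0x92 is a curly
        // apostrophe and 0x84 a low double quote.
        $find[] = "\x92";
        $find[] = "\x84";

        // find content problems
        $find_c[] = "\x92";
        $find_c[] = "\x84";
        $find_c[] = '&nbsp;';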

        // replace title
        $replace[] = '&#x3C;';
        $replace[] = '&#39;';
        $replace[] = '&#34;';

        // replace content
        $replace_c[] = '&#39;';
        $replace_c[] = '&#34;';
        $replace_c[] = ' ';

        // We don't want to re-replace "&" characters.  
        // So do this first because of PHP "feature" https://bugs.php.net/bug.php?id=33773
        $title = str_replace('&', '&#x26;', $title); 
        $title = str_replace($find, $replace, $title);
        $post_content = str_replace($find_c, $replace_c, $row[3]);

        // http://productforums.google.com/forum/#!topic/merchant-center/nIVyFrJsjpk
        $link = str_replace('&', '&amp;', $link);

Of course, I do some pre-processing before $title, $post_content and $link are added to my database, but this should help solve some common problems in getting a valid RSS feed.

Update: Fixed the &#x26;#x26;#x26; "recursion" problem; see https://bugs.php.net/bug.php?id=33773
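The re-replacement problem can also be sidestepped with strtr(). When given an array, it scans the string once and does not re-examine text it has already substituted, so the order of the pairs stops mattering. A minimal sketch follows; `rss_title_escape()` is a hypothetical helper name, and the mapping covers only the two characters named in the RSS Profile quote.

```php
<?php
// Hypothetical helper (not from the original answer): replaces "&" and "<"
// with hexadecimal character references in a single pass.
function rss_title_escape(string $title): string
{
    // strtr() with an array never re-scans its own replacements, so the
    // "&" inside "&#x3C;" is not escaped a second time.
    return strtr($title, [
        '&' => '&#x26;',
        '<' => '&#x3C;',
    ]);
}

echo rss_title_escape('Q&A: is 1 < 2?'), "\n";
// prints: Q&#x26;A: is 1 &#x3C; 2?
```

Like the str_replace() version, this assumes the title has not already been escaped; running it twice would escape the "&" of the existing references.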

Take out the htmlentities(). It's only for HTML files.
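Removing htmlentities() on its own would leave a raw "&" in the feed, which is not well-formed XML, so some XML-level escaping is still needed. One possible sketch uses htmlspecialchars() with the ENT_XML1 flag, which escapes only the XML special characters and leaves characters such as "é" alone. `xml_title()` is a hypothetical helper name, and decoding existing HTML entities first is an assumption about the input.

```php
<?php
// Hypothetical helper (not from the original answer): produces text that is
// safe to place inside an XML element such as <title>.
function xml_title(string $title): string
{
    // Drop markup, as the question's code does.
    $plain = strip_tags($title);
    // Assumption: the stored title may already contain HTML entities such
    // as &eacute; or &amp;, so decode them to real characters first.
    $decoded = html_entity_decode($plain, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // Escape only &, < and > for XML; quotes are legal in element content.
    return htmlspecialchars($decoded, ENT_XML1 | ENT_NOQUOTES, 'UTF-8');
}

echo '<title>' . xml_title('Caf&eacute; & <b>Bar</b>') . '</title>', "\n";
```

This addresses well-formedness only. The output still uses "&amp;", whereas the feedvalidator.org recommendation quoted earlier prefers the hexadecimal reference "&#x26;".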
