
PHP: Recursive htmlspecialchars on object

I want to set up a generic sanitizer for data that comes from various sources. By sanitizing I mean (at this stage) applying htmlspecialchars to strings. The data from these sources can be anything from an object to an array to a string, all nested (and complicated), and the format is always a bit different.

So I thought of a recursive htmlspecialchars function that applies itself to arrays and objects, and only applies htmlspecialchars to strings. But how do I walk an object recursively?

Thanks.

EDIT: I think I should have mentioned this: I am actually building a RIA that relies heavily on JS and JSON for client-server communication. The only thing the server does is fetch data from the database and return it to the client as JSON, in the following format:

{"stat":"ok","data":{...}}

As I said, the data could be anything: it comes not only from a DB in the form of strings, but also from XML. The workflow to process the JSON is as follows:

  1. Fetch data from the DB/XML (source encoding is iso-8859-1)
  2. Put it into the "data" array

  3. Recursively convert from iso-8859-1 to utf-8 using

     private function utf8_encode_deep(&$input) {
         if (is_string($input)) {
             $input = $this->str_encode_utf8($input);
         } else if (is_array($input)) {
             foreach ($input as &$value) {
                 $this->utf8_encode_deep($value);
             }
             unset($value);
         } else if (is_object($input)) {
             $vars = array_keys(get_object_vars($input));
             foreach ($vars as $var) {
                 $this->utf8_encode_deep($input->$var);
             }
         }
     }

  4. Use PHP's json_encode to convert the data into JSON

  5. Send (echo) the data to the client

  6. Render the data using JS (e.g. putting it into a table)
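Steps 2 to 5 can be sketched roughly as follows. This is only an illustration under assumptions: `to_utf8_deep` and `build_response` are hypothetical names rather than the asker's code, `iconv` stands in for the `str_encode_utf8` helper that is not shown, and the sketch returns converted copies instead of modifying its input by reference.

```php
<?php
// Hypothetical helper: recursively convert ISO-8859-1 strings to UTF-8.
// iconv() is an assumption standing in for the unshown str_encode_utf8().
function to_utf8_deep($input)
{
    if (is_string($input)) {
        return iconv('ISO-8859-1', 'UTF-8', $input);
    }
    if (is_array($input)) {
        // Array keys are left untouched in this sketch
        foreach ($input as $key => $value) {
            $input[$key] = to_utf8_deep($value);
        }
        return $input;
    }
    if (is_object($input)) {
        // Work on a shallow copy so the caller's object is not changed
        $copy = clone $input;
        foreach (get_object_vars($copy) as $name => $value) {
            $copy->$name = to_utf8_deep($value);
        }
        return $copy;
    }
    return $input; // ints, floats, bools and null pass through
}

// Hypothetical helper: wrap the data in the {"stat":...,"data":...} envelope.
function build_response($data)
{
    return json_encode(array('stat' => 'ok', 'data' => to_utf8_deep($data)));
}

$row = new stdClass();
$row->name = "caf\xE9"; // "café" encoded as ISO-8859-1
echo build_response(array($row, 'plain')), "\n";
```

Note that nothing in this pipeline is HTML-specific yet; the result is plain JSON text.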

Somewhere in between, the data should be sanitized (at this stage only with htmlspecialchars). So the question should have been: where to sanitize, and using what method?

You only want to escape when outputting into HTML. And you cannot output a complete array or object into HTML, so escaping everything up front seems wrong.

You have one level of indirection because of your JSON output. So you cannot decide in PHP what context the data will be used in: JSON is still plain text, not HTML.
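A small illustration of that point, using made-up sample data: `json_encode` produces JSON text and performs no HTML escaping, so the characters survive a round trip unchanged.

```php
<?php
// json_encode() only applies JSON escaping (here: "/" becomes "\/").
$json = json_encode(array('stat' => 'ok', 'data' => '<b>BOLD</b>'));
echo $json, "\n"; // {"stat":"ok","data":"<b>BOLD<\/b>"}

// Decoding gives back exactly the original characters, angle brackets included.
$decoded = json_decode($json, true);
var_dump($decoded['data'] === '<b>BOLD</b>'); // bool(true)
```

Whether those angle brackets ever become markup depends entirely on what the client does with the decoded string.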

So to decide whether any data inside the JSON must be escaped for HTML, we must know how your Javascript uses the JSON data.

Example: If your JSON is treated as plain text and contains something like <b>BOLD</b>, then the expected result when it is used inside any HTML is exactly this text, including the characters that look like HTML tags, but without bold typesetting. This will only happen if your Javascript client handles this text as plain text, e.g. it does NOT assign it to innerHTML to place it on the page, because that would activate the HTML tags, but only uses innerText or textContent or an equivalent convenience method, e.g. jQuery's .text().

If on the other hand you expect the JSON to include ready-made HTML that is assigned to innerHTML, then you have to escape this string before it is put into JSON. BUT you must escape the whole string only if you do not want to add any formatting to it. Otherwise you are in a situation that uses templates to mix predefined formatting with user content: the user content has to be escaped when it is put into the HTML context, but the resulting HTML as a whole must not be escaped again, otherwise Javascript cannot assign it to innerHTML and have the formatting take effect.
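The template case might look like the following sketch. `render_comment` and its sample input are hypothetical; the point is only that the user-supplied parts are escaped individually while the surrounding markup is left alone.

```php
<?php
// Hypothetical template helper: the markup is trusted, the user content is not.
function render_comment($author, $text)
{
    $safeAuthor = htmlspecialchars($author, ENT_QUOTES, 'UTF-8');
    $safeText   = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    // Only the user-supplied parts were escaped; the <p> and <b> tags stay live.
    return '<p><b>' . $safeAuthor . '</b>: ' . $safeText . '</p>';
}

$html = render_comment('Bob <admin>', 'I like <b>bold</b> & "quotes"');

// The finished fragment goes into JSON as-is, without a second round of escaping.
echo json_encode(array('stat' => 'ok', 'data' => $html)), "\n";
```

Escaping `$html` again before encoding would turn the `<p>` and `<b>` tags into visible text on the client.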

Basically, globally escaping everything inside your array or object is most likely wrong, unless you know that every single value will be used in an HTML context by your Javascript.

You can try the following:

class MyClass {
    public $var1 = '<b>value 1</b>';
    public $var2 = '<b>value 2</b>';
    public $var3 = array('<b>value 3</b>');
}

$list = array();
$list[0]['nice'] = range("A", "C");
$list[0]['bad'] = array("<div>A</div>","<div>B</div>","<div>C</div>",new MyClass());
$list["<b>gloo</b>"] = array(new MyClass(),"<b>WOW</b>");

var_dump(__htmlspecialchars($list));

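Function used:

function __htmlspecialchars($data) {
    if (is_array($data)) {
        $escaped = array();
        foreach ( $data as $key => $value ) {
            // Escape string keys too; integer keys are kept as they are
            $key = is_string($key) ? htmlspecialchars($key) : $key;
            $escaped[$key] = __htmlspecialchars($value);
        }
        return $escaped;
    } else if (is_object($data)) {
        // get_object_vars() reads the instance's current values;
        // get_class_vars() would only return the declared defaults
        foreach ( get_object_vars($data) as $key => $value ) {
            $data->$key = __htmlspecialchars($value);
        }
    } else if (is_string($data)) {
        // Non-string scalars and null are returned unchanged
        $data = htmlspecialchars($data);
    }
    return $data;
}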
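When walking an object's properties, it is worth keeping in mind the difference between `get_class_vars()` (the declared defaults of the class) and `get_object_vars()` (the current values of one instance). A small illustration, with a hypothetical `Item` class:

```php
<?php
// Hypothetical class used only for this comparison
class Item
{
    public $title = 'default';
}

$item = new Item();
$item->title = '<i>changed</i>';

// Declared defaults of the class, regardless of the instance's state
$defaults = get_class_vars(get_class($item));

// Current values of this particular instance
$current = get_object_vars($item);

var_dump($defaults['title']); // string(7) "default"
var_dump($current['title']);  // string(14) "<i>changed</i>"
```

A recursive sanitizer that reads values from `get_class_vars()` would therefore overwrite instance data with escaped defaults.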

For the $list above, the output of __htmlspecialchars is something like:

array
  0 => 
    array
      'nice' => 
        array
          0 => string 'A' (length=1)
          1 => string 'B' (length=1)
          2 => string 'C' (length=1)
      'bad' => 
        array
          0 => string '&lt;div&gt;A&lt;/div&gt;' (length=24)
          1 => string '&lt;div&gt;B&lt;/div&gt;' (length=24)
          2 => string '&lt;div&gt;C&lt;/div&gt;' (length=24)
          3 => 
            object(MyClass)[1]
              ...


    array
      0 => 
        object(MyClass)[2]
          public 'var1' => string '&lt;b&gt;value 1&lt;/b&gt;' (length=26)
          public 'var2' => string '&lt;b&gt;value 2&lt;/b&gt;' (length=26)
          public 'var3' => 
            array
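              ...

function htmlrecursive($data){
    if (is_array($data)){
        // Recurse into every element, however many there are
        foreach ($data as &$d){
            $d = htmlrecursive($d);
        }
        unset($d); // break the reference left over from the loop
        return $data;
    }
    // Only strings need escaping; other values pass through unchanged
    return is_string($data) ? htmlspecialchars($data) : $data;
}

$array = htmlrecursive($array);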

For objects, you need to implement the ArrayAccess interface; then you can use array_walk_recursive.

Also see the question "Getting an object to work with array_walk_recursive in PHP".
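For plain nested arrays, `array_walk_recursive` on its own is enough. A minimal sketch with made-up sample data (the `$data` array is an assumption, and objects are deliberately left out here):

```php
<?php
$data = array(
    'title' => '<b>Hi</b>',
    'rows'  => array(
        array('id' => 1, 'name' => 'A & B'),
        array('id' => 2, 'name' => '<i>C</i>'),
    ),
);

// The callback receives each leaf value by reference, so it can modify it in place.
array_walk_recursive($data, function (&$value, $key) {
    if (is_string($value)) {
        $value = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
    }
});

echo json_encode($data), "\n";
```

Non-string leaves such as the integer ids are skipped by the `is_string` check, so they keep their type.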
