
Parse JSON from HTML responseText

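Python's webob module returns text/html responses by default, notably for server error responses, and ends up embedding the error JSON payload in the body of the HTML response text, like this: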

<html>
<head>
  <title>503 Service Unavailable</title>
</head>
<body>
<h1>503 Service Unavailable</h1>
{
    "status": "object-specific error",
    "payload": {
            "Message": "Unable to list resources",
            "HTTP Method": "GET",
            "URI": "api/myManager/1.0/Node",
            "Operation": "LIST",
            "Object": {
                    "Name": "myManager.Node",
                    "Interface": "Node"
            },
            "Version": {
                    "Major": 1,
                    "Minor": 0
            }
       }
}<br /><br />
</body>
</html>

What is the best way to extract the JSON object embedded in the HTML, using Javascript on the client side?

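So, overall I agree that the better solution is to make sure the server returns only JSON. But as @Barmer suggested, here is a quick way to do it with Javascript on the client: parse the HTML into a DOM, take the text child node of the body, and run JSON.parse on it.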

var responseStr = '<html>' +
                  '<head>' +
                  '  <title>503 Service Unavailable</title>' +
                  '</head>' +
                  '<body>' +
                  '<h1>503 Service Unavailable</h1>' +
                  '{' +
                  '  "status": "object-specific error",' +
                  '  "payload": {' +
                  '    "Message": "Unable to list resources",' +
                  '    "HTTP Method": "GET",' +
                  '    "URI": "api/myManager/1.0/Node",' +
                  '    "Operation": "LIST",' +
                  '    "Object": {' +
                  '      "Name": "myManager.Node",' +
                  '      "Interface": "Node"' +
                  '    },' +
                  '    "Version": {' +
                  '      "Major": 1,' +
                  '      "Minor": 0' +
                  '    }' +
                  '  }' +
                  '}<br /><br />' +
                  '</body>' +
                  '</html>';
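// Parse the response into a DOM so the browser handles the markup.
var parser = new DOMParser();
var doc = parser.parseFromString(responseStr, "text/html");
var json_obj;

// The JSON is a bare text node directly under <body>. Take the first
// non-blank one: a real response (with newlines between tags) also
// contains whitespace-only text nodes, which JSON.parse would reject.
for (var i = 0, len = doc.body.childNodes.length; i < len; i++) {
    var node = doc.body.childNodes[i];
    if (node.nodeType === Node.TEXT_NODE && node.data.trim() !== "") {
        json_obj = JSON.parse(node.data);
        break;
    }
}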

// You can access json directly now e.g.
console.log(json_obj.status);
console.log(json_obj.payload['HTTP Method']);

Parsing with a regex (not very robust, but efficient):

import re
import json

content = """\
<html>
<head>
  <title>503 Service Unavailable</title>
</head>
<body>
<h1>503 Service Unavailable</h1>
{
    "status": "object-specific error",
    "payload": {
            "Message": "Unable to list resources",
            "HTTP Method": "GET",
            "URI": "api/myManager/1.0/Node",
            "Operation": "LIST",
            "Object": {
                    "Name": "myManager.Node",
                    "Interface": "Node"
            },
            "Version": {
                    "Major": 1,
                    "Minor": 0
            }
       }
}<br /><br />
</body>
</html>"""

mo = re.search(r"</h1>(.*?)<br", content, flags=re.DOTALL)
if mo:
    data = mo.group(1)
    obj = json.loads(data)
    print(obj)

You will get:

{'payload': {'Operation': 'LIST', 'HTTP Method': 'GET',
'URI': 'api/myManager/1.0/Node',
'Message': 'Unable to list resources',
'Version': {'Major': 1, 'Minor': 0},
'Object': {'Interface': 'Node', 'Name': 'myManager.Node'}},
'status': 'object-specific error'}
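The regex above depends on the JSON sitting exactly between `</h1>` and `<br`. If that layout is not guaranteed, one possible alternative (a sketch, not part of the original answer) is to let the standard library's `json.JSONDecoder.raw_decode` find where the JSON value ends. The helper name `extract_first_json` and the shortened `sample` string are made up for illustration.

```python
import json


def extract_first_json(text):
    """Return the first JSON object embedded in text, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value beginning at `start` and
            # ignores whatever follows it (here, the trailing <br /> tags).
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # This brace did not open valid JSON; try the next one.
            start = text.find("{", start + 1)
    return None


# Shortened stand-in for the 503 response body (hypothetical sample).
sample = (
    "<body><h1>503 Service Unavailable</h1>\n"
    '{"status": "object-specific error", '
    '"payload": {"Version": {"Major": 1, "Minor": 0}}}'
    "<br /><br /></body>"
)

result = extract_first_json(sample)
print(result["status"])  # object-specific error
```

Note that this works on the raw text, so any HTML entities inside the JSON (for example `&quot;`) would not be decoded.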

Alternatively, use lxml:

import json
from lxml import etree

content = """\
<html>
...
</html>"""

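# etree.XML is a strict XML parser; it works here because the sample
# response happens to be well-formed (note the self-closing <br />).
tree = etree.XML(content)

h1 = tree.xpath("/html/body/h1[1]")[0]
# .tail is the text that follows the closing </h1> tag, i.e. the JSON.
data = h1.tail
obj = json.loads(data)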

The result is the same.
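Because `etree.XML` is a strict XML parser, it raises an error if the server ever emits HTML that is not well-formed, such as an unclosed `<br>`. As a rough sketch of a more tolerant variant that needs only the standard library, `html.parser.HTMLParser` can collect each run of text between tags and keep the first one that parses as a JSON object. The names `TextCollector` and `json_from_html` and the shortened `sample` are assumptions made for this example.

```python
import json
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect every run of text found between tags."""

    def __init__(self):
        # convert_charrefs defaults to True, so entities such as &quot;
        # are decoded before handle_data sees the text.
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def json_from_html(html_text):
    """Return the first text run that parses as a JSON object, or None."""
    collector = TextCollector()
    collector.feed(html_text)
    collector.close()
    for chunk in collector.chunks:
        try:
            obj = json.loads(chunk)
        except json.JSONDecodeError:
            continue  # headings, titles, and blank runs are not JSON
        if isinstance(obj, dict):
            return obj
    return None


# Shortened stand-in for the 503 response, with unclosed <br> tags
# that a strict XML parser would reject (hypothetical sample).
sample = (
    "<html><head><title>503 Service Unavailable</title></head>"
    "<body><h1>503 Service Unavailable</h1>\n"
    '{"status": "object-specific error", "payload": {"HTTP Method": "GET"}}'
    "<br><br></body></html>"
)

obj = json_from_html(sample)
print(obj["payload"]["HTTP Method"])  # GET
```

Requiring a `dict` result avoids mistaking a text run such as a bare number for the payload.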


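Notice: technical posts on this site follow the CC BY-SA 4.0 license. If you repost, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.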

 