简体   繁体   English

哪些PHP代码将帮助我以xml格式解析html数据?

[英]What PHP code will help me parse html data in an xml form?

FalseAWS.MechanicalTurk.XMLParseErrorThere was an error parsing the XML question or answer data in your request. Please make sure the data is well-formed and validates against the appropriate schema. (1284779956270)Array00

I'm trying to send entire emails to mechanical turk, and I am using the mtturk.lib.php library to send this. 我正在尝试将所有电子邮件发送到机械土耳其人,并且我正在使用mtturk.lib.php库进行发送。 I tried urlencode and htmlentities to attempt to send it, I'm sure there's a function that will make this code "formatted well enough" to send it through. 我尝试了urlencode和htmlentities尝试发送它,我确定有一个函数可以使此代码“格式正确”以将其发送出去。

$thequestion = '<a href="linkgoeshere" target="_blank">click here</a>';

$QuestionXML = '<QuestionForm xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionForm.xsd">
  <Question>
    <QuestionContent>
      <Text>'.$thequestion.'</Text>
    </QuestionContent>
    <AnswerSpecification>
      <FreeTextAnswer/>
    </AnswerSpecification>
  </Question>
</QuestionForm> ';

HTML is not a form of XML; HTML不是XML的一种形式。 don't try to parse it as such. 不要试图这样解析它。 Your best bet is to use a HTML5 parser, or, failing to obtain that, an SGML parser. 最好的选择是使用HTML5解析器,或者如果无法获取,则使用SGML解析器。

html code inside an xml document may be embedded in a number of ways: 可以通过多种方式嵌入xml文档中的html代码:

  1. escape it all the hard way with htmlspecialchars() and send it 使用htmlspecialchars()所有困难的转义并将其发送
  2. escape it with a <![CDATA[ ... ]]> section <![CDATA[ ... ]]>节将其转义
  3. convert it to XHTML, specify the right namespace in an xmlns attribute 将其转换为XHTML,在xmlns属性中指定正确的名称空间

different xml parsers may support or not the third method, I'd go with the first or the second one. 不同的xml解析器可能支持或不支持第三种方法,我会选择第一种或第二种。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM