I'm trying to parse XML files from a SharePoint form library where the user has copy/pasted formatted Word document text into a text field. The result is XML inside of XML. I had trouble just getting the contents, but with help from another question this expression worked:
xpath(xml(outputs('Get_file_content')?['body']),'//*[local-name()="myFields"]//following-sibling::*[local-name()="Request_Description"]')[0]
The result is something like this:
<my:Request_Description xmlns:my="http://schemas.microsoft.com/office/infopath/2003/myXSD/2017-05-05T14:19:13">
<xhtml:html xml:space="preserve" xmlns="http://www.w3.org/1999/xhtml" xmlns:xhtml="http://www.w3.org/1999/xhtml">
<xhtml:div>
<xhtml:font size="1" face="CIDFont+F6">
<xhtml:font size="1" face="CIDFont+F6">
<xhtml:p>This is where the request description goes and the result we want</xhtml:p>
</xhtml:font>
</xhtml:font>
</xhtml:div>
</xhtml:html>
</my:Request_Description>
How do I go about extracting just the text of the description? I'm wondering if my first xpath statement needs to be adjusted so that it does not pull back the entire element.
UPDATE - I failed to mention that the above was just one example of the user input to that field, and each form will be different. Here is another example of what can be found in that field:
<my:Request_Description xmlns:my="http://schemas.microsoft.com/office/infopath/2003/myXSD/2017-05-05T14:19:13">
<xhtml:html xml:space="preserve" xmlns="http://www.w3.org/1999/xhtml" xmlns:xhtml="http://www.w3.org/1999/xhtml">
<xhtml:div>test random double quote inside title "here" test and carriage<xhtml:br />return</xhtml:div>
</xhtml:html>
</my:Request_Description>
This is caused by an RTF control on the form: the user types into a textbox and the control converts the input to the XML you see. Since there is no consistency, I'm wondering whether xpath is a viable option at all, but I'm not sure what else could be done.
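You can use this expression, which follows the exact nesting of the first example (the leading // finds Request_Description at any depth in the file, since it is not the root element):
xpath(xml(outputs('Get_file_content')?['body']), 'string(//*[local-name()="Request_Description"]/*[local-name()="html"]/*[local-name()="div"]/*[local-name()="font"]/*[local-name()="font"]/*[local-name()="p"])')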
See the official documentation for details on the usage of xpath.
========================update===========================
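Since the markup inside the field differs from form to form, take the string value of the whole Request_Description element (the concatenated text of all its descendants) and wrap it in trim to drop the surrounding whitespace:
trim(xpath(xml(outputs('Get_file_content')?['body']), 'string(//*[local-name()="Request_Description"])'))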
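To check what the string-value approach returns for both example inputs, here is a minimal sketch in Python using only the standard library. It is not Power Automate code: the helper names local_name and description_text are made up for illustration, and the sketch assumes that joining an element's descendant text and stripping it mirrors what string(...) plus trim do in the expression above.

```python
import xml.etree.ElementTree as ET

# The two example field values from the question, copied verbatim.
SAMPLE_NESTED = """<my:Request_Description xmlns:my="http://schemas.microsoft.com/office/infopath/2003/myXSD/2017-05-05T14:19:13">
<xhtml:html xml:space="preserve" xmlns="http://www.w3.org/1999/xhtml" xmlns:xhtml="http://www.w3.org/1999/xhtml">
<xhtml:div>
<xhtml:font size="1" face="CIDFont+F6">
<xhtml:font size="1" face="CIDFont+F6">
<xhtml:p>This is where the request description goes and the result we want</xhtml:p>
</xhtml:font>
</xhtml:font>
</xhtml:div>
</xhtml:html>
</my:Request_Description>"""

SAMPLE_FLAT = """<my:Request_Description xmlns:my="http://schemas.microsoft.com/office/infopath/2003/myXSD/2017-05-05T14:19:13">
<xhtml:html xml:space="preserve" xmlns="http://www.w3.org/1999/xhtml" xmlns:xhtml="http://www.w3.org/1999/xhtml">
<xhtml:div>test random double quote inside title "here" test and carriage<xhtml:br />return</xhtml:div>
</xhtml:html>
</my:Request_Description>"""


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def description_text(xml_source):
    """Return the trimmed text of the first Request_Description element.

    Hypothetical helper: it joins every descendant text node, which is the
    same idea as XPath string(), and then strips surrounding whitespace.
    """
    root = ET.fromstring(xml_source)
    for element in root.iter():  # iter() includes the root element itself
        if local_name(element.tag) == "Request_Description":
            return "".join(element.itertext()).strip()
    return ""


if __name__ == "__main__":
    print(description_text(SAMPLE_NESTED))
    print(description_text(SAMPLE_FLAT))
```

Note that an empty element such as xhtml:br contributes no text, so in the second example the words on either side of it run together ("carriagereturn"). If line breaks matter, they would need to be handled separately before taking the string value.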