
Parsing XML with an encoding declaration in Python

I have some Python experience, but not a lot. I have not worked with XML in Python before, but now I have to. I have an XML document in a string that I am trying to parse in Python. I want to store this XML in a DataFrame, but I am unable to parse it.

import base64
import lxml.etree as ET

lz4UC = rs['trade']['uc']
UC = lz4ToString(base64.b64decode(lz4UC))  # UC is a str holding the XML
parser = ET.XMLParser(recover=True)
tree = ET.parse(UC, parser=parser)  # option 1
# tree2 = ET.fromstring(UC, parser=parser)  # option 2

Error message with option 1:

OSError: Error reading file '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

Error message with option 2:

ValueError: Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration.

UC looks like:

'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<positionEventMessage xmlns="urn:XXXX:uc" xmlns:td="urn:XXXX:uc:trade-id" xmlns:dt="http://www.dtcc.com/ext" xmlns:ip="urn:XXXX:ipt" xmlns:fpml="http://www.fpml.org/FpML-5/recordkeeping" xmlns:dtx="urn:XXXX:dtcc-5-ext" xmlns:w3="http://www.w3.org/2000/09/xmldsig#" xmlns:XXXX="urn:XXXX:fpml-5-ext" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <header>
        <sourceSystem>RODS</sourceSystem>
        <originatingSystem>MXG2000</originatingSystem>
        <timestamp>2020-07-04T16:23:46Z</timestamp>
    </header>
    <positionEvent>
        <eventType>Position:Update</eventType>
        <businessDate>2020-07-04</businessDate>
        <businessTime>16:23:46.046Z</businessTime>
        <position>
            <primaryAssetClass>Cash</primaryAssetClass>
            <productType productTypeScheme="urn:XXXX:product-type:RODS">ACFACFACF</productType>
            <productType productTypeScheme="urn:XXXX:product-type:RODS:qlDesc">ACF-FXD</productType>
            <owner>
                <partyReference href="Party1"/>
                <accountReference href="Account1"/>
            </owner>
            <aggregationCategory aggregationCategoryScheme="urn:XXXX:aggregation-category:MUREX:instrument">ACF-FXD</aggregationCategory>
            <currencyPair>
                <fpml:currency1>USD</fpml:currency1>
                <fpml:currency2>SAR</fpml:currency2>
            </currencyPair>
            <positionId positionIdScheme="urn:XXXX:position-id:HTI">0000002442892000207911</positionId>
            <positionId positionIdScheme="urn:XXXX:position-id:RODS:regulatory-key">999999999894891</positionId>
            <positionId positionIdScheme="urn:XXXX:position-id:RODS:valuation-id">USDSAR209</positionId>
            <positionId positionIdScheme="urn:XXXX:position-id:RODS:GlobalId">2000207911</positionId>
            <version>20151207000000000</version>
            <fpml:cash>
                <fpml:currency>SAR</fpml:currency>
            </fpml:cash>
            <positionType>Long</positionType>
            <quantity>7426113.8099999996</quantity>
            <internalProductType>
                <ip:productType productName="FX - SIMPLE CASH FLOW"/>
            </internalProductType>
        </position>
    </positionEvent>
    <party id="Party1">
        <fpml:partyId partyIdScheme="urn:XXXX:party-id:PO_ID">PO7</fpml:partyId>
        <fpml:partyId partyIdScheme="urn:XXXX:party-id:PO_GROUP">LOH</fpml:partyId>
        <fpml:partyId partyIdScheme="urn:XXXX:party-id:GROUP_ID">MDBK</fpml:partyId>
        <fpml:partyId partyIdScheme="urn:XXXX:party-id:BRANCH_ID">610</fpml:partyId>
        <fpml:partyId partyIdScheme="urn:XXXX:party-id:GRID_ID">43146</fpml:partyId>
    </party>
    <account id="Account1">
        <fpml:accountId accountIdScheme="urn:XXXX:book-id:RODS">209</fpml:accountId>
        <fpml:accountId accountIdScheme="urn:XXXX:book-id:HMS">FO0025489</fpml:accountId>
        <fpml:accountBeneficiary href="Party1"/>
    </account>
</positionEventMessage>'

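Try it this way, passing bytes instead of a str:

uc = """[your xml above]"""
tree = ET.XML(uc.encode())  # returns the root element

and see if that works. Option 1 fails because ET.parse expects a file name or file object, not the XML text itself. Option 2 fails because lxml rejects a str that carries an encoding declaration; encoding the string to bytes first avoids that.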
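Once the bytes parse succeeds, the next step toward a DataFrame could look like the sketch below. It is only an illustration under a few assumptions: the sample string is a shortened stand-in for the decoded UC value, the prefix map NS and the helpers local_name and position_rows are hypothetical names, and the stdlib fallback import is there only so the snippet still runs where lxml is not installed.

```python
try:
    import lxml.etree as ET  # the library used in the question
except ImportError:
    # Assumption: fall back to the stdlib parser so the sketch still runs.
    import xml.etree.ElementTree as ET

# Shortened stand-in for the decoded UC string (illustrative sample data).
uc = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<positionEventMessage xmlns="urn:XXXX:uc" xmlns:fpml="http://www.fpml.org/FpML-5/recordkeeping">
    <header>
        <sourceSystem>RODS</sourceSystem>
    </header>
    <positionEvent>
        <position>
            <currencyPair>
                <fpml:currency1>USD</fpml:currency1>
                <fpml:currency2>SAR</fpml:currency2>
            </currencyPair>
            <positionType>Long</positionType>
            <quantity>7426113.8099999996</quantity>
        </position>
    </positionEvent>
</positionEventMessage>"""

# Encode to bytes first: lxml refuses str input with an encoding declaration.
root = ET.fromstring(uc.encode("utf-8"))

# Prefix map for lookups; the prefixes "uc" and "fpml" are local choices.
NS = {
    "uc": "urn:XXXX:uc",
    "fpml": "http://www.fpml.org/FpML-5/recordkeeping",
}


def local_name(tag):
    """Strip the '{namespace}' part that the parser prepends to tag names."""
    return tag.rsplit("}", 1)[-1]


def position_rows(root):
    """Return one flat dict per <position>, keyed by leaf element name."""
    rows = []
    for pos in root.findall(".//uc:position", namespaces=NS):
        row = {}
        for el in pos.iter():
            # Keep only leaf elements with text; skip comments and
            # processing instructions, whose tag is not a string in lxml.
            is_leaf = isinstance(el.tag, str) and len(el) == 0
            if is_leaf and el.text and el.text.strip():
                row[local_name(el.tag)] = el.text.strip()
        rows.append(row)
    return rows


rows = position_rows(root)
print(rows)
```

If pandas is installed, pandas.DataFrame(rows) should turn these dicts into a one-row-per-position table. Note that this flattening keeps only the last value when a tag repeats (as productType and positionId do in the full message), so those would need a key that includes the scheme attribute.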
