
Finding different nodes and values from xml using lxml

I can't work out why lxml will parse some parts of the XML I'm looking through but not others.

The following snippet works and gives me all the titles I need:

import lxml.html as lh  # assumed import; the post only shows the alias lh

doc = lh.fromstring(resp)
for product in doc.xpath('.//item'):
    prices = product.xpath(".//title/text()")

However, a simple change to

doc = lh.fromstring(resp)
for product in doc.xpath('.//item'):
    prices = product.xpath(".//itemId/text()")

or

doc = lh.fromstring(resp)
for product in doc.xpath('.//item'):
    prices = product.xpath(".//globalId/text()")

just returns prices as a number of empty results.

XML given below...

<findItemsByProductResponse>
  <ack>Success</ack>
  <version>1.12.0</version>
  <timestamp>2013-02-04T13:35:57.106Z</timestamp>
  <searchResult count="31">
    <item>
      <itemId>130842622974</itemId>
      <title>BONES - COMPLETE SEASON 4 - BLURAY</title>
      <globalId>EBAY-US</globalId>
      <primaryCategory>
        <categoryId>617</categoryId>
        <categoryName>DVDs &amp; Blu-ray Discs</categoryName>
      </primaryCategory>
      <galleryURL>
        http://thumbs3.ebaystatic.com/m/mnuTBPOWZ-6F4kIHS1mj3gg/140.jpg
      </galleryURL>
      <viewItemURL>
        http://www.ebay.com/itm/BONES-COMPLETE-SEASON-4-BLURAY-/130842622974?pt=US_DVD_HD_DVD_Blu_ray
      </viewItemURL>
      <productId type="ReferenceID">78523575</productId>
      <paymentMethod>PayPal</paymentMethod>
      <autoPay>false</autoPay>
      <postalCode>60544</postalCode>
      <location>Plainfield,IL,USA</location>
      <country>US</country>
      <shippingInfo>
        <shippingServiceCost currencyId="USD">0.0</shippingServiceCost>
        <shippingType>Free</shippingType>
        <shipToLocations>Worldwide</shipToLocations>
        <expeditedShipping>true</expeditedShipping>
        <oneDayShippingAvailable>false</oneDayShippingAvailable>
        <handlingTime>1</handlingTime>
      </shippingInfo>
      <sellingStatus>
        <currentPrice currencyId="USD">12.99</currentPrice>
        <convertedCurrentPrice currencyId="USD">12.99</convertedCurrentPrice>
        <sellingState>Active</sellingState>
        <timeLeft>P3DT23H12M7S</timeLeft>
      </sellingStatus>
      <listingInfo>
        <bestOfferEnabled>false</bestOfferEnabled>
        <buyItNowAvailable>false</buyItNowAvailable>
        <startTime>2013-01-29T12:48:04.000Z</startTime>
        <endTime>2013-02-08T12:48:04.000Z</endTime>
        <listingType>FixedPrice</listingType>
        <gift>false</gift>
      </listingInfo>
      <returnsAccepted>true</returnsAccepted>
      <condition>
        <conditionId>1000</conditionId>
        <conditionDisplayName>Brand New</conditionDisplayName>
      </condition>
      <isMultiVariationListing>false</isMultiVariationListing>
      <topRatedListing>true</topRatedListing>
    </item>

P.S. As an added bonus, my next step is to find the convertedCurrentPrice (I just thought I should solve one mystery at a time). The code I was going to use looks something like this:

doc = lh.fromstring(resp)
for product in doc.xpath('.//item'):
    prices = product.xpath(".//sellingStatus/convertedCurrentPrice/text()")

Does that seem about right or are there better ways of doing this?

Thanks,

Matt

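Try itemid instead. Assuming lh is lxml.html, its HTML parser converts tag names to lowercase, so itemId and globalId end up as itemid and globalid; title works only because it is already all lowercase. Also, why not query the values directly:

doc.xpath('.//item/itemid/text()')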
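
To illustrate the lowercasing, here is a minimal sketch. It assumes lh is lxml.html and that the response is well-formed XML; the sample string is a trimmed, hypothetical stand-in for resp, not the real eBay response. It also shows lxml.etree as a possible alternative: its XML parser keeps tag names as written, so the original mixed-case XPath expressions (including the convertedCurrentPrice one from the question) should work unchanged.

```python
import lxml.html as lh      # assumption: this is what the question calls lh
from lxml import etree

# Trimmed stand-in for resp (hypothetical, based on the XML in the question).
sample = """<findItemsByProductResponse>
  <searchResult count="1">
    <item>
      <itemId>130842622974</itemId>
      <globalId>EBAY-US</globalId>
      <sellingStatus>
        <convertedCurrentPrice currencyId="USD">12.99</convertedCurrentPrice>
      </sellingStatus>
    </item>
  </searchResult>
</findItemsByProductResponse>"""

# HTML parser: tag names are lowercased, so mixed-case names match nothing.
html_doc = lh.fromstring(sample)
html_mixed = html_doc.xpath('.//item/itemId/text()')   # no match
html_lower = html_doc.xpath('.//item/itemid/text()')   # matches the item ID

# XML parser: tag names keep their original case.
xml_doc = etree.fromstring(sample)
xml_ids = xml_doc.xpath('.//item/itemId/text()')
xml_prices = xml_doc.xpath(
    './/item/sellingStatus/convertedCurrentPrice/text()'
)

print(html_mixed, html_lower, xml_ids, xml_prices)
```

If you stay with lxml.html, remember to lowercase every element name in the path, for example .//sellingstatus/convertedcurrentprice/text().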
