Scrape data through XPath from a div that contains JavaScript in Scrapy (Python)

Question

I am working with Scrapy, scraping a site and using XPath to extract items. Some of the divs contain JavaScript. When my XPath goes all the way down to the div that contains the JavaScript code, it returns an empty list; when I leave that div element out of the path, I am able to fetch the HTML data.

HTML code

<div class="subContent2">    
   <div id="contentDetails">
       <div class="eventDetails">
            <h2>
                <a href="javascript:;" onclick="jdevents.getEvent(117032)">Some data</a>
            </h2>
       </div>
   </div>
</div>

Spider Code

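# Legacy Scrapy (0.x) import paths for the BaseSpider / HtmlXPathSelector API used here
from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector


class ExampleSpider(BaseSpider):
    name = "example"
    domain_name = "www.example.com"
    start_urls = ["http://www.example.com/jkl/index.php"]

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        # Selects the eventDetails div element itself, not the text inside it
        required_data = hxs.select('//div[@class="subContent2"]/div[@id="contentDetails"]/div[@class="eventDetails"]')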

So how can I get the text ("Some data") from the anchor tag inside the h2 element shown above? Is there an alternative way to fetch data from elements that contain JavaScript in Scrapy?

Answer 1

<div class="subContent2">    
   <div id="contentDetails">
       <div class="eventDetails">
            <h2>
                <a href="javascript:;" onclick="jdevents.getEvent(117032)">Some data</a>
            </h2>
       </div>
   </div>
</div>

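In this case the JavaScript code is not what stops you from getting the 'Some data' string. The XPath in your spider ends at the eventDetails div, so it selects that element rather than the text inside it.

You need to either select the text of the subnode: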

required_data = hxs.select('//div[@class="subContent2"]/div[@id="contentDetails"]/div[@class="eventDetails"]/h2/a/text()')

or use the string() function:

required_data = hxs.select('string(//div[@class="subContent2"]/div[@id="contentDetails"]/div[@class="eventDetails"])')
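
Both approaches can be checked outside of a spider. The following is only a minimal sketch: it uses Python's standard-library xml.etree.ElementTree as a stand-in for Scrapy's selector (an assumption made so the example runs without Scrapy installed), and it wraps the snippet in a body element purely for illustration. ElementTree supports only a subset of XPath, so text() and string() are approximated with .text and itertext().

```python
import xml.etree.ElementTree as ET

# The markup from the question, wrapped in <body> for illustration only.
HTML = """
<body>
<div class="subContent2">
   <div id="contentDetails">
       <div class="eventDetails">
            <h2>
                <a href="javascript:;" onclick="jdevents.getEvent(117032)">Some data</a>
            </h2>
       </div>
   </div>
</div>
</body>
"""

# The snippet happens to be well-formed XML, so ElementTree can parse it.
root = ET.fromstring(HTML.strip())

# Same path as in the question, in ElementTree's XPath subset.
BASE = ".//div[@class='subContent2']/div[@id='contentDetails']/div[@class='eventDetails']"

# Approach 1: step down to the anchor and read its text
# (the counterpart of ending the XPath with /h2/a/text()).
anchor = root.find(BASE + "/h2/a")
link_text = anchor.text

# Approach 2: join every descendant text node of the div
# (roughly what XPath's string() does), then trim whitespace.
event_div = root.find(BASE)
all_text = "".join(event_div.itertext()).strip()

print(link_text)
print(all_text)
```

The href="javascript:;" and onclick attributes are ordinary attribute values as far as the parser is concerned, which is why neither approach needs to treat them specially.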
