
Extract data between HTML tags using Scrapy (not XPath)

Original post: 2013-01-19 08:50:02 | Tags: python, html, scrapy

I am writing a Scrapy spider, and I want the user to be able to supply an HTML tag like <span class="someclass"></span> or <a style="somestuff"></a>, and then use these tags to extract the text in between and put it in my results. I really don't want the user to have to supply XPath. I understand it may be easier to code with XPath, but I will make my spider available to users who are not very tech-savvy.

How would you do that?

1 answer

Have a look at this:

http://django-dynamic-scraper.readthedocs.org/en/latest/

I have tried it; it works well, and you can link it with Django models as well.

You can get many ideas from it on how to take user input.
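Independently of the library linked above, one possible way to approach the question is to let the user type the tag and have the spider translate it into XPath internally. The following is only a minimal sketch under that assumption: `tag_to_xpath`, `_FirstTagParser`, and `_xpath_literal` are hypothetical helper names, not part of Scrapy or django-dynamic-scraper, and only the standard-library `html.parser` module is used.

```python
from html.parser import HTMLParser


class _FirstTagParser(HTMLParser):
    """Records the name and attributes of the first opening tag seen."""

    def __init__(self):
        super().__init__()
        self.tag = None
        self.attrs = []

    def handle_starttag(self, tag, attrs):
        # Only the first opening tag of the user's snippet matters here.
        if self.tag is None:
            self.tag, self.attrs = tag, attrs


def _xpath_literal(value):
    """Quote a string for XPath 1.0, which has no escape sequences."""
    if '"' not in value:
        return f'"{value}"'
    if "'" not in value:
        return f"'{value}'"
    # Value contains both quote types: glue the pieces together with concat().
    parts = value.split('"')
    return "concat(" + ", '\"', ".join(f'"{p}"' for p in parts) + ")"


def tag_to_xpath(snippet):
    """Turn a user-supplied tag such as '<span class="x"></span>' into an
    XPath expression selecting the text inside matching elements.

    Hypothetical helper: attribute values are matched exactly.
    """
    parser = _FirstTagParser()
    parser.feed(snippet)
    parser.close()
    if parser.tag is None:
        raise ValueError(f"no HTML tag found in {snippet!r}")
    predicates = "".join(
        # A valueless attribute (e.g. <input disabled>) only tests presence.
        f"[@{name}]" if value is None else f"[@{name}={_xpath_literal(value)}]"
        for name, value in parser.attrs
    )
    return f"//{parser.tag}{predicates}//text()"


print(tag_to_xpath('<span class="someclass"></span>'))
print(tag_to_xpath('<a style="somestuff"></a>'))
```

Inside a spider callback, the generated expression could then be passed to the response selector, for example `response.xpath(tag_to_xpath(user_tag)).extract()`, so the user never sees XPath. Note that this sketch compares attribute values exactly, so `<span class="someclass">` would not match an element whose class attribute is `"someclass other"`; handling that case would need a looser predicate.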

Scrapy 1.0.3 scraped data has <value> tags using xpath and extract()

Scrapy Xpath How should I handle missing data between tags in a table?

How to extract text data from multiple tags using response.XPath in Scrapy?

Scrapy xpath with following sibling between two h2 tags

Selenium / XPath Getting HTML between two tags

Using Css Selectors or xpath to extract data in scrapy

Scrapy xpath between 2 keywords

Using Scrapy Python not able to extract data from response html with xpath due to namespace

Extracting data from HTML table using scrapy: response.xpath() yields None

capturing states between tags in python using xpath




solution1 0 2013-01-20 02:13:51
