I am writing a Scrapy spider and I want the user to be able to supply an HTML tag like <span class="someclass"></span>
or <a style="somestuff"></a>
and then use these tags to extract the text in between and put it in my results. I really don't want the user to have to supply XPath. I understand it may be easier to code with XPath, but I will make my spider available to users who are not so tech-savvy.
How would you do that?
Have a look at django-dynamic-scraper:
http://django-dynamic-scraper.readthedocs.org/en/latest/
I have tried it and it works well, and you can link it with Django models too.
You can get plenty of ideas from it, including how to take user input.