
Normalize space for list items and extract as array using Scrapy

I am looking for an efficient way to extract list items as an array. They need to be stripped of any extra spaces. Currently I am doing this:

actions = []
actions_list = sel.xpath('//div[label="Actions Taken"]/article/div/ul')
action_items = actions_list.xpath('li')
for a in action_items:
    actions.append(a.xpath('normalize-space(text())')[0].extract())

The actions array gets stored in my database. Is there a more efficient way of doing this in Scrapy?

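Chaining a second XPath call onto the selected li nodes gives the same result as your loop, in one statement:

actions = sel.xpath('//div[label="Actions Taken"]/article/div/ul/li').xpath('normalize-space(text())').extract()

Folding everything into a single expression such as normalize-space(//div[label="Actions Taken"]/article/div/ul/li/text()) does not work here. In XPath 1.0, normalize-space() converts a node-set argument to the string value of its first node only, so you would get one string instead of one per item. Note also that XPath positions start at 1, so a [0] predicate matches nothing.

Whether text() is enough depends on the page: it only looks at the first text node directly inside each li. If the items contain nested markup, use normalize-space(.) to capture all of the text in each li.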
