Removing whitespace using strip()

Question

How can I strip [u'\n\n\n result here \n\n\n'] and get just [u'result here'] as the result? I'm using Scrapy.

def parse_items(self, response):
  str = ""
  hxs = HtmlXPathSelector(response)

  for titles in titles:
      item = CraigslistSampleItem()
      item ["job_id"] = id.select('text()').extract() #ok
      items.append(item)
  return(items)

Can anyone help me?

Answer 1

id.select('text()').extract()

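This returns a list of strings containing your text. You should either iterate over that list and strip each item, or index into it, e.g. your_list[0].strip(), to strip the whitespace. The strip method belongs to the string data type, not to the list.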

def parse_items(self, response):
  str = ""
  hxs = HtmlXPathSelector(response)

  for titles in titles:
      item = CraigslistSampleItem()
      # This works if extract() returns at least one string;
      # otherwise [0] raises an "index out of range" error.
      item ["job_id"] = id.select('text()').extract()[0].strip()
      items.append(item)
  return(items)
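
The index-out-of-range case noted in the comment can be guarded against with a small helper. This is only a sketch: first_stripped and strip_all are hypothetical names, not part of Scrapy, and the plain list below stands in for whatever extract() returns.

```python
def first_stripped(values, default=None):
    """Return the first non-blank string in `values`, stripped, or `default`."""
    for value in values:
        stripped = value.strip()
        if stripped:
            return stripped
    return default


def strip_all(values):
    """Strip every string in the list and keep the list shape."""
    return [value.strip() for value in values]


# Stand-in for the list returned by extract() (an assumption for illustration).
extracted = ['\n\n\n result here \n\n\n']

print(first_stripped(extracted))  # result here
print(strip_all(extracted))       # ['result here']
print(first_stripped([]))         # None, instead of raising IndexError
```

Whether a missing value should become None or an empty string depends on how the item is consumed later, so the default is left as a parameter.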

Answer 2

An alternative to using Python's .strip()

You can wrap the XPath expression that selects "job_id" in the XPath function normalize-space():

def parse_items(self, response):
    hxs = HtmlXPathSelector(response)

    for title in titles:
        item = CraigslistSampleItem()
        item ["job_id"] = title.select('normalize-space(.//td[@scope="row"])').extract()[0].strip()
        items.append(item)
    return items
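
To see how normalize-space() differs from .strip(), here is a rough pure-Python approximation. It is a sketch, not the XPath implementation: normalize_space is a hypothetical helper, and str.split() treats a few more characters as whitespace than XPath does (XPath only counts space, tab, carriage return and line feed).

```python
def normalize_space(text):
    """Approximate XPath normalize-space(): trim both ends and collapse
    each internal run of whitespace into a single space."""
    # str.split() with no arguments splits on runs of whitespace and
    # discards leading and trailing whitespace.
    return " ".join(text.split())


raw = "\n\n\n result   here \n\n\n"

print(repr(raw.strip()))           # 'result   here'  (inner spaces kept)
print(repr(normalize_space(raw)))  # 'result here'    (inner spaces collapsed)
```

So .strip() only trims the ends, while normalize-space() also cleans up whitespace inside the text.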

Note 1: the XPath expression I used is based on https://careers-cooperhealth.icims.com/jobs/search?ss=1&searchLocation=&searchCategory=&hashed=0

Note 2, on the answer that uses .strip(): with id.select('text()').extract()[0].strip() you get u'result here', not a list.

That may well be what you need, but if you want to keep the list, since you asked to strip [u'\n\n\n result here \n\n\n'] and get [u'result here'], you can use Python's map() with something like this:

item ["job_id"] = map(unicode.strip, id.select('text()').extract())
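
The line above is Python 2 code. In Python 3 there is no unicode type and map() returns a lazy iterator rather than a list, so a rough equivalent would look like the sketch below. The extracted list is a stand-in for the output of extract(), used here only for illustration.

```python
# Stand-in for the list returned by extract() (an assumption for illustration).
extracted = ['\n\n\n result here \n\n\n', '  second  ']

# Python 3: use str.strip and wrap map() in list() to get a real list back.
job_ids = list(map(str.strip, extracted))
print(job_ids)  # ['result here', 'second']

# An equivalent list comprehension, often considered more readable.
job_ids_alt = [value.strip() for value in extracted]
print(job_ids_alt == job_ids)  # True
```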

Answer 1: score 3, accepted, 2013-08-28 06:14:46

Answer 2: score 3, 2013-08-28 07:42:58