How to remove the first characters of an item in an array in Python Scrapy

Question

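I'm trying to remove the first 7 characters of an item in an array. More specifically, I'm trying to remove "mailto:" so that only the email address is shown.

I thought using [:7] would do the trick, but Python ignores the request.

Any suggestions?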

def businessprofile(self, response):
    for business in response.css('header#main-header'):
        item = Item()
        item['business_name'] = business.css('div.sales-info h1::text').extract()
        item['website'] = business.css('a.secondary-btn.website-link::attr(href)').extract()
        # i want to remove the first 7 characters "mailto:", but not sure how ? i made an attempt
        item['email'] = business.css('a.email-business::attr(href)').extract()[7:]
        item['phonenumber'] = business.css('p.phone::text').extract_first()
        for x in item['business_name']:
            #new code here, call to self.seen_business_names
            if x not in self.seen_business_names:
                if item['business_name']:
                    if item['phonenumber']:
                        if item['email']:                               
                            yield item
                            self.seen_business_names.append(x)

This is where I need to remove the characters:

   item['email'] = business.css('a.email-business::attr(href)').extract()[7:]

Answer 1

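Apparently business.css('a.email-business::attr(href)').extract() returns a list, so slicing it with [7:] slices the list rather than the strings inside it. You need to remove mailto: from each item in the list: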

s = business.css('a.email-business::attr(href)').extract()
# the loop variable is named href so it cannot shadow the Scrapy item
item['email'] = [href[7:] for href in s]
# ['businessname@gmail.com']

Or

s = business.css('a.email-business::attr(href)').extract()
item['email'] = [href.replace('mailto:', '') for href in s]
# ['businessname@gmail.com']
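
To see why the original attempt seemed to be ignored, it may help to compare slicing the list with slicing each string in it. The sketch below uses a hard-coded list as a stand-in for the real extract() result, and strip_mailto is a hypothetical helper name, not part of Scrapy.

```python
# Stand-in for what .extract() returns: a list of href strings (assumed sample data).
hrefs = ['mailto:businessname@gmail.com']

# Slicing the list skips the first 7 *items*, so a one-item list becomes empty.
sliced_list = hrefs[7:]


def strip_mailto(values):
    """Hypothetical helper: drop a leading 'mailto:' from each string, if present."""
    prefix = 'mailto:'
    return [v[len(prefix):] if v.startswith(prefix) else v for v in values]


emails = strip_mailto(hrefs)
print(sliced_list)  # []
print(emails)       # ['businessname@gmail.com']
```

An empty list is falsy, so in the question's code the if item['email'] check would not pass and nothing would be yielded. If only one address is expected, extract_first(), which the question already uses for the phone number, returns a single string (or None) that can be sliced directly.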

Answer 2

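You need to use [7:] instead of [:7].

The syntax is [<start>:<end>]. If <start> is omitted, the slice begins at the start of the string; if <end> is omitted, it runs to the end of the string.

For example: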

val = "mailto:abc@abc.de"
mailto = val[:7]  # first through 7th character: 'mailto:'
email = val[7:]   # 8th character to the end: 'abc@abc.de'
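
As a quick check of the two forms, here is a small sketch using the same sample value. The two slices split the string at index 7 and concatenate back to the original.

```python
val = "mailto:abc@abc.de"

prefix = val[:7]  # indices 0 through 6
email = val[7:]   # index 7 to the end

print(prefix)                 # mailto:
print(email)                  # abc@abc.de
print(prefix + email == val)  # True
```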

Answer 3

Counting starts at 0:

a = "0123456789"
a[7:]
# '789'

So you may need

a[8:]
# '89'
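
For the mailto: case specifically, the prefix is 7 characters long and occupies indices 0 through 6, so the address itself starts at index 7. A small check, using a made-up address:

```python
prefix = "mailto:"
href = "mailto:abc@abc.de"  # made-up sample value

print(len(prefix))          # 7
print(href[len(prefix):])   # abc@abc.de
print(href[8:])             # bc@abc.de (one character too many removed)
```

Using len(prefix) rather than a hard-coded number avoids the off-by-one question altogether.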


Answer 1
1, accepted, 2018-03-10 15:02:41

Answer 2
0, 2018-03-10 13:43:42

Answer 3
0, 2018-03-10 13:48:59