Substring matching a regex in Python

Question

I'm trying to get a substring that matches a regex in Python. It's a price obtained by scraping a supermarket website. My code looks like this:

import scrapy
import re

class namePriceSpider(scrapy.Spider):
    name = 'namePrice'
    start_urls = [
        'https://www.cotodigital3.com.ar/sitios/cdigi/browse/'
    ]

    def parse(self, response):
        all_category_products = response.xpath('//*[@id="products"]')
        for product in all_category_products:
            name = product.xpath('//div[@class="descrip_full"]/text()').extract()
            price = product.xpath('//span[@class ="atg_store_newPrice"]/text()').extract()
            yield {'name': name,
                   'price': re.search(r'$\d{1,3}(?:[.,]\d{3})*(?:[.,]\d{2})', price).group(1)}

When I run the spider, I get this error: line 16, in parse 'price': re.search(r'$\d{1,3}(?:[.,]\d{3})*(?:[.,]\d{2})', price).group(1)} followed by TypeError: expected string or bytes-like object.

Answer 1

I already solved it. There were multiple errors, but the biggest one was that price shouldn't use .extract(). It should look like this:

price = product.xpath(
    '//span[@class="atg_store_productPrice" and not(@style)]'
    '/span[@class ="atg_store_newPrice"]/text()'
    ' | //span[@class="price_discount"]/text()'
).re(r'\$\d{1,5}(?:[.,]\d{3})*(?:[., ]\d{2})*')
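
The answer mentions "multiple errors" without listing them. The following is a minimal sketch, using only the standard re module, of three problems that seem to be present in the question's line: .extract() returns a list while re.search expects a string, an unescaped $ is an end-of-string anchor rather than a literal dollar sign, and .group(1) fails because the pattern has no capturing group. The sample strings and the helper name find_prices are assumptions made for illustration; they are not taken from the real site or from the asker's fixed spider.

```python
import re

# Hypothetical sample of what .extract() might return: a list of text nodes.
extracted = ['\n  $1.234,56\n', '\n  $89,90\n']

# The dollar sign is escaped so it matches a literal "$".
PRICE_RE = re.compile(r'\$\d{1,3}(?:[.,]\d{3})*(?:[.,]\d{2})')

# 1) Searching a list instead of a string raises TypeError.
try:
    PRICE_RE.search(extracted)
    list_error = None
except TypeError as exc:
    list_error = exc

# 2) An unescaped "$" anchors at the end of the string, so digits can
#    never follow it and the search finds nothing.
unescaped_match = re.search(r'$\d{1,3}', '$1.234,56')

# 3) The pattern only has non-capturing groups (?:...), so group(1) does
#    not exist; group(0) is the whole match.
whole = PRICE_RE.search('$1.234,56')
try:
    whole.group(1)
    group_error = None
except IndexError as exc:
    group_error = exc


def find_prices(texts):
    """Return the first price found in each string, skipping non-matches."""
    prices = []
    for text in texts:
        match = PRICE_RE.search(text)
        if match:
            prices.append(match.group(0))
    return prices


prices = find_prices(extracted)
print(prices)  # ['$1.234,56', '$89,90']
```

Calling .re() on the selector, as in the answer, avoids the first problem because Scrapy applies the pattern to each selected node and returns the matching strings directly.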

Answer 1, posted 2020-10-11 15:30:32
