How can I remove escape characters from JSON in Scrapy?

I have a JSON file in which some fields contain escape characters. How can I remove them? Here is what my JSON data looks like:

{"url": "www.expamle/com", "name": "\n\t\t\t\t\t\tHisense 49\" FHD TV 49B5200PT 49B5200PT", "price": 
"R5,499.00", "brand": "\n\t\t\t\t\t\tHisense"}

Here is my Python parse method:

    def parse(self, response):
        for tv in response.xpath(".//div[@class='product-tile-inner']"):
            yield {
                'url': tv.xpath(".//a[@class='product-tile-inner__img js-gtmProductLinkClickEvent']/@href").get(),
                'name': tv.xpath(".//a[@class='product-tile-inner__img js-gtmProductLinkClickEvent']/@title").get(),
                'price': tv.xpath(".//p[@class='col-xs-12 price ONPROMOTION']/text()").get(),
                'img': tv.xpath(".//a[@class='product-tile-inner__img js-gtmProductLinkClickEvent']//@src").get()
            }

The \n and \t sequences are leading whitespace (newlines and tabs) in the scraped values, so you need to call strip() on the fields that contain it:

def parse(self, response):
    for tv in response.xpath(".//div[@class='product-tile-inner']"):
        url = tv.xpath(".//a[@class='product-tile-inner__img js-gtmProductLinkClickEvent']/@href").get()
        name = tv.xpath(".//a[@class='product-tile-inner__img js-gtmProductLinkClickEvent']/@title").get()
        price = tv.xpath(".//p[@class='col-xs-12 price ONPROMOTION']/text()").get()
        img = tv.xpath(".//a[@class='product-tile-inner__img js-gtmProductLinkClickEvent']//@src").get()
        yield {
            'url': url.strip() if url else url,
            'name': name.strip() if name else name,
            'price': price.strip() if price else price,
            'img': img.strip() if img else img
        }
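
To see what strip() does to the data from the question without running a spider, here is a minimal standalone sketch. The helper name `strip_strings` and the `img: None` entry are hypothetical, added only for illustration; the other values are copied from the sample record above.

```python
import json


def strip_strings(item):
    """Return a copy of item with surrounding whitespace removed from str values."""
    return {
        key: value.strip() if isinstance(value, str) else value
        for key, value in item.items()
    }


# Sample record from the question as a Python dict.
# Here \n and \t are real newline and tab characters.
raw = {
    "url": "www.expamle/com",
    "name": "\n\t\t\t\t\t\tHisense 49\" FHD TV 49B5200PT 49B5200PT",
    "price": "R5,499.00",
    "brand": "\n\t\t\t\t\t\tHisense",
    "img": None,  # hypothetical missing field: non-strings pass through unchanged
}

clean = strip_strings(raw)
print(json.dumps(clean))
```

The leading \n and \t disappear from the serialized output, while the \" inside the name stays. That one is not stray whitespace: JSON has to escape a double quote that appears inside a string, and it turns back into a plain " when the file is parsed.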
