
How to get value from a WEIRD JSON RESPONSE

I was trying to get data from this API link: https://www.longandfoster.com/include/ajax/api.aspx?op=SearchAgents&firstname=&lastname=&page=1&pagesize=200

If you go to the above link, you will see a weird JSON response. The keys and values are not properly displayed.

I converted the dict response into a list and iterated over it. I got a response, but the value for the key is not printed; instead, it returns None:

{'Name': None}

import scrapy
import json

class MainSpider(scrapy.Spider):
    name = 'main'
    # allowed_domains = ['longandfoster.com']
    start_urls = ['https://www.longandfoster.com/include/ajax/api.aspx?op=SearchAgents&firstname=&lastname=&page=1&pagesize=200']

    def parse(self, response):
        # resp = json.loads(response.body)
        resp_list = []
        resp = json.loads(response.body)
        resp_list.append(resp)

        for each in resp_list:
            name = each.get('DisplayName')

            yield {
                "Name": name,
            }

You have to use json.loads() twice, because the 'Entity' field of the outer JSON object is itself a JSON-encoded string:

 resp = json.loads( json.loads(response.body)['Entity'] )

and then your code works.
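To see why two passes are needed, here is a minimal sketch with a made-up payload that mimics the API's shape (the 'Entity' and 'DisplayName' keys match the real response; the sample value is hypothetical):

```python
import json

# The outer document is valid JSON, but the value under 'Entity'
# is a JSON-encoded *string*, not a parsed list.
outer = '{"Entity": "[{\\"DisplayName\\": \\"Jane Doe\\"}]"}'

first = json.loads(outer)
print(type(first['Entity']))  # <class 'str'> -- still a string after one pass

# The second json.loads() decodes the embedded string into a list of dicts.
agents = json.loads(first['Entity'])
print(agents[0]['DisplayName'])  # Jane Doe
```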


Minimal working code which you can put in a single file and run with python script.py, without creating a project:

import scrapy
import json


class MainSpider(scrapy.Spider):
    
    name = 'main'
    # allowed_domains = ['longandfoster.com']
    start_urls = ['https://www.longandfoster.com/include/ajax/api.aspx?op=SearchAgents&firstname=&lastname=&page=1&pagesize=200']

    def parse(self, response):
        resp = json.loads(json.loads(response.body)['Entity'])
        for each in resp:
            name = each.get('DisplayName')

            yield {
                "Name": name,
            }

# --- run without project and save in `output.csv` ---

from scrapy.crawler import CrawlerProcess

c = CrawlerProcess({
    'USER_AGENT': 'Mozilla/5.0',
    # save in file CSV, JSON or XML
    'FEED_FORMAT': 'csv',     # csv, json, xml
    'FEED_URI': 'output.csv', #
})
c.crawl(MainSpider)
c.start() 
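Note: newer Scrapy releases (2.1+) deprecated FEED_FORMAT and FEED_URI in favor of the FEEDS setting, a dict keyed by output URI. An equivalent settings dict, as a sketch, would be:

```python
# Equivalent CrawlerProcess settings for Scrapy >= 2.1, where the old
# FEED_FORMAT / FEED_URI pair was replaced by the FEEDS dict.
settings = {
    'USER_AGENT': 'Mozilla/5.0',
    'FEEDS': {
        'output.csv': {'format': 'csv'},  # json and xml also supported
    },
}
```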
