How to get value from a WEIRD JSON RESPONSE
I was trying to get data from this API.
link: https://www.longandfoster.com/include/ajax/api.aspx?op=SearchAgents&firstname=&lastname=&page=1&pagesize=200
If you go to the above link you will see a weird JSON response. The keys and values are not properly displayed.
I converted the dict response into a list and iterated over it. I got a response, but the value for the key is not printed; instead, it returns None:
{'Name': None}
import scrapy
import json

class MainSpider(scrapy.Spider):
    name = 'main'
    # allowed_domains = ['longandfoster.com']
    start_urls = ['https://www.longandfoster.com/include/ajax/api.aspx?op=SearchAgents&firstname=&lastname=&page=1&pagesize=200']

    def parse(self, response):
        # resp = json.loads(response.body)
        resp_list = []
        resp = json.loads(response.body)
        resp_list.append(resp)
        for each in resp_list:
            name = each.get('DisplayName')
            yield {
                "Name": name,
            }
You have to use json.loads() two times, because the value under 'Entity' is itself a JSON-encoded string:
resp = json.loads( json.loads(response.body)['Entity'] )
and then your code works.
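To see why the second call is needed, here is a minimal sketch that does not touch the network. The payload is made up for illustration; it only imitates the shape implied by the line above (an outer JSON object whose 'Entity' field holds a JSON string), and the agent names are hypothetical.

```python
import json

# Hypothetical payload imitating the shape implied by the answer: the outer
# object is JSON, and its "Entity" field is itself a JSON-encoded string.
inner = json.dumps([{"DisplayName": "Agent One"}, {"DisplayName": "Agent Two"}])
body = json.dumps({"Entity": inner}).encode("utf-8")

outer = json.loads(body)              # first pass: dict whose "Entity" is a str
print(type(outer["Entity"]))          # <class 'str'>
print(outer.get("DisplayName"))       # None, the key is not at the top level

agents = json.loads(outer["Entity"])  # second pass: list of dicts
names = [agent.get("DisplayName") for agent in agents]
print(names)                          # ['Agent One', 'Agent Two']
```

Calling .get('DisplayName') on the outer dict is what produced {'Name': None} in the question: the key only exists on the items inside the decoded 'Entity' string.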
Minimal working code which you can put in one file and run with python script.py, without creating a project.
import scrapy
import json

class MainSpider(scrapy.Spider):
    name = 'main'
    # allowed_domains = ['longandfoster.com']
    start_urls = ['https://www.longandfoster.com/include/ajax/api.aspx?op=SearchAgents&firstname=&lastname=&page=1&pagesize=200']

    def parse(self, response):
        resp = json.loads(json.loads(response.body)['Entity'])
        for each in resp:
            name = each.get('DisplayName')
            yield {
                "Name": name,
            }

# --- run without project and save in `output.csv` ---

from scrapy.crawler import CrawlerProcess

c = CrawlerProcess({
    'USER_AGENT': 'Mozilla/5.0',
    # save in file CSV, JSON or XML
    'FEED_FORMAT': 'csv',     # csv, json, xml
    'FEED_URI': 'output.csv', #
})
c.crawl(MainSpider)
c.start()
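The FEED_FORMAT and FEED_URI settings used above were deprecated in Scrapy 2.1 in favor of a single FEEDS setting. This is not part of the original answer; it is a sketch of what the equivalent settings dictionary might look like, assuming Scrapy 2.1 or newer is installed.

```python
# Assumes Scrapy >= 2.1, where FEEDS supersedes FEED_FORMAT / FEED_URI.
settings = {
    'USER_AGENT': 'Mozilla/5.0',
    # map each output file to its export options
    'FEEDS': {
        'output.csv': {'format': 'csv'},  # 'json' and 'xml' are also supported
    },
}

# Pass it to the crawler the same way as before:
# c = CrawlerProcess(settings)
```

On older Scrapy versions the original two settings are the ones to use.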