How to extract data from an HTTP response using Python?

Question

I am trying to use Yahoo's autocomplete feature, and I found the link to do it. I am using requests in Python: I pass the right URL and, after calling .get, I receive a response. I don't understand what kind of data the response is. Is it plain data, an array, JSON? How can I tell what kind of data it is in Python? How can I extract a single value from this complicated array? For example, I need the values after these keys: from "exchange":"MIL" I need MIL, and from "shortname":"MEDIOBANCA" I need Mediobanca. How can I do this?

r = requests.get(apiurl)
body=r.text

Response:

 {"explains":[],"count":6,"quotes":[{"exchange":"MIL","shortname":"MEDIOBANCA","quoteType":"EQUITY","symbol":"MB.MI","index":"quotes","score":20129.0,"typeDisp":"Equity","longname":"Mediobanca Banca di Credito Finanziario S.p.A.","isYahooFinance":true},{"exchange":"PNK","shortname":"MEDIOBANCA DI CREDITO FINANZ SP","quoteType":"EQUITY","symbol":"MDIBY","index":"quotes","score":20020.0,"typeDisp":"Equity","longname":"Mediobanca Banca di Credito Finanziario S.p.A.","isYahooFinance":true},{"exchange":"FRA","shortname":"MEDIOBCA  EO 0,50","quoteType":"EQUITY","symbol":"ME9.F","index":"quotes","score":20011.0,"typeDisp":"Equity","longname":"Mediobanca Banca di Credito Finanziario S.p.A.","isYahooFinance":true},{"exchange":"VIE","shortname":"MEDIOBANCA SPA","quoteType":"EQUITY","symbol":"MB.VI","index":"quotes","score":20001.0,"typeDisp":"Equity","longname":"Mediobanca Banca di Credito Finanziario S.p.A.","isYahooFinance":true},{"exchange":"IOB","shortname":"MEDIOBANCA BANCA DI CREDITO FIN","quoteType":"EQUITY","symbol":"0HBF.IL","index":"quotes","score":20001.0,"typeDisp":"Equity","isYahooFinance":true},{"exchange":"STU","shortname":"MEDIOBANCA - BCA CRED.FIN. SPAA","quoteType":"EQUITY","symbol":"ME9.SG","index":"quotes","score":20001.0,"typeDisp":"Equity","isYahooFinance":true}],"news":[],"nav":[],"lists":[],"researchReports":[],"totalTime":19,"timeTakenForQuotes":411,"timeTakenForNews":700,"timeTakenForAlgowatchlist":400,"timeTakenForPredefinedScreener":400,"timeTakenForCrunchbase":0,"timeTakenForNav":400,"timeTakenForResearchReports":0}

Update:

list_a = ["mediob"]
list_b = [" ", "a", "b", "c", "d", "e", "f", "g", "h", "i", "l", "m", "n", "o", "p", "q", "r", "s", "t", "v", "z",
           "ü", "ä", "ö", "y", "w", "x"] 
list_c = [f"{i} {j}" for i in list_a for j in list_b]
               
for x in list_c:
    apiurl = "https://query1.finance.yahoo.com/v1/finance/search?q="+x+"&quotesCount=6&quotesQueryId=tss_match_phrase_query&multiQuoteQueryId=multi_quote_single_token_query&enableNavLinks=true&enableEnhancedTrivialQuery=true" 
    r = requests.get(apiurl)
    data = r.json()
    shortname = data["quotes"][0]["shortname"]
    print(shortname)

It gives IndexError: list index out of range

Answer 1

Well, first off, your URLs are not correct. There should be no space in f"{i} {j}" for i in list_a for j in list_b. As written, you effectively just have one URL. It should be [f"{i}{j}" for i in list_a for j in list_b]. Now the generated URLs will be different and we can successfully scrape the data, for example:

# list_a and list_b are defined as in the question
list_c = [f"{i}{j}" for i in list_a for j in list_b]

for x in list_c:
    apiurl = "https://query1.finance.yahoo.com/v1/finance/search?q="+x+"&quotesCount=6&quotesQueryId=tss_match_phrase_query&multiQuoteQueryId=multi_quote_single_token_query&enableNavLinks=true&enableEnhancedTrivialQuery=true"
    r = requests.get(apiurl)
    data = r.json()
    # Skip queries that returned no quotes
    if data["quotes"]:
        score = data["quotes"][0]["score"]
        print(score)

Or, for the short name, use shortname = data["quotes"][0]["shortname"] and print that instead.

Output with shortname:

MEDIOBANCA
MEDIOBANCA
MEDIOBCA  EO 0,50
MEDIOBANCA
MEDIOBANCA
MEDIOBANCA
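Since the point above is about how the search term ends up in the URL, here is a minimal sketch of building the query string with the standard library instead of string concatenation, so that spaces and characters such as "ü" are percent-encoded. The helper name build_search_url is hypothetical, and keeping only the q and quotesCount parameters is an assumption made to keep the example short; no request is sent.

```python
from urllib.parse import urlencode

BASE_URL = "https://query1.finance.yahoo.com/v1/finance/search"


def build_search_url(query, quotes_count=6):
    """Return the search URL with the query text percent-encoded.

    Hypothetical helper: only two of the original query parameters are kept.
    """
    params = {"q": query, "quotesCount": quotes_count}
    return f"{BASE_URL}?{urlencode(params)}"


list_a = ["mediob"]
list_b = ["a", "b", "ü"]  # shortened list, for illustration only

# Each prefix/suffix pair yields a distinct, safely encoded URL.
for term in (f"{i}{j}" for i in list_a for j in list_b):
    print(build_search_url(term))
```

With requests, passing the same dictionary as requests.get(BASE_URL, params=params) should achieve the same encoding without building the string by hand.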

Answer 2

import requests

list_a = ["mediob"]
list_b = [" ", "a", "b", "c", "d", "e", "f", "g", "h", "i", "l", "m", "n", "o", "p", "q", "r", "s", "t", "v", "z",
           "ü", "ä", "ö", "y", "w", "x"] 
list_c = [f"{i} {j}" for i in list_a for j in list_b]
for x in list_c:
    apiurl = "https://query1.finance.yahoo.com/v1/finance/search?q="+x+"&quotesCount=6&quotesQueryId=tss_match_phrase_query&multiQuoteQueryId=multi_quote_single_token_query&enableNavLinks=true&enableEnhancedTrivialQuery=true" 
    r = requests.get(apiurl)
    data = r.json()
    if data['quotes']:
        print(data["quotes"][0]["shortname"])

I took the sample response you provided and created a mock API to simulate what you are doing. The response you get back is basically a JSON response.
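To make the "it is JSON" point concrete, the sketch below parses a shortened copy of the response text from the question and checks what Python types come out. The string body is trimmed to two quotes and a few keys for readability, which is an assumption of this example; json.loads on the text should give the same structure that r.json() returns.

```python
import json

# Shortened copy of the response body from the question (two quotes kept).
body = (
    '{"explains":[],"count":6,"quotes":['
    '{"exchange":"MIL","shortname":"MEDIOBANCA","symbol":"MB.MI"},'
    '{"exchange":"PNK","shortname":"MEDIOBANCA DI CREDITO FINANZ SP","symbol":"MDIBY"}'
    '],"news":[]}'
)

data = json.loads(body)      # parse the JSON text into Python objects
print(type(data))            # the top level is a dict
print(type(data["quotes"]))  # "quotes" holds a list of dicts

# Collect the two fields the question asks about, one pair per quote.
pairs = [(q["exchange"], q["shortname"]) for q in data["quotes"]]
for exchange, shortname in pairs:
    print(exchange, shortname)
```

If the capitalised form "Mediobanca" is wanted rather than "MEDIOBANCA", calling shortname.title() on the string is one simple option.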

I also see that you have tried the above and are getting an error. The reason is that for some queries the "quotes" list is empty, so you need to make sure the list is not empty before attempting print(data["quotes"][0]["shortname"]); hence the if statement.
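The empty-list check can also be wrapped in a small function, which keeps the loop body short and makes the "no match" case explicit. This is only a sketch: first_shortname is a hypothetical helper name, and the two dictionaries stand in for parsed responses rather than real API output.

```python
def first_shortname(data):
    """Return the shortname of the first quote, or None if there is none."""
    quotes = data.get("quotes") or []
    if not quotes:
        # Indexing quotes[0] here would raise IndexError.
        return None
    return quotes[0].get("shortname")


# Stand-ins for parsed responses (assumed shapes, not real API output).
with_match = {"quotes": [{"exchange": "MIL", "shortname": "MEDIOBANCA"}]}
no_match = {"quotes": []}

print(first_shortname(with_match))
print(first_shortname(no_match))
```

Returning None instead of raising lets the caller decide whether to skip the query, log it, or retry with a different search term.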

Answer 1
1 2020-11-28 18:29:41

Answer 2
0 Accepted 2020-11-28 17:39:29
