
Looping through multiple API links to get data? Seems to be bringing back data from one link

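Context

I scraped an API that I found on a website, but it only returns 100 data points. I found the API request through:

Chrome's "Inspect" panel -> Network -> XHR ( http://www.fao.org/faostat/en/?#data/QC ).

There are 1000 of them. I realised the data only showed 100 and that the rest was on other pages, so I decided to grab the API URLs of the other pages, put them in a list and call them. I wasn't expecting much, but the request went through.

I then converted the data to JSON and then into a pandas DataFrame. I checked the info and found only 100 rows; it did return a different dataset (the page 2 data), but still only the original amount.

So I thought of creating a function, defining all the API URLs, and then writing a for loop that calls the function.

It still returns only one page of data.

This is the final code: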

url = "http://fenixservices.fao.org/faostat/api/v1/en/data/QC?area=81&area_cs=FAO&element=2312%2C2510%2C2413&item=800%2C221%2C711%2C515%2C526%2C226%2C366%2C367%2C572%2C203%2C486%2C44%2C782%2C176%2C414%2C558%2C552%2C216%2C181%2C89%2C358%2C101%2C461%2C426%2C217%2C591%2C125%2C378%2C265%2C393%2C108%2C531%2C530%2C220%2C191%2C459%2C689%2C401%2C693%2C698%2C661%2C249%2C656%2C813%2C195%2C554%2C397%2C550%2C577%2C399%2C821%2C569%2C773%2C94%2C512%2C619%2C542%2C541%2C603%2C406%2C720%2C549%2C103%2C507%2C560%2C242%2C839%2C225%2C777%2C336%2C677%2C277%2C780%2C310%2C263%2C592%2C224%2C407%2C497%2C201%2C372%2C333%2C210%2C56%2C446%2C571%2C809%2C671%2C568%2C299%2C79%2C449%2C292%2C702%2C234%2C75%2C254%2C339%2C430%2C260%2C403%2C402%2C490%2C600%2C534%2C521%2C187%2C417%2C687%2C748%2C587%2C197%2C574%2C223%2C489%2C536%2C296%2C116%2C211%2C394%2C754%2C523%2C92%2C788%2C270%2C547%2C27%2C30%2C149%2C836%2C71%2C280%2C328%2C289%2C789%2C83%2C236%2C723%2C373%2C544%2C423%2C157%2C156%2C161%2C267%2C122%2C305%2C495%2C136%2C667%2C826%2C388%2C97%2C275%2C692%2C463%2C420%2C205%2C222%2C567%2C15%2C137%2C135&item_cs=FAO&year=1961%2C1962%2C1963%2C1964%2C1965%2C1966%2C1967%2C1968%2C1969%2C1970%2C1971%2C1972%2C1973%2C1974%2C1975%2C1976%2C1977%2C1978%2C1979%2C1980%2C1981%2C1982%2C1983%2C1984%2C1985%2C1986%2C1987%2C1988%2C1989%2C1990%2C1991%2C1992%2C1993%2C1994%2C1995%2C1996%2C1997%2C1998%2C1999%2C2000%2C2001%2C2002%2C2003%2C2004%2C2005%2C2006%2C2007%2C2008%2C2009%2C2010%2C2011%2C2012%2C2013%2C2014%2C2015%2C2016%2C2017%2C2018%2C2019&show_codes=true&show_unit=true&show_flags=true&null_values=false&page_number=1&page_size=100&output_type=objects"
url_2 ="http://fenixservices.fao.org/faostat/api/v1/en/data/QC?area=81&area_cs=FAO&element=2312%2C2510%2C2413&item=800%2C221%2C711%2C515%2C526%2C226%2C366%2C367%2C572%2C203%2C486%2C44%2C782%2C176%2C414%2C558%2C552%2C216%2C181%2C89%2C358%2C101%2C461%2C426%2C217%2C591%2C125%2C378%2C265%2C393%2C108%2C531%2C530%2C220%2C191%2C459%2C689%2C401%2C693%2C698%2C661%2C249%2C656%2C813%2C195%2C554%2C397%2C550%2C577%2C399%2C821%2C569%2C773%2C94%2C512%2C619%2C542%2C541%2C603%2C406%2C720%2C549%2C103%2C507%2C560%2C242%2C839%2C225%2C777%2C336%2C677%2C277%2C780%2C310%2C263%2C592%2C224%2C407%2C497%2C201%2C372%2C333%2C210%2C56%2C446%2C571%2C809%2C671%2C568%2C299%2C79%2C449%2C292%2C702%2C234%2C75%2C254%2C339%2C430%2C260%2C403%2C402%2C490%2C600%2C534%2C521%2C187%2C417%2C687%2C748%2C587%2C197%2C574%2C223%2C489%2C536%2C296%2C116%2C211%2C394%2C754%2C523%2C92%2C788%2C270%2C547%2C27%2C30%2C149%2C836%2C71%2C280%2C328%2C289%2C789%2C83%2C236%2C723%2C373%2C544%2C423%2C157%2C156%2C161%2C267%2C122%2C305%2C495%2C136%2C667%2C826%2C388%2C97%2C275%2C692%2C463%2C420%2C205%2C222%2C567%2C15%2C137%2C135&item_cs=FAO&year=1961%2C1962%2C1963%2C1964%2C1965%2C1966%2C1967%2C1968%2C1969%2C1970%2C1971%2C1972%2C1973%2C1974%2C1975%2C1976%2C1977%2C1978%2C1979%2C1980%2C1981%2C1982%2C1983%2C1984%2C1985%2C1986%2C1987%2C1988%2C1989%2C1990%2C1991%2C1992%2C1993%2C1994%2C1995%2C1996%2C1997%2C1998%2C1999%2C2000%2C2001%2C2002%2C2003%2C2004%2C2005%2C2006%2C2007%2C2008%2C2009%2C2010%2C2011%2C2012%2C2013%2C2014%2C2015%2C2016%2C2017%2C2018%2C2019&show_codes=true&show_unit=true&show_flags=true&null_values=false&page_number=1&page_size=100&output_type=objects" 

def get_data(i):
    payload={}
    headers = {
      'Connection': 'keep-alive',
      'Accept': '*/*',
      'User-Agent': (user-agent inserted here),
      'Origin': 'http://www.fao.org',
      'Referer': 'http://www.fao.org/',
      'Accept-Language': 'en-GB,en-US;q=0.9,en;q=0.8'
    }
    
    r = requests.get(i, headers=headers)
    return r

all = [url, url_2]

for l in all:
    get_data(l)


# Save the data in a json format
Data = r.json()

# See what we have scraped
Data.keys()

df = pd.json_normalize(Data['data'])

df.info()

I also tried it without returning r. I tried other for loops as well, such as:

for x in range(2):
    get_data(url)
    get_data(url_2)

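Question

How do I write a for loop that gets the data from multiple pages of the same web page on the site? The only option I can see is creating new cells with different links.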

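You are not capturing the data returned by the function: the loop calls get_data(l) but discards its return value, and the r inside get_data is local to that function. You can keep the results like this: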

data = []
for l in all:
  x = get_data(l).json()
  data.append(x)

Or simply:

data = [get_data(l).json() for l in all]
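Both snippets above give a list with one parsed JSON object per page. To end up with a single table, the rows of each page still need to be merged. The following is only a sketch of one way to do that: collect_records and page_links are hypothetical names, get_data is passed in and is assumed to behave like the function in the question (it returns an object with a .json() method), and each page is assumed to keep its rows under the 'data' key, as the question's Data['data'] suggests.

```python
def collect_records(page_links, get_data):
    """Call get_data on every link and merge the rows of all pages.

    Assumes get_data(link) returns a response-like object with .json(),
    and that the parsed JSON stores its rows under the 'data' key.
    """
    records = []
    for link in page_links:
        payload = get_data(link).json()   # keep the return value this time
        records.extend(payload["data"])   # add this page's rows to the total
    return records
```

With the question's names this would be called as records = collect_records([url, url_2], get_data), and the merged list could then be passed to pd.json_normalize(records) to build one DataFrame covering every page.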

Note: avoid using all as a variable name, because it is a built-in function and the assignment shadows it.
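One more thing that may be worth checking: as posted, url and url_2 both contain page_number=1, so they may well request the same page. Rather than pasting each long URL by hand, the page links can be generated from the first one. This is a minimal sketch using only the standard library; with_page_number and page_urls are hypothetical helper names, and it assumes the URL already carries a page_number query parameter, as the ones in the question do.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_page_number(url, page_number):
    """Return a copy of url with its page_number query parameter replaced."""
    parts = urlsplit(url)
    query = [
        (key, str(page_number) if key == "page_number" else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    # urlencode re-escapes values, so commas become %2C again
    return urlunsplit(parts._replace(query=urlencode(query)))


def page_urls(first_url, page_count):
    """Build the links for pages 1..page_count from the first page's URL."""
    return [with_page_number(first_url, n) for n in range(1, page_count + 1)]
```

For example, page_urls(url, 10) would give ten links that differ only in page_number. How many pages actually exist has to be checked against the API's responses; the number 10 here is only an illustration.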

