Importing all rows from a dataset using the SODA API in Python

Question

I am trying to import the following dataset and store it in a pandas DataFrame: https://data.nasa.gov/Space-Science/Meteorite-Landings/gh4g-9sfh/data

I am using the following code:

 import pandas as pd
 import requests

 r = requests.get('https://data.nasa.gov/resource/gh4g-9sfh.json')
 meteor_data = r.json()
 df = pd.DataFrame(meteor_data)
 print(df.shape)

The resulting DataFrame has only 1,000 rows. I need it to contain all 45,716 rows. How can I do that?

Answer 1

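See the documentation for the $limit parameter:

The $limit parameter controls the total number of rows returned, and it defaults to 1,000 records per request.

Note: The maximum value for $limit is 50,000 records, and if you exceed that limit you will get a 400 Bad Request response.

So you are simply getting the default number of records.

You cannot get more than 50,000 records in a single API call; anything beyond that takes multiple calls using $limit and $offset.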
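For datasets larger than one request allows, paging with $limit and $offset could look roughly like the sketch below. It is a minimal illustration, not a tested client: the helper names `fetch_page` and `fetch_all_rows` are hypothetical, it uses only the standard library instead of requests, and the `$order=:id` parameter is an assumption taken from the general Socrata paging advice to keep row order stable between calls.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://data.nasa.gov/resource/gh4g-9sfh.json"


def fetch_page(offset, limit):
    """Request one page of rows and decode the JSON list (hypothetical helper)."""
    # $order=:id is an assumption: a stable sort so pages do not overlap.
    query = urlencode({"$limit": limit, "$offset": offset, "$order": ":id"})
    with urlopen(f"{BASE_URL}?{query}", timeout=30) as response:
        return json.load(response)


def fetch_all_rows(get_page=fetch_page, page_size=50000):
    """Call get_page repeatedly, advancing the offset, until a short page arrives."""
    rows = []
    offset = 0
    while True:
        page = get_page(offset, page_size)
        rows.extend(page)
        # A page shorter than page_size means there is nothing left to fetch.
        if len(page) < page_size:
            return rows
        offset += page_size
```

The list returned by `fetch_all_rows()` can be passed to `pd.DataFrame(...)` exactly as `meteor_data` is in the question.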

Try:

https://data.nasa.gov/resource/gh4g-9sfh.json?$limit=50000
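The $limit parameter goes in the query string, separated from the endpoint by a `?`. As a small sketch, the URL could be assembled as follows; `build_query_url` and `MAX_LIMIT` are hypothetical names, and the 50,000 cap simply mirrors the documentation quoted in this answer.

```python
from urllib.parse import urlencode

# Upper bound taken from the documentation quoted in this answer.
MAX_LIMIT = 50000


def build_query_url(base_url, limit):
    """Return base_url with a $limit query parameter appended (hypothetical helper)."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    # safe="$" keeps the dollar sign readable instead of encoding it as %24.
    return f"{base_url}?{urlencode({'$limit': limit}, safe='$')}"
```

The resulting string can be handed to `requests.get(...)` in place of the bare endpoint used in the question.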

See also: Why am I limited to 1,000 rows on the SODA API when I have an app key

Answer 2

Do it like this and set the limit:

import pandas as pd
from sodapy import Socrata

# Unauthenticated client only works with public data sets. Note 'None'
# in place of application token, and no username or password:
client = Socrata("data.nasa.gov", None)

# Example authenticated client (needed for non-public datasets):
# client = Socrata("data.nasa.gov",
#                  MyAppToken,
#                  username="user@example.com",
#                  password="AFakePassword")

# First 2000 results, returned as JSON from API / converted to Python list of
# dictionaries by sodapy.
results = client.get("gh4g-9sfh", limit=2000)

# Convert to pandas DataFrame
results_df = pd.DataFrame.from_records(results)
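The snippet above stops at 2000 rows. To collect every row with the same client, one option is to page with the `limit` and `offset` keyword arguments of `client.get`. The sketch below is only an illustration: `fetch_all_records` is a hypothetical helper, the `order=":id"` argument is an assumption meant to keep paging stable, and it expects any object whose `get` method behaves like sodapy's.

```python
def fetch_all_records(client, dataset_id, page_size=50000):
    """Collect all rows of dataset_id by paging through client.get (hypothetical helper)."""
    records = []
    offset = 0
    while True:
        # limit, offset and order are passed through as SoQL parameters.
        page = client.get(dataset_id, limit=page_size, offset=offset, order=":id")
        records.extend(page)
        # A short page signals the end of the dataset.
        if len(page) < page_size:
            return records
        offset += page_size


# Usage with the client defined above (not executed here):
# all_rows = fetch_all_records(client, "gh4g-9sfh")
# results_df = pd.DataFrame.from_records(all_rows)
```

Since this dataset has 45,716 rows, a single call with `limit=50000` should also be enough. Recent sodapy releases appear to provide a `client.get_all` generator that pages automatically, which may be worth checking before writing a loop by hand.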

Solution 1
0 2019-12-30 10:59:06

Solution 2
0 2019-12-30 11:20:33