Getting no data when scraping a table

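I am trying to scrape historical data from a table on coinmarketcap. However, the code I run returns "No Data". I thought this would be easy, but I am not sure what I am missing.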

import requests
from bs4 import BeautifulSoup

url = "https://coinmarketcap.com/currencies/bitcoin/historical-data/"

data = requests.get(url)

bs=BeautifulSoup(data.text, "lxml")
table_body=bs.find('tbody')
rows = table_body.find_all('tr')
for row in rows:
    cols=row.find_all('td')
    cols=[x.text.strip() for x in cols]
    print(cols)

Output:

C:\Users\Ejer\anaconda3\envs\pythonProject\python.exe C:/Users/Ejer/PycharmProjects/pythonProject/CloudSQL_test.py
['No Data']

Process finished with exit code 0

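The underlying problem is that you are trying to fetch a table that is built dynamically by JavaScript, so the HTML that requests receives does not contain the rows yet. To scrape it you would need something that actually executes the JavaScript. You don't have to scrape at all, though: open the Network monitor in your browser's developer tools and look for the request the page makes to load its data. It will probably return the complete raw data as JSON or XML. I did that and found this request: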

https://web-api.coinmarketcap.com/v1/cryptocurrency/ohlcv/historical?id=1&convert=USD&time_start=1604016000&time_end=1609286400

Take a look; I hope it helps!
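To see what the captured request is asking for, its query string can be unpacked with the standard library. This is only a sketch for reading the URL above: the meaning of each parameter is inferred from its name (for example, `id=1` appears to select Bitcoin, judging by the page it was captured on), and the helper name `to_utc` is made up for illustration.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit

# The request URL captured from the browser's network monitor (from the answer above).
captured = (
    "https://web-api.coinmarketcap.com/v1/cryptocurrency/ohlcv/historical"
    "?id=1&convert=USD&time_start=1604016000&time_end=1609286400"
)

# parse_qs returns a list per key; each key appears once here, so take the first value.
params = {key: values[0] for key, values in parse_qs(urlsplit(captured).query).items()}


def to_utc(seconds: str) -> str:
    """Render a Unix timestamp given in seconds as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(int(seconds), tz=timezone.utc).isoformat()


print(params)
print(to_utc(params["time_start"]), "->", to_utc(params["time_end"]))
# 2020-10-30T00:00:00+00:00 -> 2020-12-30T00:00:00+00:00
```

So `time_start` and `time_end` are plain Unix timestamps in seconds, which means a different date range can be requested by swapping in other values.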

You don't need to scrape the data; you can request it with a GET:

import time

import pandas as pd
import requests


def get_timestamp(date_string: str) -> int:
    # Parse 'YYYY-MM-DD HH:MM:SS' and convert it to a Unix timestamp in seconds.
    # Note: time.mktime interprets the parsed time as local time.
    return int(time.mktime(time.strptime(date_string, '%Y-%m-%d %H:%M:%S')))


def get_btc_quotes(start_date: str, end_date: str):
    # Request the OHLCV history between the two dates and return the decoded JSON.
    start = get_timestamp(start_date)
    end = get_timestamp(end_date)
    url = f'https://web-api.coinmarketcap.com/v1/cryptocurrency/ohlcv/historical?id=1&convert=USD&time_start={start}&time_end={end}'
    return requests.get(url).json()


data = get_btc_quotes(start_date='2020-12-01 00:00:00',
                      end_date='2020-12-10 00:00:00')

# making A LOT of assumptions here, hopefully the keys don't change in the future
data_flat = [quote['quote']['USD'] for quote in data['data']['quotes']]
df = pd.DataFrame(data_flat)

print(df)

Output:

           open          high           low         close        volume    market_cap                 timestamp
0  18801.743593  19308.330663  18347.717838  19201.091157  3.738770e+10  3.563810e+11  2020-12-02T23:59:59.999Z
1  19205.925404  19566.191884  18925.784434  19445.398480  3.193032e+10  3.609339e+11  2020-12-03T23:59:59.999Z
2  19446.966422  19511.404714  18697.192914  18699.765613  3.387239e+10  3.471114e+11  2020-12-04T23:59:59.999Z
3  18698.385279  19160.449265  18590.193675  19154.231131  2.724246e+10  3.555639e+11  2020-12-05T23:59:59.999Z
4  19154.180593  19390.499895  18897.894072  19345.120959  2.529378e+10  3.591235e+11  2020-12-06T23:59:59.999Z
5  19343.128798  19411.827676  18931.142919  19191.631287  2.689636e+10  3.562932e+11  2020-12-07T23:59:59.999Z
6  19191.529463  19283.478339  18269.945444  18321.144916  3.169229e+10  3.401488e+11  2020-12-08T23:59:59.999Z
7  18320.884784  18626.292652  17935.547820  18553.915377  3.442037e+10  3.444865e+11  2020-12-09T23:59:59.999Z
8  18553.299728  18553.299728  17957.065213  18264.992107  2.554713e+10  3.391369e+11  2020-12-10T23:59:59.999Z
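Since the flattening step assumes the keys `data`, `quotes`, `quote` and `USD` are always present, a more defensive variant may be worth having in case the response shape changes or the request fails. The following is a minimal sketch, not the answer's method: `flatten_quotes` is a hypothetical helper, and `sample` is a hand-written payload shaped like the response the code above expects, with values copied from the first row of the output.

```python
from typing import Any, Dict, List


def flatten_quotes(payload: Dict[str, Any], currency: str = "USD") -> List[Dict[str, Any]]:
    """Return one flat dict per quote, skipping entries that lack the expected keys."""
    data = payload.get("data") or {}
    rows = []
    for entry in data.get("quotes") or []:
        values = (entry.get("quote") or {}).get(currency)
        if isinstance(values, dict):
            rows.append(dict(values))
    return rows


# Hypothetical payload for illustration only; the real response has more fields.
sample = {
    "data": {
        "quotes": [
            {"quote": {"USD": {"open": 18801.743593, "close": 19201.091157,
                               "timestamp": "2020-12-02T23:59:59.999Z"}}},
            {"quote": {}},  # malformed entry: no "USD" key, so it is skipped
        ]
    }
}

rows = flatten_quotes(sample)
print(rows)
print(flatten_quotes({}))  # missing "data" key: empty list instead of a KeyError
```

The resulting list can be passed to `pd.DataFrame` exactly like `data_flat` above; an empty list simply gives an empty frame rather than an exception.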

