
Getting no data when scraping a table

I am trying to scrape historical data from a table on coinmarketcap. However, the code I run gives back "No Data". I thought it would be fairly easy, but I'm not sure what I am missing.

import requests
from bs4 import BeautifulSoup

url = "https://coinmarketcap.com/currencies/bitcoin/historical-data/"

data = requests.get(url)

bs = BeautifulSoup(data.text, "lxml")
table_body = bs.find('tbody')
rows = table_body.find_all('tr')
for row in rows:
    cols = row.find_all('td')
    cols = [x.text.strip() for x in cols]
    print(cols)

Output:

C:\Users\Ejer\anaconda3\envs\pythonProject\python.exe C:/Users/Ejer/PycharmProjects/pythonProject/CloudSQL_test.py
['No Data']

Process finished with exit code 0

The table you are trying to read is built dynamically by JavaScript, so the HTML that requests downloads only contains a placeholder; to get the rendered table you would need something that executes the JavaScript. However, you can simply open the network monitor in your browser and look at the requests the page makes: one of them probably returns the full raw data as JSON or XML, so you don't need to scrape at all. I did that and found this request:

https://web-api.coinmarketcap.com/v1/cryptocurrency/ohlcv/historical?id=1&convert=USD&time_start=1604016000&time_end=1609286400
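The query string of that request can be unpacked to see what it asks for. The sketch below uses only the standard library. Reading `time_start` and `time_end` as Unix timestamps in seconds (UTC) is an assumption based on how the values look, not on a documented API contract, and `describe_request` is a hypothetical helper name.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit

URL = ("https://web-api.coinmarketcap.com/v1/cryptocurrency/ohlcv/historical"
       "?id=1&convert=USD&time_start=1604016000&time_end=1609286400")


def describe_request(url: str) -> dict:
    """Return the query parameters, with the time bounds decoded as UTC datetimes."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs returns a list per key; each key appears once here, so keep the first.
    params = {key: values[0] for key, values in query.items()}
    for key in ("time_start", "time_end"):
        if key in params:
            # Assumption: the value is a Unix timestamp in seconds.
            params[key] = datetime.fromtimestamp(int(params[key]), tz=timezone.utc)
    return params


info = describe_request(URL)
for name, value in info.items():
    print(f"{name}: {value}")
```

Under that assumption the bounds decode to midnight UTC on 2020-10-30 and 2020-12-30, so changing them should select a different date range.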

Check it out; I hope it helps!
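A quick way to confirm that a table is filled in by JavaScript is to look at the cells in the HTML the server actually returns. The sketch below is one possible check, written with the standard-library `html.parser` so it has no dependencies. The `TableCellCollector` and `looks_like_placeholder` names and both HTML snippets are hypothetical, and the "No Data" marker is taken from the output shown in the question.

```python
from html.parser import HTMLParser


class TableCellCollector(HTMLParser):
    """Collect the text of every <td> cell found in the document."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.cells[-1] += data


def looks_like_placeholder(html: str, marker: str = "No Data") -> bool:
    """True if the static HTML has no table cells, or only placeholder cells."""
    parser = TableCellCollector()
    parser.feed(html)
    return all(cell.strip() == marker for cell in parser.cells)


# Hypothetical snippets: the first mimics what the question's script received.
static_html = "<table><tbody><tr><td>No Data</td></tr></tbody></table>"
filled_html = "<table><tbody><tr><td>Dec 10, 2020</td><td>18553.30</td></tr></tbody></table>"

print(looks_like_placeholder(static_html))
print(looks_like_placeholder(filled_html))
```

If the check returns `True` for the downloaded page while the browser shows real rows, the data is most likely loaded by a separate request that the network monitor will reveal.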

You don't need to scrape the data; you can fetch it with a GET request:

import time
import requests


def get_timestamp(datetime: str):
    return int(time.mktime(time.strptime(datetime, '%Y-%m-%d %H:%M:%S')))


def get_btc_quotes(start_date: str, end_date: str):
    start = get_timestamp(start_date)
    end = get_timestamp(end_date)
    url = f'https://web-api.coinmarketcap.com/v1/cryptocurrency/ohlcv/historical?id=1&convert=USD&time_start={start}&time_end={end}'
    return requests.get(url).json()


data = get_btc_quotes(start_date='2020-12-01 00:00:00',
                      end_date='2020-12-10 00:00:00')

import pandas as pd
# making A LOT of assumptions here, hopefully the keys don't change in the future
data_flat = [quote['quote']['USD'] for quote in data['data']['quotes']]
df = pd.DataFrame(data_flat)

print(df)

Output:

           open          high           low         close        volume    market_cap                 timestamp
0  18801.743593  19308.330663  18347.717838  19201.091157  3.738770e+10  3.563810e+11  2020-12-02T23:59:59.999Z
1  19205.925404  19566.191884  18925.784434  19445.398480  3.193032e+10  3.609339e+11  2020-12-03T23:59:59.999Z
2  19446.966422  19511.404714  18697.192914  18699.765613  3.387239e+10  3.471114e+11  2020-12-04T23:59:59.999Z
3  18698.385279  19160.449265  18590.193675  19154.231131  2.724246e+10  3.555639e+11  2020-12-05T23:59:59.999Z
4  19154.180593  19390.499895  18897.894072  19345.120959  2.529378e+10  3.591235e+11  2020-12-06T23:59:59.999Z
5  19343.128798  19411.827676  18931.142919  19191.631287  2.689636e+10  3.562932e+11  2020-12-07T23:59:59.999Z
6  19191.529463  19283.478339  18269.945444  18321.144916  3.169229e+10  3.401488e+11  2020-12-08T23:59:59.999Z
7  18320.884784  18626.292652  17935.547820  18553.915377  3.442037e+10  3.444865e+11  2020-12-09T23:59:59.999Z
8  18553.299728  18553.299728  17957.065213  18264.992107  2.554713e+10  3.391369e+11  2020-12-10T23:59:59.999Z
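Two details of the snippet above may be worth guarding against. `time.mktime` interprets the parsed time in the machine's local time zone, so the same date string can produce different timestamps on different machines, and the nested key lookups raise `KeyError` if the response shape changes. A minimal sketch of a UTC-based conversion and a more defensive flattening follows. The sample payload is hypothetical, with made-up numbers, and only mirrors the `data`, `quotes`, `quote`, `USD` nesting used above; `to_utc_timestamp` and `flatten_quotes` are illustrative names.

```python
from datetime import datetime, timezone


def to_utc_timestamp(date_string: str) -> int:
    """Parse 'YYYY-MM-DD HH:MM:SS' as UTC and return a Unix timestamp in seconds."""
    parsed = datetime.strptime(date_string, "%Y-%m-%d %H:%M:%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


def flatten_quotes(payload: dict, currency: str = "USD") -> list:
    """Return one flat dict per quote, skipping entries without the expected keys."""
    quotes = (payload.get("data") or {}).get("quotes") or []
    rows = []
    for entry in quotes:
        values = (entry.get("quote") or {}).get(currency)
        if values is not None:
            rows.append(dict(values))
    return rows


# Hypothetical payload with made-up numbers, shaped like the nesting used above.
sample = {
    "data": {
        "quotes": [
            {"quote": {"USD": {"open": 1.0, "close": 2.0}}},
            {"quote": {}},  # malformed entry: skipped
        ]
    }
}

print(to_utc_timestamp("2020-12-01 00:00:00"))
print(flatten_quotes(sample))
print(flatten_quotes({"status": "error"}))
```

The list returned by `flatten_quotes` can be passed to `pd.DataFrame` exactly as `data_flat` is above; an unexpected response then yields an empty frame instead of an exception.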
