Trying to find something specific in HTML code
I am trying to find the specific ID of an altcoin, but I'm not sure how to do it. When I print the page, I get a very long JSON script and get lost trying to find the ID in it. Is there an easier way?
from bs4 import BeautifulSoup
import requests
import pandas as pd
import json
import time
cmc = requests.get('https://coinmarketcap.com/')
soup = BeautifulSoup(cmc.content, 'html.parser')
print(soup.prettify())
The output I want is the exact id corresponding to a given altcoin. The output below is for one coin, but the full list is long, and I cannot easily find the right entry without searching by hand.
{"id":1,"name":"Bitcoin","symbol":"BTC","slug":"bitcoin","max_supply":21000000,"circulating_supply":18614718,"total_supply":18614718,"last_updated":"2021-01-30T15:00:02.000Z","quote":{"USD":{"price":34177.31601866782,"volume_24h":83208963467.24487,"percent_change_1h":1.15037986,"percent_change_24h":-10.87555443,"percent_change_7d":7.03677315,"percent_change_30d":19.84946991,"market_cap":636201099684.3843,"last_updated":"2021-01-30T15:00:02.000Z"}},"rank":1,"noLazyLoad":true}
I took a closer look at the HTML. It appears that the JSON string data you seek is inside a <script> tag with id "__NEXT_DATA__".
I'm not that familiar with BeautifulSoup, so a more elegant way to get the data may exist. Here is the code I used.
import json

import requests
from bs4 import BeautifulSoup

cmc = requests.get('https://coinmarketcap.com/')
soup = BeautifulSoup(cmc.content, 'html.parser')

# The listing data is embedded as JSON in the <script id="__NEXT_DATA__"> tag.
for item in soup.select('script[id="__NEXT_DATA__"]'):
    data = json.loads(item.string)  # load JSON string as a dict
    desired_data = data["props"]["initialState"]["cryptocurrency"]["listingLatest"][
        "data"
    ]
    print(
        json.dumps(  # pretty output string
            desired_data,
            indent=2,
        ),
    )
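The key path used above reflects the page as it looked when this answer was written. If the site changes its embedded JSON, the chained subscripts will raise a KeyError. Here is a minimal sketch of a more tolerant lookup. The helper name `dig` is hypothetical, and the sample dict only mimics the structure shown in this answer rather than real page data.

```python
import json


def dig(obj, *keys, default=None):
    """Follow keys through nested dicts; return default if any step is missing."""
    for key in keys:
        if not isinstance(obj, dict) or key not in obj:
            return default
        obj = obj[key]
    return obj


# Small stand-in for the parsed __NEXT_DATA__ payload (structure assumed from the answer).
sample = json.loads(
    '{"props": {"initialState": {"cryptocurrency": {"listingLatest": '
    '{"data": [{"id": 1, "name": "Bitcoin", "symbol": "BTC"}]}}}}}'
)

path = ("props", "initialState", "cryptocurrency", "listingLatest", "data")
coins = dig(sample, *path, default=[])  # list of coin dicts
missing = dig(sample, "props", "somethingElse", "data", default=[])  # falls back to []
print(len(coins), len(missing))
```

With the real page you would pass `data` in place of `sample`. An empty result then signals that the layout probably changed, instead of the script crashing.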
Truncated output:
[
  {
    "id": 1,
    "name": "Bitcoin",
    "symbol": "BTC",
    "slug": "bitcoin",
    "max_supply": 21000000,
    "circulating_supply": 18614718,
    "total_supply": 18614718,
    "last_updated": "2021-01-30T14:51:02.000Z",
    "quote": {
      "USD": {
        "price": 34138.18238095427,
        "volume_24h": 83651976977.0413,
        "percent_change_1h": 1.36922474,
        "percent_change_24h": -9.82670796,
        "percent_change_7d": 6.33079444,
        "percent_change_30d": 19.72629419,
        "market_cap": 635472638054.0323,
        "last_updated": "2021-01-30T14:51:02.000Z"
      }
    },
    "rank": 1,
    "noLazyLoad": true
  },
  {
    "id": 1027,
    "name": "Ethereum",
    "symbol": "ETH",
    "slug": "ethereum",
    "max_supply": null,
    "circulating_supply": 114465285.999,
    "total_supply": 114465285.999,
    "last_updated": "2021-01-30T14:51:02.000Z",
    "quote": {
      "USD": {
        "price": 1364.155096452962,
        "volume_24h": 38819994919.48616,
        "percent_change_1h": 1.95180621,
        "percent_change_24h": -3.86551103,
        "percent_change_7d": 10.22893483,
        "percent_change_30d": 85.96783538,
        "market_cap": 156148403262.48172,
        "last_updated": "2021-01-30T14:51:02.000Z"
      }
    },
    "rank": 2,
    "noLazyLoad": true
  },…
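Once the list is parsed, you no longer need to read through the printed output to find a coin's id; you can filter the list instead. This is a rough sketch, assuming each entry carries the "id", "name", "symbol" and "slug" keys seen in the output above. The function name `find_coin_ids` is made up for illustration, and the sample list is just two entries from that output trimmed to the relevant keys.

```python
def find_coin_ids(coins, query):
    """Return (id, name, symbol) for coins whose name, symbol or slug equals query."""
    q = query.strip().lower()
    matches = []
    for coin in coins:
        fields = (coin.get("name"), coin.get("symbol"), coin.get("slug"))
        # Case-insensitive exact match on any of the three text fields.
        if any(isinstance(f, str) and f.lower() == q for f in fields):
            matches.append((coin.get("id"), coin.get("name"), coin.get("symbol")))
    return matches


# Two entries from the truncated output, reduced to the keys used here.
desired_data = [
    {"id": 1, "name": "Bitcoin", "symbol": "BTC", "slug": "bitcoin"},
    {"id": 1027, "name": "Ethereum", "symbol": "ETH", "slug": "ethereum"},
]

print(find_coin_ids(desired_data, "eth"))      # [(1027, 'Ethereum', 'ETH')]
print(find_coin_ids(desired_data, "Bitcoin"))  # [(1, 'Bitcoin', 'BTC')]
```

The function returns a list rather than a single id because ticker symbols are not guaranteed to be unique across coins, so a query may match more than one entry.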