Get the href link in table columns in Python with Beautiful Soup
I have this data, as you can see:
[<td><span></span></td>, <td><span></span></td>, <td><a class="cmc-link" href="/currencies/renbtc/"><span class="circle"></span><span>renBTC</span><span class="crypto-symbol">RENBTC</span></a></td>, <td><span>$<!-- -->61947.68</span></td>, <td><span></span></td>]
I want to extract the href link, which here is /currencies/renbtc/.
This is my code:
from bs4 import BeautifulSoup
import requests

try:
    r = requests.get('https://coinmarketcap.com/')
    soup = BeautifulSoup(r.text, 'lxml')
    table = soup.find('table', class_='cmc-table')
    for row in table.tbody.find_all('tr'):
        # Find all data for each column
        columns = row.find_all('td')
        print(columns)
except requests.exceptions.RequestException as e:
    print(e)
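But the output is every column in full, not just the link.
Iterate over the <td> elements in the list and, if a <td> contains an <a> (if td.a), call .get('href') on td.a: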
from bs4 import BeautifulSoup
import requests

try:
    r = requests.get('https://coinmarketcap.com/')
    soup = BeautifulSoup(r.text, 'lxml')
    table = soup.find('table', class_='cmc-table')
    for row in table.tbody.find_all('tr'):
        # Find all data for each column
        columns = row.find_all('td')
        for td in columns:
            if td.a:
                print(td.a.get('href'))
                # Theoretically, for performance, you can `break` here
                # to stop the loop if you expect only one anchor link per row.
except requests.exceptions.RequestException as e:
    print(e)
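To try the same loop without hitting the network, here is a minimal sketch that runs it against the row from the question. The surrounding <table>/<tbody>/<tr> wrapper, the helper name extract_hrefs, and the use of the built-in html.parser instead of lxml are assumptions made for illustration; the live page may be structured differently.

```python
from bs4 import BeautifulSoup

# The five cells come from the question; the table/tbody/tr wrapper is assumed.
html = """
<table class="cmc-table"><tbody><tr>
<td><span></span></td>
<td><span></span></td>
<td><a class="cmc-link" href="/currencies/renbtc/"><span class="circle"></span><span>renBTC</span><span class="crypto-symbol">RENBTC</span></a></td>
<td><span>$<!-- -->61947.68</span></td>
<td><span></span></td>
</tr></tbody></table>
"""


def extract_hrefs(markup):
    """Return the first href found in each row (hypothetical helper)."""
    soup = BeautifulSoup(markup, "html.parser")
    table = soup.find("table", class_="cmc-table")
    if table is None:
        # find() returns None when nothing matches, so guard before iterating.
        return []
    hrefs = []
    for row in table.find_all("tr"):
        for td in row.find_all("td"):
            # td.a is the first <a> inside the cell, or None if there is none.
            if td.a and td.a.get("href"):
                hrefs.append(td.a["href"])
                break  # assume at most one link of interest per row
    return hrefs


hrefs = extract_hrefs(html)
print(hrefs)
```

The None check on the table is worth keeping in real code as well: if the page layout changes or the table is not present in the fetched HTML, soup.find returns None and table.tbody would raise an AttributeError.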
Alternatively, work directly on the column that contains the <a>: select it and read its href:
link = columns[2].a['href']
from bs4 import BeautifulSoup
import requests

try:
    r = requests.get('https://coinmarketcap.com/')
    soup = BeautifulSoup(r.text, 'lxml')
    table = soup.find('table', class_='cmc-table')
    for row in table.tbody.find_all('tr'):
        # Find all data for each column
        columns = row.find_all('td')
        link = columns[2].a['href']
        print(link)
except requests.exceptions.RequestException as e:
    print(e)
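Indexing with columns[2].a['href'] raises an error for any row whose third cell has no <a>, or that has fewer than three cells. One possible variant, sketched below, uses Beautiful Soup's CSS selector support to target the third cell and skips rows without a link. The sample markup (in particular the second, link-less row) and the helper name third_column_links are made up for illustration.

```python
from bs4 import BeautifulSoup

# First row mirrors the question's data; the second row is a hypothetical
# row whose third cell has no link, to show that it is skipped safely.
html = """
<table class="cmc-table"><tbody>
<tr>
<td><span></span></td>
<td><span></span></td>
<td><a class="cmc-link" href="/currencies/renbtc/"><span>renBTC</span></a></td>
<td><span>$61947.68</span></td>
</tr>
<tr>
<td><span></span></td>
<td><span></span></td>
<td><span>no link here</span></td>
<td><span>$1.00</span></td>
</tr>
</tbody></table>
"""


def third_column_links(markup):
    """Collect the href from the third <td> of each row, if it has one."""
    soup = BeautifulSoup(markup, "html.parser")
    links = []
    for row in soup.select("table.cmc-table tbody tr"):
        # select_one returns None instead of raising when nothing matches.
        anchor = row.select_one("td:nth-of-type(3) a[href]")
        if anchor is not None:
            links.append(anchor["href"])
    return links


links = third_column_links(html)
print(links)
```

Compared with positional indexing, this trades a little verbosity for not crashing on irregular rows; whether the link really always sits in the third column is an assumption carried over from the sample data.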