BeautifulSoup find_all returns an empty list
I am trying to scrape a table, but everything find_all returns is an empty list.
Here is the link to the website: link
from bs4 import BeautifulSoup
import requests

html_text = requests.get('some url').text
soup = BeautifulSoup(html_text, 'lxml')
table = soup.find('table', class_='tinytable')
rows = table.find_all('tr')
for row in rows:
    columns = row.find_all('td')
    print(columns)  # Prints out empty lists
If I insert a print statement, I get this:
<td align="right"></td>
<td align="right"><div><a href="http://www.sec.gov/Archives/edgar/data/1841804/000089924321029825/xslF345X03/doc4.xml" target="_blank" title="SEC Form 4">2021-07-23 21:48:35</a></div></td>
<td align="right"><div>2021-07-21</div></td>
<td><b> <a href="/INST" onmouseout="UnTip()" onmouseover="Tip('<img src=\'https://www.profitspi.com/stock/stock-charts.ashx?chart=INST&v=stock-chart&vs=637453390322078326\' alt=\'\' width=\'360px\' height=\'280px\'>', DELAY, 1)">INST</a></b></td>
<td><a href="/INST">Instructure Holdings, Inc.</a></td>
<td><a href="/insider/Bowen-Dale-E./1862625" title="476,765 direct shares
C/O Instructure Holdings, Inc.
6330 South East, Suite 700
Salt Lake City, UT 84121">Bowen Dale E.</a></td>
<td>CFO</td>
<td>P - Purchase</td>
<td align="right">$20.00</td>
<td align="right">+26,250</td>
<td align="right">476,765</td>
<td align="right">+6%</td>
<td align="right">+$525,000</td>
<td align="right"></td>
<td align="right"></td>
<td align="right"></td>
<td align="right"></td>
Here I can see the 'td' tags that find_all should be returning.
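One way to check whether the cells are actually present in the downloaded markup, independently of BeautifulSoup and the lxml parser, is to walk the HTML with the standard library's html.parser. The sketch below is only an illustration: CellCollector and SAMPLE_HTML are hypothetical names and made-up markup, not taken from the site. It also shows that a row containing only th cells (such as a header row) yields an empty list, which may account for at least some of the empty lists, though nothing in the question confirms this as the cause.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of <td> cells, grouped by <tr> row (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []        # one list of cell strings per <tr>
        self._in_td = False   # True while the parser is inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])          # start a new row
        elif tag == "td" and self.rows:
            self._in_td = True
            self.rows[-1].append("")      # start a new cell in the current row

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False

    def handle_data(self, data):
        # Text inside nested tags (<a>, <b>, <div>) still belongs to the open cell.
        if self._in_td:
            self.rows[-1][-1] += data.strip()


# Made-up sample markup standing in for the downloaded page (an assumption).
SAMPLE_HTML = """
<table class="tinytable">
  <thead><tr><th>Ticker</th><th>Title</th></tr></thead>
  <tbody>
    <tr><td><b><a href="/INST">INST</a></b></td><td>CFO</td></tr>
    <tr><td><a href="/DKNG">DKNG</a></td><td>Dir</td></tr>
  </tbody>
</table>
"""

collector = CellCollector()
collector.feed(SAMPLE_HTML)
for cells in collector.rows:
    print(cells)  # the header row has no <td>, so its list is empty
```

To run this against the real page, SAMPLE_HTML would be replaced by the html_text string from the code in the question.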
One approach is to let pandas read the table directly with read_html:
import pandas as pd
df = pd.read_html('http://openinsider.com/screener?s=&o=&pl=&ph=&ll=&lh=&fd=730&fdr=&td=0&tdr=&fdlyl=&fdlyh=&daysago=&xp=1&xs=1&vl=&vh=&ocl=&och=&sic1=-1&sicl=100&sich=9999&grp=0&nfl=&nfh=&nil=&nih=&nol=&noh=&v2l=&v2h=&oc2l=&oc2h=&sortcol=0&cnt=100&page=1',
                  attrs={'class': 'tinytable'})[0]
print(df)
df.to_csv('data.csv', index=False, encoding='utf-8-sig')
Output:
X Filing Date Trade Date Ticker ... 1d 1w 1m 6m
0 NaN 2021-07-23 21:48:35 2021-07-21 INST ... NaN NaN NaN NaN
1 NaN 2021-07-23 21:48:13 2021-07-21 INST ... NaN NaN NaN NaN
2 NaN 2021-07-23 21:46:08 2021-07-23 ROCCU ... NaN NaN NaN NaN
3 NaN 2021-07-23 21:45:35 2021-07-21 INST ... NaN NaN NaN NaN
4 NaN 2021-07-23 21:25:19 2021-07-23 DKNG ... NaN NaN NaN NaN
.. ... ... ... ... ... .. .. .. ..
95 DM 2021-07-23 16:32:14 2021-07-21 CMG ... NaN NaN NaN NaN
96 D 2021-07-23 16:30:57 2021-07-22 HRMY ... NaN NaN NaN NaN
97 NaN 2021-07-23 16:30:44 2021-07-21 ABNB ... NaN NaN NaN NaN
98 D 2021-07-23 16:30:39 2021-07-21 TWST ... NaN NaN NaN NaN
99 D 2021-07-23 16:30:31 2021-07-21 TWST ... NaN NaN NaN NaN
[100 rows x 17 columns]
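A note on the encoding='utf-8-sig' argument passed to to_csv above: that codec writes a UTF-8 byte order mark before the data, which some spreadsheet programs rely on to detect the encoding. The small sketch below uses made-up sample text (not real output from the site) to show that the only difference from plain UTF-8 is the three leading bytes.

```python
import codecs

# Made-up CSV text standing in for the exported data (an assumption).
text = "Ticker,Insider\nINST,Bowen Dale E.\n"

plain = text.encode("utf-8")         # no byte order mark
with_bom = text.encode("utf-8-sig")  # byte order mark prepended

print(with_bom[:3] == codecs.BOM_UTF8)  # the first three bytes are the BOM
print(with_bom[3:] == plain)            # the rest is identical to plain UTF-8
```

If the CSV is only read back by Python or pandas, plain 'utf-8' would likely work just as well.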