How to extract the HTML code for a table on the NASDAQ site using Beautiful Soup and Python

I want to read the HTML code for the table on the following page: https://www.nasdaq.com/symbol/aapl/revenue-eps

To do this, I have used Python and Beautiful Soup.

import urllib
from bs4 import BeautifulSoup
import csv
url = urllib.urlopen("https://www.nasdaq.com/symbol/aapl/revenue-eps")
mylist = []
soup = BeautifulSoup(url,"html.parser")
my_table = soup.find('table',{'class':'ipos'})
print(my_table)

The code above is what I have attempted. When I right-click the table and select Inspect, the class I find on the table is 'ipos', but when I use it in this code, it doesn't work. The only output I get is "None".

I have tested this with another site, and it works perfectly: when I use that site's link and the class of its table, I get the HTML code for that table. That is not the case with this page. Any assistance would be really appreciated.

The table you see is inside an <iframe>. To load the content of this <iframe>, you can use this script:

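import requests
from bs4 import BeautifulSoup

url = "https://www.nasdaq.com/symbol/aapl/revenue-eps"
soup = BeautifulSoup(requests.get(url).text, "html.parser")

# The outer page only holds the <iframe>; its src points to the document with the table
iframe_url = soup.select_one('iframe#frmMain')['src']

# Certificate verification is skipped for the iframe host, so silence the resulting warning
requests.packages.urllib3.disable_warnings()
soup = BeautifulSoup(requests.get(iframe_url, verify=False).text, "html.parser")

table = soup.select_one('table.ipos')

# Print each row with its cells left-aligned in 30-character columns
for tr in table.select('tr'):
    for td in tr.select('td'):
        print('{: <30}'.format(td.get_text(strip=True)), end='')
    print()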

Prints:

Revenue / EPS Summary *                                     Revenue / EPS Summary *                                     
                              Revenue / EPS Summary *                                     


Fiscal Quarter                                              2019(Fiscal Year)                                           2018(Fiscal Year)                                           2017(Fiscal Year)             


December                                                                                                                
Revenue                       $84,310(m)                    $88,293(m)                    $78,351(m)                    
EPS                           4.18 (12/29/2018)             3.89 (12/30/2017)             3.36 (12/31/2016)             
Dividends                     0.73                          0.63                          0.57                          

March                                                                                                                   
Revenue                       $58,015(m)                    $61,137(m)                    $52,896(m)                    
EPS                           2.48 (3/30/2019)              2.74 (3/31/2018)              2.1 (4/1/2017)                
Dividends                     0.77                          0.73                          0.63                          

June                                                                                                                    
Revenue                       $53,809(m)                    $53,265(m)                    $45,408(m)                    
EPS                           2.2 (6/29/2019)               2.36 (6/30/2018)              1.68 (7/1/2017)               
Dividends                     0.77                          0.73                          0.63                          

September  (FYE)                                                                                                        
Revenue                                                     $62,900(m)                    $52,579(m)                    
EPS                                                         2.92 (9/29/2018)              2.07 (9/30/2017)              
Dividends                                                   0.73                          0.63                          


Totals                                                                                                                  
Revenue                       $196,134(m)                   $265,595(m)                   $229,234(m)                   
EPS                           8.86                          11.91                         9.21                          
Dividends                     2.27                          2.82                          2.46                          

Previous 3 Years   
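
To illustrate why the original lookup printed None, here is a minimal offline sketch. The two HTML snippets, the `/frames/revenue.html` path, and `BASE_URL` are made-up stand-ins, not the real NASDAQ markup; only the `frmMain` id and the `ipos` class come from the script above. It assumes the iframe's src may be relative, which is why it is resolved with `urljoin`.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Made-up stand-in for the outer page: it holds only the <iframe> tag.
OUTER_HTML = """
<html><body>
  <h1>Revenue / EPS</h1>
  <iframe id="frmMain" src="/frames/revenue.html"></iframe>
</body></html>
"""

# Made-up stand-in for the separate document the iframe points to.
FRAME_HTML = """
<html><body>
  <table class="ipos"><tr><td>Revenue</td><td>$84,310(m)</td></tr></table>
</body></html>
"""

BASE_URL = "https://example.com/symbol/aapl/revenue-eps"  # hypothetical page URL

outer = BeautifulSoup(OUTER_HTML, "html.parser")

# The table is not part of the outer document, so find() returns None.
table_in_outer = outer.find("table", {"class": "ipos"})

# The iframe's src says where the table really lives; resolve it against
# the page URL in case it is a relative path.
iframe = outer.select_one("iframe#frmMain")
frame_url = urljoin(BASE_URL, iframe["src"])

# In a real run, FRAME_HTML would be the response body fetched from frame_url.
frame = BeautifulSoup(FRAME_HTML, "html.parser")
table_in_frame = frame.find("table", {"class": "ipos"})

print(table_in_outer)
print(frame_url)
print(table_in_frame is not None)
```

The browser's inspector shows the iframe's document merged into the page, which is why the table appears there even though the HTML returned for the outer URL does not contain it.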

The table loads inside an iframe. If you inspect the network requests made by this page, you will find a request like this:

https://fundamentals.nasdaq.com/redpage.asp?selected=AAPL&market=NASDAQ-GS&LogoPath=https%3a%2f%2fwww.nasdaq.com%2flogos%2fAAPL.GIF&coname=Apple%20Inc.

This is the URL that loads the table into the page. Requesting it directly gives you the document that contains the table.

# Python 3: urlopen lives in urllib.request (in Python 2 it was urllib.urlopen)
from urllib.request import urlopen
from bs4 import BeautifulSoup

# URL of the iframe document that actually contains the table
url = "https://fundamentals.nasdaq.com/redpage.asp?selected=AAPL&market=NASDAQ-GS&LogoPath=https%3a%2f%2fwww.nasdaq.com%2flogos%2fAAPL.GIF&coname=Apple%20Inc."
soup = BeautifulSoup(urlopen(url), "html.parser")
my_table = soup.find('table', {'class': 'ipos'})
print(my_table)
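
Since the question's code imports `csv`, the next step after locating the table is probably to turn it into rows. The following is a rough sketch of that step under stated assumptions: `table_to_rows` is a hypothetical helper, and `SAMPLE_HTML` is a simplified stand-in for the iframe document (a few values are borrowed from the output printed earlier). It runs offline, so it can be tried without contacting the site.

```python
import csv
import io

from bs4 import BeautifulSoup

# Simplified stand-in for the iframe document; the real markup is more nested.
SAMPLE_HTML = """
<table class="ipos">
  <tr><td>Fiscal Quarter</td><td>2019</td><td>2018</td></tr>
  <tr><td>Revenue</td><td>$84,310(m)</td><td>$88,293(m)</td></tr>
  <tr><td>Dividends</td><td>0.73</td><td>0.63</td></tr>
</table>
"""


def table_to_rows(html, css_class):
    """Return the table's cell text as a list of rows, or [] if no table matches."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"class": css_class})
    if table is None:
        # Same situation as the question: the class is not in this document.
        return []
    rows = []
    for tr in table.find_all("tr"):
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["td", "th"])]
        if cells:
            rows.append(cells)
    return rows


rows = table_to_rows(SAMPLE_HTML, "ipos")

# Write the rows as CSV into an in-memory buffer instead of a file.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Returning an empty list when the table is missing avoids an AttributeError on None. On the real page the table contains nested cells (the repeated heading in the printed output suggests this), so the rows would likely need some cleanup before use.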
