New to web scraping.
I need to get the data from the Daily Observations table (the long table at the end of the page) on this page:
The html of the table starts from <table _ngcontent-c16="" class="tablesaw-sortable" id="history-observation-table">
My code is:
from urllib.request import urlopen
from bs4 import BeautifulSoup

url = "https://www.wunderground.com/history/daily/us/tx/greenville/KGVT/date/2015-01-05?cm_ven=localwx_history"
html = urlopen(url)
soup = BeautifulSoup(html, 'lxml')
soup.findAll(class_="region-content-observation")
And the output is:
[<div class="region-content-observation">
<city-history-observation _nghost-c34=""><div _ngcontent-c34="">
<div _ngcontent-c34="" class="observation-title">Daily Observations</div>
<!-- -->
No Data Recorded
<!-- -->
</div></city-history-observation>
</div>]
So it isn't getting the table and returns "No Data Recorded", although it did get the title.
And when I tried
soup.findAll(class_="tablesaw-sortable")
or
soup.findAll('tr')
each one returned an empty list.
Does anyone know what went wrong?
If you open the web page in Firefox, you can use the Network tab in its Developer Tools to see all the different web resources that are downloaded. The data you are interested in is actually provided by this JSON file, which can be retrieved and then parsed using Python's json library.
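The retrieve-and-parse pattern might look like the sketch below. The exact JSON URL has to be copied from the Network tab (it embeds a site-specific API key), so the URL shown in the comment and the sample payload's field names (observations, temp, valid_time_gmt) are placeholders, not the real response schema:

```python
import json
from urllib.request import urlopen

def fetch_json(url):
    """Download a URL and parse the response body as JSON."""
    with urlopen(url) as response:
        return json.load(response)

# In practice, paste the URL copied from the browser's Network tab:
# data = fetch_json("https://api.weather.com/...copied-from-network-tab...")

# A tiny made-up payload just to show the parsing step; inspect the
# real response's structure first (e.g. print(data.keys())).
sample = '{"observations": [{"valid_time_gmt": 1420416300, "temp": 33}]}'
data = json.loads(sample)
for obs in data["observations"]:
    print(obs["temp"])
```

Whatever the real structure turns out to be, the result is plain Python dicts and lists, which is much easier to work with than the JavaScript-rendered HTML that BeautifulSoup never sees.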
Note: I've never scraped a site that uses API keys, so I'm not sure about the ethics or best practice in this situation. As a test, I was able to download the JSON file without any problems. However, I suspect Weather Underground wouldn't want you using their key too many times, and it looks like they no longer provide free weather API keys.