Scrape data from website with frames or flexbox using python requests and BeautifulSoup

I've been trying to figure this out with no luck. I found a thread (How to scrape data from flexbox element/container with Python and Beautiful Soup) that I thought would help, but I can't seem to make any headway.

The site I'm trying to scrape is http://www.northwest.williams.com/NWP_Portal/. In particular, I want to get the data from the 'Storage Levels' tab/frame, but for the life of me I can't seem to navigate to the right spot to get the data. I've tried various iterations of the code below with no success. I've changed 'lxml' to 'html.parser', looked for tables, looked for 'tr', etc., but the code always returns an empty result. I've also tried looking at the network info, but when I click on any of the tabs (System Status, PAL/System Balancing, etc.) I don't see any change in network activity. I'm sure it's something simple that I'm overlooking, but I just can't put my finger on it.

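from bs4 import BeautifulSoup as soup
import requests

url = 'http://www.northwest.williams.com/NWP_Portal/'

# Fetch the portal page and parse the returned HTML
r = requests.get(url)
html = soup(r.content, 'lxml')

# Look for the panel container; this currently comes back empty
page = html.find_all('div', {'class': 'dailyOperations-panels'})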

How can I 'navigate' to the 'Storage Levels' frame/tab? What is the HTML that I'm actually looking for? Can I do this with just requests and BeautifulSoup? I'm not opposed to using Selenium, but I haven't used it before and would prefer to stick with requests and BeautifulSoup if possible.

Thanks in advance!

Hey, what I notice is that you are trying to get "dailyOperations-panels" from a div, which won't work.
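
One way to narrow this down, as a rough sketch rather than a confirmed fix: check whether that class appears at all in the HTML that requests receives, and list any frame or iframe sources the page points to. The helper names find_frame_urls and has_class are made up for illustration, and I have not verified how the portal page is actually built.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def find_frame_urls(html, base_url):
    """Return absolute URLs for every <frame>/<iframe> src found in html.

    Hypothetical helper: it only reports what the static HTML references.
    """
    soup = BeautifulSoup(html, "html.parser")
    return [
        urljoin(base_url, tag["src"])          # resolve relative src values
        for tag in soup.find_all(["frame", "iframe"])
        if tag.get("src")                      # skip frames without a src
    ]


def has_class(html, class_name):
    """Return True if any element in the static HTML carries class_name."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.find(class_=class_name) is not None
```

To try it on the real page, pass requests.get(url).text (plus url for find_frame_urls) to both helpers. If find_frame_urls returns something, each of those URLs can be requested and parsed on its own. If has_class returns False and no frame URLs come back, the content is probably filled in by JavaScript after the page loads, in which case a browser-driven tool such as Selenium may be needed.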
