I can't find a table with BeautifulSoup
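I'm really new to this, so this may be a silly question, but I can't figure out why this very basic code doesn't find any tables. I'm trying to find a table so that I can get each of its rows. URL: https://uwflow.com/course/cs136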
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
my_url = "https://uwflow.com/course/cs136"
# opening up connection, grabbing the page
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
# html parser
page_soup = soup(page_html, "html.parser")
table = page_soup.findAll('table')
print(table)
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
import pandas as pd
options = Options()
options.add_argument('--headless')
driver = webdriver.Firefox(options=options)
driver.get("https://uwflow.com/course/cs136")
df = pd.read_html(driver.page_source)[0]
df.to_csv('out.csv', index=False)
driver.quit()
Output: view online
I selected Winter 2020. If you want Spring 2020, change [0] to [1].
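To illustrate the indexing: pd.read_html returns a list with one DataFrame per <table> it finds, so [0] and [1] simply pick different tables. The snippet below is a minimal offline sketch that uses a made-up two-table HTML string (the column names and rows are assumptions, not the real uwflow markup) and assumes an HTML parser such as lxml is installed alongside pandas.

```python
import io

import pandas as pd

# Hypothetical stand-in for driver.page_source: two small tables,
# one per term. The real page's columns and rows will differ.
SAMPLE_HTML = """
<table>
  <thead><tr><th>Section</th><th>Term</th></tr></thead>
  <tbody><tr><td>LEC 001</td><td>Winter 2020</td></tr></tbody>
</table>
<table>
  <thead><tr><th>Section</th><th>Term</th></tr></thead>
  <tbody><tr><td>LEC 001</td><td>Spring 2020</td></tr></tbody>
</table>
"""

# read_html returns a list of DataFrames, in document order.
tables = pd.read_html(io.StringIO(SAMPLE_HTML))

winter = tables[0]  # first table on the page
spring = tables[1]  # second table on the page

print(len(tables))
print(spring)
```

Checking len(tables) before indexing is a cheap way to avoid an IndexError if the page layout changes.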
The data is stored in the page inside a <script> tag. If you want a solution without selenium, you can use the re and json modules to parse the data.
For example:
import re
import json
import requests
url = 'https://uwflow.com/course/cs136'
txt = requests.get(url).text
data = json.loads(re.findall(r'window.pageData.courseObj = (\{.*?});', txt)[0])
# print(json.dumps(data, indent=4)) # <-- uncomment this to see all data
print(data['code'] + ' - ' + data['name'])
print(data['description'])
print('{:<10} {:<10} {:<10}'.format('Class', 'Enrolled', 'Campus'))
for section in data['sections']:
    print('{:<10} {:<10} {:<10}'.format(section['class_num'],
        str(section['enrollment_total']) + '/' + str(section['enrollment_capacity']),
        section['campus']))
Prints:
CS 136 - Elementary Algorithm Design and Data Abstraction
This course builds on the techniques and patterns learned in CS 135 while making the transition to use an imperative language. It introduces the design and analysis of algorithms, the management of information, and the programming mechanisms and methodologies required in implementations. Topics discussed include iterative and recursive sorting algorithms; lists, stacks, queues, trees, and their application; abstract data types and their implementations.
Class      Enrolled   Campus
6214       90/90      UW U
6011       59/65      UW U
5914       46/90      UW U
6004       90/90      UW U
6048       90/90      UW U
6109       90/90      UW U
6215       87/90      UW U
6260       90/90      UW U
6261       67/90      UW U
6005       64/65      UW U
... and so on.
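The same extraction can be tried offline. The sketch below runs it against a shortened, made-up page string (SAMPLE_PAGE and the extract_course helper are hypothetical names for illustration, and the embedded object is far smaller than the real one). It also shows why the original BeautifulSoup attempt came back empty: the raw HTML contains no <table> element, only the JSON that the page's JavaScript later renders.

```python
import json
import re

# Hypothetical, shortened stand-in for requests.get(url).text.
SAMPLE_PAGE = """
<html><body>
<div id="app"></div>
<script>
window.pageData.courseObj = {"code": "CS 136", "sections": [{"class_num": "6214", "enrollment_total": 90, "enrollment_capacity": 90}]};
window.pageData.other = {"ignored": true};
</script>
</body></html>
"""


def extract_course(page_text):
    """Return the embedded course object as a dict, or None if it is absent."""
    match = re.search(r"window\.pageData\.courseObj = (\{.*?\});", page_text, re.DOTALL)
    if match is None:
        return None
    return json.loads(match.group(1))


course = extract_course(SAMPLE_PAGE)
print(course["code"])
print("<table" in SAMPLE_PAGE)  # the static HTML has no table to find
```

The non-greedy pattern stops at the first "};", so it would cut the JSON short if a string value ever contained that sequence. Returning None when nothing matches avoids the IndexError that indexing re.findall(...)[0] would raise if the page changes.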