How to get a certain table tag from a div tag inside HTML?
I'm trying to get the table information from the website http://www.o1vsk.lv/index.php/stundu-izmainas.
[Image: HTML content of the web page I need to extract]
from bs4 import BeautifulSoup
from urllib.request import urlopen
html = urlopen("http://www.o1vsk.lv/index.php/stundu-izmainas").read()
rows=[]
soup=BeautifulSoup(html,"html.parser")
box = soup.find('div', {'class': 'DRight'})
This program gets all the content of the page, while I need only one small table in text format, like:
Sorry, I cannot comment yet because my reputation is below 50.
Here is my solution for you.
Find every table tag inside the div; this returns the matching HTML code:
table = box.findAll("table")
Let pandas parse that HTML into DataFrames and take the second one (index 1):
df = pd.read_html(str(table))[1]
Filter out the Unnamed columns to keep only the needed ones:
df.loc[:, ~df.columns.str.match('Unnamed')]
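To see what the column filter does in isolation, here is a minimal sketch on a hand-made DataFrame. The column names and values are invented for illustration and are not taken from the real page; only the filtering expression comes from the answer.

```python
import pandas as pd

# Hypothetical data standing in for the scraped table; names and values are made up.
df = pd.DataFrame({
    "Unnamed: 0": [None, None],
    "Klase": ["10.a", "11.b"],
    "Unnamed: 2": [None, None],
    "Stunda": ["3", "5"],
})

# str.match tests the pattern at the start of each column name, so the mask is
# True for "Unnamed: 0" and "Unnamed: 2"; ~ inverts it to select the other columns.
keep = ~df.columns.str.match("Unnamed")
cleaned = df.loc[:, keep]

print(list(cleaned.columns))
```

Note that `df.loc[...]` returns a new DataFrame and does not modify `df`, so the result has to be assigned to a name if you want to keep it.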
Here is the full code:
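import pandas as pd
from bs4 import BeautifulSoup
from urllib.request import urlopen

# Download the page and parse it
html = urlopen("http://www.o1vsk.lv/index.php/stundu-izmainas").read()
soup = BeautifulSoup(html, "html.parser")
# Find the div that holds the tables, then every table inside it
box = soup.find('div', {'class': 'DRight'})
table = box.findAll("table")
# Parse the tables with pandas and take the second one (index 1)
df = pd.read_html(str(table))[1]
# Keep only the columns whose names do not start with 'Unnamed'
df = df.loc[:, ~df.columns.str.match('Unnamed')]
print(df)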
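If plain text is all that is needed, the table can also be read with BeautifulSoup alone, without pandas. The following is a rough sketch that runs offline on an inline HTML sample: the markup, the `table_rows` helper, and its `div_class` and `index` parameters are assumptions made for illustration, and the real page's structure may differ (the code above picks the table at index 1 there).

```python
from bs4 import BeautifulSoup

# Hypothetical markup that only mimics a div with class "DRight" holding a table.
SAMPLE_HTML = """
<div class="DLeft"><table><tr><td>menu</td></tr></table></div>
<div class="DRight">
  <table>
    <tr><th>Klase</th><th>Stunda</th></tr>
    <tr><td>10.a</td><td>3</td></tr>
    <tr><td>11.b</td><td>5</td></tr>
  </table>
</div>
"""


def table_rows(html, div_class="DRight", index=0):
    """Return one table inside the given div as a list of rows of cell text."""
    soup = BeautifulSoup(html, "html.parser")
    box = soup.find("div", class_=div_class)
    if box is None:
        # The div was not found, so there is nothing to extract.
        return []
    tables = box.find_all("table")
    if index >= len(tables):
        return []
    rows = []
    for tr in tables[index].find_all("tr"):
        # Collect header and data cells, stripped of surrounding whitespace.
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:
            rows.append(cells)
    return rows


rows = table_rows(SAMPLE_HTML)
for row in rows:
    # Print each row as tab-separated text.
    print("\t".join(row))
```

Returning an empty list when the div or table is missing avoids the `AttributeError` that `box.findAll(...)` would raise if `soup.find` returned `None`.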
Please upvote if this helps you :) Thanks.