I'm trying to extract table data from the website http://www.o1vsk.lv/index.php/stundu-izmainas. Here is my code so far, which fetches the HTML content of the page:
from bs4 import BeautifulSoup
from urllib.request import urlopen
html = urlopen("http://www.o1vsk.lv/index.php/stundu-izmainas").read()
rows=[]
soup=BeautifulSoup(html,"html.parser")
box = soup.find('div', {'class': 'DRight'})
This program gets the entire content of the page, but I need only one small table, in text format.
Here is my solution for you.
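Find all table tags inside the div; this returns their HTML code:
table = box.findAll("table")
Parse that HTML with pandas and take the second table (index 1):
df = pd.read_html(str(table))[1]
Drop the Unnamed columns to keep only the needed columns:
df.loc[:, ~df.columns.str.match('Unnamed')]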
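To see what the column filter does in isolation, here is a minimal sketch on a hand-made DataFrame. The column names and values are hypothetical stand-ins for whatever pd.read_html returns for the real page, which may look different.

```python
import pandas as pd

# Hypothetical data standing in for the result of pd.read_html;
# the real table's columns are an assumption here.
df = pd.DataFrame({
    "Unnamed: 0": [None, None],
    "Class": ["10.a", "11.b"],
    "Lesson": ["Math", "History"],
    "Unnamed: 3": [None, None],
})

# str.match anchors at the start of each column name,
# so it flags every column whose name begins with "Unnamed".
is_unnamed = df.columns.str.match("Unnamed")

# Invert the boolean mask to keep only the named columns.
cleaned = df.loc[:, ~is_unnamed]
print(list(cleaned.columns))  # ['Class', 'Lesson']
```

Note that df.loc returns a new DataFrame, so the result has to be assigned or printed to be of any use in a script.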
Here is the full code:
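import pandas as pd
from io import StringIO
from bs4 import BeautifulSoup
from urllib.request import urlopen
# Download and parse the page
html = urlopen("http://www.o1vsk.lv/index.php/stundu-izmainas").read()
soup = BeautifulSoup(html, "html.parser")
# Narrow the search to the div that holds the tables
box = soup.find('div', {'class': 'DRight'})
table = box.findAll("table")
# Wrap the HTML in StringIO (newer pandas versions deprecate literal HTML strings); [1] picks the second table
df = pd.read_html(StringIO(str(table)))[1]
# Drop the 'Unnamed' columns, keep the result, and print it
df = df.loc[:, ~df.columns.str.match('Unnamed')]
print(df)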
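Since the question asks for the table as plain text, a rough alternative is to skip pandas and walk the rows with BeautifulSoup directly. This is only a sketch: SAMPLE_HTML is hypothetical markup (the real page's structure may differ), and table_to_text is a helper name made up for this example.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page's structure may differ.
SAMPLE_HTML = """
<div class="DRight">
  <table><tr><td>navigation</td></tr></table>
  <table>
    <tr><th>Class</th><th>Lesson</th><th>Room</th></tr>
    <tr><td>10.a</td><td>Math</td><td>21</td></tr>
    <tr><td>11.b</td><td>History</td><td>14</td></tr>
  </table>
</div>
"""


def table_to_text(table):
    """Return one tab-separated line per table row (hypothetical helper)."""
    lines = []
    for row in table.find_all("tr"):
        # Collect header and data cells alike, trimming whitespace.
        cells = [cell.get_text(strip=True) for cell in row.find_all(["th", "td"])]
        lines.append("\t".join(cells))
    return "\n".join(lines)


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
box = soup.find("div", {"class": "DRight"})
tables = box.find_all("table")

# Take the second table, mirroring the [1] index used in the answer.
text = table_to_text(tables[1])
print(text)
```

For the live page, SAMPLE_HTML would be replaced by the downloaded html, and the index may need adjusting depending on how many tables the div actually contains.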