I wrote some code to extract parts of tables, but I want to extract every table tag from the input and then store them in a separate HTML file.
from bs4 import BeautifulSoup
soup = BeautifulSoup(myInput)
table = soup.find('table', {'class': '*'})
I expected the code to show me all the tables contained in the input text, but it outputs an error because the * is not defined.
EDIT: by * I mean every table in the file, like the wildcard in *.txt
class is the attribute you are searching on, but you have to tell soup which class value to use to get the table. The '*' is not a wildcard here; it is matched literally as a class name.
<table class='HiClass'>
A
</table>
<table class='MiClass'>
B
</table>
<table class='*'>
C
</table>
For instance,
table1 = soup.find('table', {'class': '*'})
table2 = soup.find('table', {'class': 'HiClass'})
You'll get the "C" table in table1 and the "A" table in table2.
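The behaviour described above can be checked with a small, self-contained sketch. It assumes the three sample tables shown earlier and uses the built-in html.parser (an assumption made here so that no extra parser needs to be installed); 'NoSuchClass' is a made-up class name used only for illustration.

```python
from bs4 import BeautifulSoup

# Sample markup from the answer: three tables with different class values.
html = """
<table class='HiClass'>
A
</table>
<table class='MiClass'>
B
</table>
<table class='*'>
C
</table>
"""

# html.parser is an assumption; it ships with Python.
soup = BeautifulSoup(html, "html.parser")

# '*' is compared literally, so only the table whose class is '*' matches.
table1 = soup.find("table", {"class": "*"})
table2 = soup.find("table", {"class": "HiClass"})
# A class that does not occur in the markup gives None rather than an error.
missing = soup.find("table", {"class": "NoSuchClass"})

print(table1.get_text(strip=True))
print(table2.get_text(strip=True))
print(missing)
```

If find returns None and the code then calls a method on the result, that is one likely source of an error like the one in the question.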
To get all tables, just use
table = soup.find_all('table')
and you will get every element that uses the <table> tag, returned as a list (find_all is the current name; findAll is the older alias and still works).
Demo:
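import requests
from bs4 import BeautifulSoup

def get_request(url):
    # Download the page and parse it (the html5lib parser must be installed)
    r = requests.get(url)
    soup = BeautifulSoup(r.content, 'html5lib')
    # Collect every <table> element on the page as a list
    tables = soup.find_all('table')
    return tables

url = 'https://www.w3schools.com/html/html_tables.asp'
print(get_request(url))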
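The question also asks about storing the tables in a separate HTML file, which the demo does not cover. One possible way to do that is sketched below. The function names build_tables_page and save_tables and the output file name tables.html are hypothetical, chosen only for this example, and html.parser is assumed as the parser.

```python
from pathlib import Path
from bs4 import BeautifulSoup


def build_tables_page(html):
    """Return a minimal HTML document containing only the tables found in html."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all returns every <table>, including tables nested inside other
    # tables, so nested tables would appear twice in the output.
    tables = soup.find_all("table")
    body = "\n".join(str(t) for t in tables)
    return f"<!DOCTYPE html>\n<html>\n<body>\n{body}\n</body>\n</html>\n"


def save_tables(html, out_path):
    """Write the extracted tables to out_path as a standalone HTML file."""
    Path(out_path).write_text(build_tables_page(html), encoding="utf-8")


if __name__ == "__main__":
    sample = "<p>intro</p><table class='HiClass'><tr><td>A</td></tr></table>"
    save_tables(sample, "tables.html")  # hypothetical output file name
```

Calling str() on each tag keeps the original attributes and inner markup, so the saved file can be opened directly in a browser.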