In a div container with a specific class, I have some text inside dl, dt, and dd tags that contains spaces, line breaks, and special characters such as \\ and ?. How can I get rid of them?
container = soup.find_all(name="div", attrs={"class":"4_square"})
The length of container is 1. Any suggestions?
You can find all the dt and dd tags, then remove the special characters and whitespace from their text by replacing them with an empty string. Here is some code you can try.
subject = container[0]
# Query the <dt> and <dd> lists once, then walk them in pairs
dts = subject.dl.find_all('dt')
dds = subject.dl.find_all('dd')
for dt, dd in zip(dts, dds):
    # Remove newlines, spaces, backslashes, and question marks
    temp = dt.text.replace('\n', '').replace(' ', '').replace('\\', '').replace('?', '')
    temp1 = dd.text.replace('\n', '').replace(' ', '').replace('\\', '').replace('?', '')
On each iteration, temp holds the text of a dt tag and temp1 holds the text of the matching dd tag. I hope this works for you.
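If the chained replace calls get hard to read, a single regular expression may be easier to maintain. The following is only a sketch: clean_text and the sample string are hypothetical names made up for illustration, and the pattern assumes that whitespace, backslashes, and question marks are the only characters you want to drop.

```python
import re

# Hypothetical helper (not part of the original answer): removes every
# whitespace character, backslash, and question mark in one pass.
_UNWANTED = re.compile(r"[\s\\?]+")


def clean_text(text: str) -> str:
    """Return text with whitespace, backslashes, and question marks removed."""
    return _UNWANTED.sub("", text)


# Sample string standing in for the .text of a <dt> or <dd> tag
sample = "\n  Open \\ hours? \n"
print(clean_text(sample))
```

Inside the loop you could then call clean_text on the text of each dt and dd tag. Note that this removes the spaces between words as well, just like the replace chain does; if you would rather keep single spaces, " ".join(text.split()) collapses runs of whitespace instead.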