I'm trying to print HTML using BeautifulSoup like this:
import urllib2
from bs4 import BeautifulSoup

load = urllib2.urlopen(url)
soup = BeautifulSoup(load, 'lxml')
characteristics = soup.find('table', {'class': 'characteristics-table'})
print characteristics
I get this:
<table class="characteristics-table">
<tr class="characteristics alt">
<td class="name">
Zīmols
</td>
<td>
Emporio Armani</td>
</tr>
<tr class="characteristics">
<td class="name">
<b>Mehānisma tips</b>
</td>
<td>
<b>Mehāniskie automātiskie</b></td>
</tr>...
But I need something like this:
<table class="characteristics-table"><tr class="characteristics alt"><td class="name">Zīmols</td><td>...
How can I do that?
If you just want to remove the newlines in characteristics, use str.replace to replace each newline with an empty string '':
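print str(characteristics).replace('\r\n', '').replace('\n', '')
The first call removes Windows-style newlines (\r\n) and the second, applied to the result of the first, removes any remaining Unix-style newlines (\n). The order matters: if \n were removed first, every \r\n would leave a stray \r behind.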
Edit: .replace has to be applied to the str() of the object returned by BeautifulSoup's find.
''.join(str(characteristics).split('\n'))  # or split on '\r\n' on Windows
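As a rough sketch of the same split-and-join idea, assuming the markup is already a plain string (for example the result of str(characteristics)), str.splitlines() breaks on \n, \r\n and \r alike, so there is no need to pick a line-ending style or worry about replacement order. The helper name strip_newlines and the sample string below are made up for illustration and are not part of BeautifulSoup:

```python
def strip_newlines(markup):
    """Return markup with every line break removed."""
    # splitlines() treats \n, \r\n and \r as line boundaries,
    # so mixed Unix/Windows line endings are handled in one pass.
    return ''.join(markup.splitlines())


# Hypothetical sample mixing Windows and Unix line endings.
sample = '<td>\r\nEmporio Armani</td>\n</tr>'
flat = strip_newlines(sample)
print(flat)  # <td>Emporio Armani</td></tr>
```

Note that this only removes the line breaks themselves. Any indentation inside the markup would remain, so strip each line first if that matters.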