How to print HTML without line breaks using Python and BeautifulSoup?

Question

I'm trying to print HTML using BeautifulSoup like this:

import urllib2
from bs4 import BeautifulSoup

load = urllib2.urlopen(url)
soup = BeautifulSoup(load, 'lxml')
characteristics = soup.find('table', {'class': 'characteristics-table'})
print characteristics

I get this:

<table class="characteristics-table">
<tr class="characteristics alt">
<td class="name">
Zīmols
</td>
<td>
Emporio Armani</td>
</tr>
<tr class="characteristics">
<td class="name">
<b>Mehānisma tips</b>
</td>
<td>
<b>Mehāniskie automātiskie</b></td>
</tr>...

But I need something like this:

<table class="characteristics-table"><tr class="characteristics alt"><td class="name">Zīmols</td><td>...

How can I do this?

Answer 1

If you just want to remove the newlines in characteristics, use str.replace to replace each newline with an empty string '':

print str(characteristics).replace('\r\n', '').replace('\n', '')

The first call removes Windows-style newlines (\r\n) and the second, applied to the result of the first, removes the remaining Unix-style newlines (\n). The order matters: removing \n first would turn every \r\n into a lone \r, which the second call could no longer match.

Edit: .replace has to be applied to the str() of the object returned by BeautifulSoup's find, not to the object itself.
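
When both newline styles may be present, the order of chained replace calls decides whether stray carriage returns survive. A minimal sketch in Python 3, operating on a plain string: the helper name remove_newlines and the sample markup are made up for illustration, and the sample stands in for str(characteristics).

```python
def remove_newlines(markup):
    # Strip Windows line endings first; stripping '\n' first would leave
    # a lone '\r' behind wherever the input contained '\r\n'.
    return markup.replace('\r\n', '').replace('\n', '')


# Hypothetical sample standing in for str(characteristics),
# mixing Windows and Unix line endings.
sample = '<td class="name">\r\nZīmols\r\n</td>\n<td>\nEmporio Armani</td>'

print(remove_newlines(sample))

# Reversed order for comparison: the carriage returns are still there.
print(repr(sample.replace('\n', '').replace('\r\n', '')))
```

Note that this only removes line breaks. If the markup were indented, the indentation would remain between the tags.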

Answer 2

''.join(str(characteristics).split('\n'))   # or '\r\n' on Windows
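
A variant of the same idea that avoids choosing between line-ending styles might use str.splitlines(), which splits on \n, \r\n, \r and a few other line-boundary characters. This is a sketch in Python 3 under the assumption that the tag has already been converted with str(); the helper name join_lines and the sample markup are invented for the example.

```python
def join_lines(markup):
    # splitlines() drops the line endings, whichever style they use,
    # so joining the pieces yields the markup on a single line.
    return ''.join(markup.splitlines())


# Hypothetical sample standing in for str(characteristics).
sample = (
    '<table class="characteristics-table">\n'
    '<tr class="characteristics alt">\r\n'
    '<td class="name">\n'
    'Zīmols\n'
    '</td>\n'
    '</tr>\n'
    '</table>'
)

print(join_lines(sample))
```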

Answer 1
2 2017-10-28 02:46:16

Answer 2
1 Accepted 2017-10-28 02:51:27
