
Beautiful Soup returning empty set

Beautiful Soup works fine on my local machine, but not on another server.

import urllib2
import bs4

url = urllib2.urlopen("http://www.google.com")
html = url.read()
soup = bs4.BeautifulSoup(html)

print soup

Printing html correctly outputs Google's web page. Printing soup returns nothing.

Locally it works fine, but on this Red Hat machine it comes back empty.

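Does this have something to do with installing a parser? I looked up some other possible solutions, and they mention installing a parser, but I have had no luck so far.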

The solution in "Beautiful Soup returns nothing" does not apply to my problem.

Just to show you that your case is specific to your machine and has nothing to do with Red Hat itself:

I launched a micro Red Hat instance on AWS, and here is the complete process, starting from SSH-ing into that brand-new Red Hat machine.

(1) Here I install beautifulsoup4 on the new machine:

$ ssh -i key.pem ec2-user@awsip
The authenticity of host 'awsip' can't be established.
RSA key fingerprint is ....
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'awsip' (RSA) to the list of known hosts.
[ec2-user@awsip ~]$ sudo easy_install beautifulsoup4
Searching for beautifulsoup4
Reading http://pypi.python.org/simple/beautifulsoup4/
...
Installed /usr/lib/python2.6/site-packages/beautifulsoup4-4.3.2-py2.6.egg
Processing dependencies for beautifulsoup4
Finished processing dependencies for beautifulsoup4

(2) I open Python, fetch Google's page, and print the output from both the raw HTML and the soup:

[ec2-user@awsip ~]$ python
Python 2.6.6 (r266:84292, May 27 2013, 05:35:12)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib2
>>> from bs4 import BeautifulSoup
>>> html = urllib2.urlopen("http://www.google.com").read()
>>> soup = BeautifulSoup(html)
>>> print html[:100]
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage"><head><meta content="Search t
>>> print soup.prettify()[:100]
<!DOCTYPE html>
<html itemscope="" itemtype="http://schema.org/WebPage">
 <head>
  <meta content="Se

To find out whether the problem lies with urllib2 or with bs4, try running the following code:

from bs4 import BeautifulSoup

html = """
<html>
<head>
</head>
<body>
<div id="1">numberone</div>
<div id="2">numbertwo</div>
</body>
</html>
"""

print BeautifulSoup(html).find('div', {"id":"1"})

If beautifulsoup is installed correctly, you will get the expected output shown below:

<div id="1">numberone</div>
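
If that test also prints nothing, one further check is to see which of the relevant packages the interpreter on the server can actually import. The following is only a minimal sketch, not part of the original answer: the helper name probe_modules is made up for illustration, and the default module list (bs4, plus the optional lxml and html5lib parsers) is an assumption about what is worth probing. It uses nothing outside the standard library, so it should run even where bs4 itself is broken or missing.

```python
import sys


def probe_modules(names=("bs4", "lxml", "html5lib")):
    """Return {module name: version string, or None if the import fails}.

    probe_modules is a hypothetical helper for this sketch, not a bs4 API.
    """
    found = {}
    for name in names:
        try:
            # __import__ is used so the sketch also runs on old Pythons
            # that predate importlib.
            module = __import__(name)
        except ImportError:
            found[name] = None
        else:
            # Not every package exposes __version__, so fall back to a marker.
            found[name] = getattr(module, "__version__", "unknown")
    return found


if __name__ == "__main__":
    print("Python %s" % sys.version.split()[0])
    for name, version in sorted(probe_modules().items()):
        print("%-9s %s" % (name, version if version else "NOT INSTALLED"))
```

Running this on both the working machine and the failing server and comparing the two outputs may show a difference in installed parsers or versions. If bs4 imports but the soup is still empty, it may also be worth naming the parser explicitly, for example BeautifulSoup(html, "html.parser"), so that both machines use the same one.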

