Beautiful Soup returning empty set

Beautiful Soup works fine on my local machine, but not on another server.

import urllib2
import bs4

url = urllib2.urlopen("http://www.google.com")
html = url.read()
soup = bs4.BeautifulSoup(html)

print soup

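Printing html correctly outputs Google's web page. Printing soup returns nothing.

Locally it works fine, but on this Red Hat machine it returns an empty result.

Does this have to do with installing a parser? I looked up some other possible solutions that mention installing a parser, but I have had no luck so far.

The solution in "Beautiful Soup returning nothing" does not apply to my problem.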
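One way to look into the parser question on the failing machine is to list which parser libraries Python can import there. The following is only a minimal sketch: `available_parsers` is a hypothetical helper name, not something bs4 provides, and it only reports whether each module imports, not whether bs4 will actually select it.

```python
# Hypothetical helper (not part of bs4): report which HTML parser
# libraries can be imported by the current Python interpreter.
def available_parsers():
    status = {}
    # lxml and html5lib are optional third-party parsers that bs4 can use.
    for name in ("lxml", "html5lib"):
        try:
            __import__(name)
            status[name] = True
        except ImportError:
            status[name] = False
    # The standard-library parser lives in "html.parser" on Python 3
    # and in "HTMLParser" on Python 2.
    try:
        __import__("html.parser")
        status["html.parser"] = True
    except ImportError:
        try:
            __import__("HTMLParser")
            status["html.parser"] = True
        except ImportError:
            status["html.parser"] = False
    return status


if __name__ == "__main__":
    for name, ok in sorted(available_parsers().items()):
        print("%-12s %s" % (name, "available" if ok else "missing"))
```

Running it on both the local machine and the server and comparing the two listings would show whether the machines differ in the parsers they have installed.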

Just to prove to you that your case is unique and has nothing to do with Red Hat.

I spun up a micro Red Hat instance on AWS, and this is the complete process, starting from SSH into that brand-new Red Hat machine.

(1) Here I install beautifulsoup4 on the new machine:

$ ssh -i key.pem ec2-user@awsip
The authenticity of host 'awsip' can't be established.
RSA key fingerprint is ....
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'awsip' (RSA) to the list of known hosts.
[ec2-user@awsip ~]$ sudo easy_install beautifulsoup4
Searching for beautifulsoup4
Reading http://pypi.python.org/simple/beautifulsoup4/
...
Installed /usr/lib/python2.6/site-packages/beautifulsoup4-4.3.2-py2.6.egg
Processing dependencies for beautifulsoup4
Finished processing dependencies for beautifulsoup4

(2) I open Python and print the output from Google, both as raw html and as soup:

[ec2-user@awsip ~]$ python
Python 2.6.6 (r266:84292, May 27 2013, 05:35:12)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib2
>>> from bs4 import BeautifulSoup
>>> html = urllib2.urlopen("http://www.google.com").read()
>>> soup = BeautifulSoup(html)
>>> print html[:100]
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage"><head><meta content="Search t
>>> print soup.prettify()[:100]
<!DOCTYPE html>
<html itemscope="" itemtype="http://schema.org/WebPage">
 <head>
  <meta content="Se

To find out whether the error comes from urllib2 or from bs4, try running the following code:

from bs4 import BeautifulSoup

html = """
<html>
<head>
</head>
<body>
<div id="1">numberone</div>
<div id="2">numbertwo</div>
</body>
</html>
"""

print BeautifulSoup(html).find('div', {"id":"1"})

If beautifulsoup is installed correctly, you will get the expected output shown below:

<div id="1">numberone</div>
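If that test prints nothing on the failing machine, a possible next step is to name the parser explicitly. When no parser is named, bs4 picks the best one it finds installed, so two machines can end up using different parsers. The sketch below repeats the same test with Python's built-in `"html.parser"`; `find_first_div` is a hypothetical helper name, and the sketch assumes bs4 is installed (it returns `None` if it is not).

```python
try:
    from bs4 import BeautifulSoup
except ImportError:
    BeautifulSoup = None  # bs4 is not installed in this interpreter

HTML = """
<html>
<head>
</head>
<body>
<div id="1">numberone</div>
<div id="2">numbertwo</div>
</body>
</html>
"""


def find_first_div(html, parser="html.parser"):
    """Parse html with an explicitly named parser and return the
    div with id="1" as a string, or None if it cannot be found."""
    if BeautifulSoup is None:
        return None
    # The second argument names the parser instead of letting bs4 choose.
    soup = BeautifulSoup(html, parser)
    tag = soup.find("div", {"id": "1"})
    return str(tag) if tag is not None else None


if __name__ == "__main__":
    print(find_first_div(HTML))
```

If this version finds the div while the call without a parser argument does not, that would point at the automatically selected parser on that machine, not at urllib2.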
