HTML Parsing gives no response
I'm trying to parse a web page, and this is my code:
from bs4 import BeautifulSoup
import urllib2
openurl = urllib2.urlopen("http://pastebin.com/archive/Python")
read = BeautifulSoup(openurl.read())
soup = BeautifulSoup(openurl)
x = soup.find('ul', {"class": "i_p0"})
sp = soup.findAll('a href')
for x in sp:
    print x
I really wish I could be more specific, but as the title says, it gives me no response. No errors, nothing.
First of all, omit the line read = BeautifulSoup(openurl.read()).
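The reason that line matters: the object returned by urlopen() is a file-like stream, and calling read() on it consumes the content, so a later BeautifulSoup(openurl) has nothing left to parse. Here is a minimal sketch of that behaviour. It uses io.BytesIO as a stand-in for the HTTP response (an assumption, so it runs without a network connection) and Python 3 syntax rather than the Python 2 of the question:

```python
import io

# Stand-in for the object returned by urlopen(): a file-like stream of bytes.
# (Assumption: BytesIO is used here only to avoid a real network request.)
response = io.BytesIO(b"<html><body><a href='/x'>x</a></body></html>")

first = response.read()   # consumes the whole stream
second = response.read()  # nothing is left, so this is empty

print(len(first) > 0)
print(second == b"")
```

So the stream should be read, or handed to BeautifulSoup, exactly once.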
Also, the line x = soup.find('ul', {"class": "i_p0"}) doesn't actually make any difference, because you are reusing the x variable in the loop.
Also, soup.findAll('a href') doesn't find anything, because the first argument is a tag name and no tag is named "a href".
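To illustrate the difference, here is a small sketch that runs both searches on an inline HTML snippet. The snippet is made up for the example, and the code uses Python 3 syntax with the built-in "html.parser" backend:

```python
from bs4 import BeautifulSoup

# Hypothetical markup, used only for this illustration.
html = '<ul><li><a href="/abc">first</a></li><li><a href="/def">second</a></li></ul>'
soup = BeautifulSoup(html, "html.parser")

no_match = soup.find_all("a href")  # looks for a tag literally named "a href"
all_links = soup.find_all("a")      # matches every <a> tag

print(len(no_match), len(all_links))
```

The first search comes back empty, while the second returns both links.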
Also, instead of the old-fashioned findAll(), there is a find_all() in BeautifulSoup4.
Here's the code with several alterations:
from bs4 import BeautifulSoup
import urllib2
openurl = urllib2.urlopen("http://pastebin.com/archive/Python")
soup = BeautifulSoup(openurl)
sp = soup.find_all('a')
for x in sp:
    print x['href']
This prints the value of the href attribute of every link on the page.
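One caveat with x['href']: if a page contains an a tag without an href attribute, indexing raises a KeyError, whereas .get('href') returns None. A rough sketch on made-up markup (Python 3 syntax, built-in "html.parser" backend) shows the options:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second anchor has no href attribute.
html = '<p><a href="/abc">with href</a> <a name="top">without href</a></p>'
soup = BeautifulSoup(html, "html.parser")
links = soup.find_all("a")

# .get() returns None when the attribute is missing.
hrefs = [link.get("href") for link in links]

# Indexing raises KeyError when the attribute is missing.
try:
    links[1]["href"]
    raised = False
except KeyError:
    raised = True

# href=True keeps only the <a> tags that actually have an href.
only_with_href = [link["href"] for link in soup.find_all("a", href=True)]

print(hrefs, raised, only_with_href)
```

Filtering with href=True is probably the simplest choice when only real links are of interest.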
Hope that helps.
I altered a couple of lines in your code and I do get a response, though I'm not sure if that is what you want.
Here:
from bs4 import BeautifulSoup
import urllib2
openurl = urllib2.urlopen("http://pastebin.com/archive/Python")
soup = BeautifulSoup(openurl.read())  # This is what you need to use for selecting elements
# soup = BeautifulSoup(openurl)  # This is not needed
# x = soup.find('ul', {"class": "i_p0"})  # You don't seem to be making use of this either
sp = soup.findAll('a')
for x in sp:
    print x.get('href')  # This is to get the href
Hope this helps.