Beautiful Soup cleaning and errors

Question

I have the following code:

from bs4 import BeautifulSoup
import urllib2
from lxml import html
from lxml.etree import tostring 
trees = urllib2.urlopen('http://aviationweather.gov/adds/metars/index?station_ids=KJFK&std_trans=translated&chk_metars=on&hoursStr=most+recent+only&chk_tafs=on&submit=Submit').read()
soup = BeautifulSoup(open(trees))
print soup.get_text()
item=soup.findAll(id="info")
print item

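However, when I type soup in the interactive window it gives me an error, and when my program runs it prints a very long block of HTML code, and so on. Any help would be great.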

Answer 1

The first problem is in this part:

trees = urllib2.urlopen('http://aviationweather.gov/adds/metars/index?station_ids=KJFK&std_trans=translated&chk_metars=on&hoursStr=most+recent+only&chk_tafs=on&submit=Submit').read()
soup = BeautifulSoup(open(trees))

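Because of the .read() call, trees already holds the page's HTML as a string. open() expects a file path, so there is no need to call it here; pass trees to BeautifulSoup directly: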

soup = BeautifulSoup(trees, "html.parser")

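We also explicitly set html.parser as the underlying parser, so the result does not depend on which parsers happen to be installed.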
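
To make the difference concrete, here is a minimal sketch that needs no network access. It is written for Python 3 (the code in the question is Python 2), and html_text is a made-up stand-in for what urlopen(...).read() returns, not the real page. It shows open() rejecting markup as a path while BeautifulSoup parses the same string without complaint.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the string returned by urlopen(...).read().
html_text = "<html><body><p id='info'>hello</p></body></html>"


def try_open(text):
    """Return the name of the exception open() raises for `text`, or None."""
    try:
        open(text).close()
    except (IOError, OSError) as exc:
        # open() treats its argument as a file path, and no such file exists.
        return type(exc).__name__
    return None


error_name = try_open(html_text)

# BeautifulSoup accepts the markup string itself (or an open file object).
soup = BeautifulSoup(html_text, "html.parser")
info_text = soup.find(id="info").get_text()

print(error_name)
print(info_text)
```

The exact exception name depends on the platform, which is why the sketch only reports it rather than assuming one.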

Then you need to be specific about what you want to extract from the page. Here is sample code that gets the METAR text value:

from bs4 import BeautifulSoup
import urllib2


# Download the page; .read() returns the HTML as a string.
trees = urllib2.urlopen('http://aviationweather.gov/adds/metars/index?station_ids=KJFK&std_trans=translated&chk_metars=on&hoursStr=most+recent+only&chk_tafs=on&submit=Submit').read()
soup = BeautifulSoup(trees, "html.parser")

# Find the "METAR text:" label, then take the text of the next <strong> element.
item = soup.find("strong", text="METAR text:").find_next("strong").get_text(strip=True).replace("\n", "")
print item

This prints KJFK 220151Z 20016KT 10SM BKN250 24/21 A3007 RMK AO2 SLP183 T02440206.
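
The label-then-value lookup can also be tried offline. The sketch below is Python 3 and rests on several assumptions: the sample markup is invented to resemble a label cell followed by a value cell and is not copied from the real page, extract_metar is a hypothetical helper name, and string= is the newer spelling of the text= argument (Beautiful Soup 4.4.0 and later). It also returns None instead of raising AttributeError when the label is missing, and it collapses whitespace with split/join rather than the replace("\n", "") used above.

```python
from bs4 import BeautifulSoup

# Invented sample markup; the real page layout may differ.
SAMPLE = """
<table>
  <tr>
    <td><strong>METAR text:</strong></td>
    <td><strong>
      KJFK 220151Z 20016KT
      10SM BKN250
    </strong></td>
  </tr>
</table>
"""


def extract_metar(markup):
    """Return the text of the <strong> following the 'METAR text:' label, or None."""
    soup = BeautifulSoup(markup, "html.parser")
    label = soup.find("strong", string="METAR text:")
    if label is None:
        return None  # label not present on this page
    value = label.find_next("strong")
    if value is None:
        return None  # label present but nothing follows it
    # Collapse runs of whitespace (including newlines) into single spaces.
    return " ".join(value.get_text().split())


metar = extract_metar(SAMPLE)
missing = extract_metar("<p>no report here</p>")
print(metar)
print(missing)
```

Checking for None at each step matters for scraped pages, where a layout change would otherwise surface as an AttributeError on NoneType.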

Solution 1
0 Accepted 2016-07-22 02:11:55