How to print the data between <p> and </p> with BeautifulSoup

I am writing a script to scrape the high and low temperatures from a weather site. I have gotten it to print what I need, but the output still includes all of the HTML tags that BeautifulSoup returns.

This is my current script:

import urllib2
from bs4 import BeautifulSoup

website = "http://forecast.weather.gov/MapClick.php?lat=39.90489741058809&lon=-82.7617367885212&site=all&smap=1#.VPyDd4F4qAQ"

r1 = urllib2.urlopen(website)
mydata = r1.read()
soup = BeautifulSoup(mydata)
s = soup.prettify()
x = soup.find_all("p", attrs={"class": "point-forecast-icons-low"})
y = soup.find_all("p", attrs={"class": "point-forecast-icons-high"})

print x
print y

It gives me this:

[<p class="point-forecast-icons-low">Low: 40 °F</p>, <p class="point-forecast-icons-low">Low: 48 °F</p>, <p class="point-forecast-icons-low">Low: 26 °F</p>, <p class="point-forecast-icons-low">Low: 31 °F</p>, <p class="point-forecast-icons-low">Low: 32 °F</p>]
[<p class="point-forecast-icons-high">High: 67 °F</p>, <p class="point-forecast-icons-high">High: 53 °F</p>, <p class="point-forecast-icons-high">High: 44 °F</p>, <p class="point-forecast-icons-high">High: 47 °F</p>]

But I just want the parts that say "High: ##" and "Low: ##".

How do I do that?

find_all() returns a list of elements, so printing that list shows each element's full markup. Use the .text attribute on each individual element instead:

lows = [low.text for low in soup.find_all("p", class_="point-forecast-icons-low")]
highs = [high.text for high in soup.find_all("p", class_="point-forecast-icons-high")]

This produces:

>>> lows = [low.text for low in soup.find_all("p", class_="point-forecast-icons-low")]
>>> highs = [high.text for high in soup.find_all("p", class_="point-forecast-icons-high")]
>>> lows
[u'Low: 40 \xb0F', u'Low: 48 \xb0F', u'Low: 26 \xb0F', u'Low: 31 \xb0F', u'Low: 32 \xb0F']
>>> highs
[u'High: 67 \xb0F', u'High: 53 \xb0F', u'High: 44 \xb0F', u'High: 47 \xb0F']
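
The same approach can be tried without fetching the live page. The sketch below assumes Python 3 with the bs4 package installed. It uses a small hand-written snippet, sample_html, as a stand-in modeled on the output in the question rather than the real page, and shows .text next to the equivalent get_text() call:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup modeled on the question's output; the real page has far more.
sample_html = """
<div>
  <p class="point-forecast-icons-high">High: 67 °F</p>
  <p class="point-forecast-icons-low">Low: 40 °F</p>
  <p class="point-forecast-icons-high">High: 53 °F</p>
  <p class="point-forecast-icons-low">Low: 48 °F</p>
</div>
"""

# Name the parser explicitly so the result does not depend on which parsers are installed.
soup = BeautifulSoup(sample_html, "html.parser")

# .text returns the text inside each tag, without the surrounding markup.
highs = [p.text for p in soup.find_all("p", class_="point-forecast-icons-high")]

# get_text(strip=True) does the same and also strips whitespace around each piece of text.
lows = [p.get_text(strip=True) for p in soup.find_all("p", class_="point-forecast-icons-low")]

for line in highs + lows:
    print(line)
```

Passing "html.parser" is a choice made for this sketch; any parser that BeautifulSoup supports should give the same text for markup this simple.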

The ° in °F is not a printable ASCII character, so Python 2 shows it as the \xb0 escape sequence when the string is displayed as part of a list (a list is displayed using the repr() of each item). Printing an individual value shows the character itself:

>>> print highs[0]
High: 67 °F
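
To print every value rather than a single one, loop over the list. The sketch below assumes Python 3, where print is a function and lists display non-ASCII characters unescaped. It uses the built-in ascii() to reproduce the escaped form seen above. The highs list is hard-coded from the output above so the snippet runs on its own:

```python
# Values copied from the output above, so no network access or parsing is needed here.
highs = ["High: 67 \u00b0F", "High: 53 \u00b0F", "High: 44 \u00b0F", "High: 47 \u00b0F"]

# Printing each string separately shows the degree sign itself.
for high in highs:
    print(high)

# Or join everything into one string and print it in a single call.
joined = ", ".join(highs)
print(joined)

# ascii() gives the escaped representation, similar to what Python 2 shows inside a list.
escaped = ascii(highs[0])
print(escaped)
```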
