Reading a URL on a Raspberry Pi

Question

I want to read data that is present in a URL. For example, suppose I have the following URL:

http://robolab.in/home-automation.html#ON

I want to read the status "ON" and ignore the rest of the URL. How can I do this?

Answer 1

What you are trying to do is called web scraping. In Python you can do this with the urllib / urllib2 libraries.

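try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib import urlopen  # Python 2

htmltext = ''
try:
    # Download the page and read the raw HTML
    html = urlopen('http://robolab.in/home-automation.html#ON')
    htmltext = html.read()
except IOError:
    # Network failures surface as IOError on both Python 2 and 3
    print('error opening link')

print(htmltext)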

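This prints the HTML text that your browser renders. At this point it is just a string, so you can manipulate it however you like. However, if you have BeautifulSoup installed, you can write code like this: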

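from bs4 import BeautifulSoup

# Parse the HTML with Python's built-in parser
soup = BeautifulSoup(htmltext, 'html.parser')
# Remove <script> and <style> elements so only the visible text remains
for script in soup(["script", "style"]):
    script.extract()
text = soup.get_text()
print(text)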

Running this code against your URL, I got:

Robolab Technologies
Home Automation

OFF

From there you can easily do:

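status = ''
text = text.strip()
# Split into lines (iterating over the string itself would yield single characters)
lines = [line.strip() for line in text.splitlines() if line.strip()]
# The status is the last non-empty line of the output shown above
if lines:
    status = lines[-1]
if 'ON' in status:
    print("it's on")
else:
    print("it's off")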
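
The scraping approach above reads the status from the page content. If the status only needs to come from the URL itself, the part after "#" (the fragment) can be read without downloading anything, since browsers do not send the fragment to the server. A minimal sketch, assuming Python 3; the helper name read_state is hypothetical and not part of the original answer:

```python
from urllib.parse import urlparse  # Python 3; in Python 2 use "from urlparse import urlparse"


def read_state(url):
    """Return the fragment of the URL (the text after '#'), or '' if there is none."""
    return urlparse(url).fragment


# URL taken from the question
url = 'http://robolab.in/home-automation.html#ON'
state = read_state(url)

if state == 'ON':
    print("it's on")
else:
    print("it's off")
```

This only inspects the URL string, so it says nothing about what the page currently displays; the BeautifulSoup approach is still needed for that.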

Answer 1: 2 votes, accepted, 2015-12-07 08:13:18
