How can I get frequently updated .php text from a website in Python using BeautifulSoup4?

I would like to create an automatic script to download a .php text file from a webpage which is frequently updated. My program uses requests to get the webpage.

The code:

import os, pathlib, subprocess,requests, time, sys



url = 'http://metar.vatsim.net/metar.php?id=all'

current_dir = pathlib.Path(__file__).parent
os.chdir(current_dir)




icao = sys.argv[1]
fp = requests.get(url)
mybytes = fp.read()

mystr = mybytes.decode("utf8")
fp.close()

dict = {}

fls = str.splitlines(mystr)
for x in range(len(fls)):
    cur = str.split(fls[x])
    dict[cur[0]] = " ".join(cur)
    
try:
    print(dict[icao])
except:
    print('INCORRECT FORMAT OR AIRPORT ID\n')

When I try to read fp, it raises this error:

mybytes = fp.read()
AttributeError: 'Response' object has no attribute 'read'

Is there a better way to solve this? I am kind of stuck.

What you are looking for is urllib.request, not requests. The .read() call in your code is a method of the object returned by urllib.request.urlopen, not of a requests Response.

Maybe this will work:

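import urllib.request

url = 'http://metar.vatsim.net/metar.php?id=all'

# urlopen returns a file-like object, so .read() is available
fp = urllib.request.urlopen(url)
mybytes = fp.read()

mystr = mybytes.decode("utf8")
fp.close()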

This will read the text present in http://metar.vatsim.net/metar.php?id=all.
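
As a minimal sketch of the same urllib approach, the download can be wrapped in a function that uses a with-block, so the connection is closed even if reading fails. The name fetch_text and the 10-second timeout are assumptions for illustration, not part of the original answer.

```python
import urllib.request


def fetch_text(url, timeout=10):
    """Download url and return the response body decoded as UTF-8 text."""
    # The with-block closes the connection even if read() raises
    with urllib.request.urlopen(url, timeout=timeout) as fp:
        return fp.read().decode("utf8")
```

Calling fetch_text(url) with the url from the question should give the same mystr as the snippet above. Each call downloads the page again, so a script that runs it repeatedly picks up the updated text.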

You can absolutely use requests. A Response object has no .read() method; the decoded body is available as its .text attribute, so that is what you want to extract.

Also, don't overwrite the built-in dict by using it as a variable name the way you are doing.

import requests

url = 'http://metar.vatsim.net/metar.php?id=all'
fp = requests.get(url)
mystr = fp.text  # decoded response body; no .read() or .close() needed
a_dict = {}

# Map each station ID (the first token on a line) to its full report line
for line in mystr.splitlines():
    cur = line.split()
    if cur:  # skip blank lines, which would make cur[0] fail
        a_dict[cur[0]] = " ".join(cur)

print(a_dict)
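
To get back to the single-airport lookup from the question, the parsing step can be separated from the download and the ID looked up with dict.get instead of a bare except. This is only a sketch: parse_metars, lookup_metar and the sample lines are hypothetical names and invented data, and it assumes the station ID is the first token on each line, as the question's code does.

```python
def parse_metars(text):
    """Map each station ID (the first token on a line) to its full report line."""
    reports = {}
    for line in text.splitlines():
        parts = line.split()
        if parts:  # ignore blank lines
            reports[parts[0]] = " ".join(parts)
    return reports


def lookup_metar(text, icao):
    """Return the report line for icao, or None if the ID is not in the text."""
    return parse_metars(text).get(icao)


# Invented sample lines standing in for the downloaded text (not real data)
sample = "AAAA 010000Z first report\n\nBBBB 010000Z   second report\n"
print(lookup_metar(sample, "BBBB"))  # BBBB 010000Z second report
print(lookup_metar(sample, "ZZZZ"))  # None
```

In the real script, text would be the .text of the requests response and icao would come from sys.argv[1]; a None result is the case where the question's code printed its 'INCORRECT FORMAT OR AIRPORT ID' message.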
