How to fix “Cannot find reference 'x' in 'function'” in Python 3.x?

I am currently working on my first Python script, which is supposed to check a URL every XX seconds and notify me if the text at that URL has changed.

My problem is that I can't find a way to refer to a variable outside the function it was defined in.

I tried using a global variable, but this resulted in errors as well.

The current version refers to the variable soup within the scrape function (scrape.soup = doesn't produce errors, while soup = does).

However, line 15 still has trouble finding the variable soup, and it gives me this notification:

Cannot find reference 'soup' in 'function'

from bs4 import BeautifulSoup
import requests
import time

sleeptime = 15

def scrape():
    url = "http://www.pythonforbeginners.com"
    source_code = requests.get(url)
    plain_text = source_code.text
    scrape.soup = BeautifulSoup(plain_text, 'html.parser')

while 1:
    if scrape() == scrape.soup:
        print('Nothing Changed')
    else:
        print("Something Changed!")
        break
    time.sleep(sleeptime)

I expect the script to save the HTML text of 'url' in the variable 'soup'.

The script should compare the latest scrape with the previous scrape and print a notification for each result.

If nothing changed, it should print "Nothing Changed".

If something changed, it should print "Something Changed!".

The script runs without any errors. However, it always prints "Something Changed!".

I am pretty sure this is not correct, as it wouldn't make sense for the content on the site to change every 15 seconds. In addition, I feel there is an error with time.sleep, as the script runs only once and doesn't repeat every 15 seconds.

I would really appreciate any clues that would point me in the right direction.

I think you're missing the concept of return.

def scrape():
    url = "http://www.pythonforbeginners.com"
    source_code = requests.get(url)
    plain_text = source_code.text
    return BeautifulSoup(plain_text, 'html.parser')

Now scrape() returns a new object every time it is called. You can't rely on checking whether the function returns the same object to mean the page content hasn't changed, because each call builds a new one.

If you only care whether the content has changed (at all), then you don't even need to use Beautiful Soup. Just store the page content and compare it on each cycle.
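The "store and compare" idea can be sketched roughly as follows. The names watch_for_change, fetch, and max_checks are made up for illustration; fetch is assumed to be any zero-argument callable that returns the page text, so the sketch runs without a network connection.

```python
import time
from typing import Callable


def watch_for_change(fetch: Callable[[], str], interval: float, max_checks: int) -> bool:
    """Return True as soon as fetch() differs from its first result.

    Hypothetical helper: `fetch` is any callable returning the page text.
    """
    baseline = fetch()            # remember the first version of the page
    for _ in range(max_checks):
        time.sleep(interval)      # wait before checking again
        if fetch() != baseline:   # plain string comparison, no parsing needed
            return True
    return False                  # no difference seen within max_checks
```

With requests, something like watch_for_change(lambda: requests.get(url).text, 15, 100) would presumably fit the original script; bounding the number of checks is just a choice made here to keep the sketch from looping forever.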

Otherwise, you should use your Beautiful Soup object to dig into the page content and extract just the parts you're watching for changes. Then save that text and compare it on each cycle.
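A minimal sketch of extracting only the watched part, assuming the element of interest can be located by an id attribute. The function name extract_watched_text and the ids used in the usage note are hypothetical, not taken from the actual site.

```python
from bs4 import BeautifulSoup


def extract_watched_text(html: str, element_id: str) -> str:
    """Return the text of the element with the given id, or "" if it is absent."""
    soup = BeautifulSoup(html, "html.parser")
    element = soup.find(id=element_id)      # locate the watched element
    if element is None:
        return ""                           # element not on the page
    return element.get_text(strip=True)     # text only, surrounding whitespace removed
```

Comparing extract_watched_text(old_html, "headline") with extract_watched_text(new_html, "headline") ignores changes elsewhere on the page, such as rotating ads or timestamps, which may be why a whole-page comparison reports a change on every cycle.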

Your code

def scrape():
    url = "http://www.pythonforbeginners.com"
    source_code = requests.get(url)
    plain_text = source_code.text
    scrape.soup = BeautifulSoup(plain_text, 'html.parser')

does not return anything, hence it returns None implicitly.

When comparing

if scrape() == scrape.soup:

the two sides will always be different, because scrape() returns None while scrape.soup holds the BeautifulSoup(...) object, which is not None.
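The implicit None can be seen in a small offline sketch. Here a plain string stands in for the BeautifulSoup object (an assumption made so that no network call or parsing is needed); the function otherwise mirrors the shape of the original scrape().

```python
def scrape():
    # Stand-in for the parsed page; the original stores a BeautifulSoup object here.
    scrape.soup = "<html>page</html>"
    # No return statement, so the call itself evaluates to None.


result = scrape()
print(result is None)         # True
print(result == scrape.soup)  # False: None never equals the stored page
```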

It would be better to do:

def scrape():
    url = "http://www.pythonforbeginners.com"
    source_code = requests.get(url)
    plain_text = source_code.text
    return BeautifulSoup(plain_text, 'html.parser')

s = scrape()   # get initial value

while True:
    time.sleep(sleeptime)         # sleep before testing again
    if s.text == scrape().text:   # compare the text of bs
        print('Nothing Changed')
    else:
        print("Something Changed!")
        break

Docs: https://docs.python.org/3/tutorial/controlflow.html#defining-functions

[...] The return statement returns with a value from a function. return without an expression argument returns None. Falling off the end of a function also returns None.

In addition to the 'return' answer: you must declare (and initialize) the variable in the correct scope. If you first assign it inside the function, it stays local to that function. Assign it outside, then compare it with the function's return value.
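The scope point can be illustrated with a small sketch. The function names and the page_text variable are hypothetical and chosen only for this example; no network access is involved.

```python
def scrape_without_return():
    page_text = "page text"  # local: discarded when the function returns


def scrape_with_return():
    page_text = "page text"  # still local, but the value is handed back
    return page_text


scrape_without_return()
leaked = "page_text" in globals()  # checks whether the local name reached module scope

latest = scrape_with_return()      # the caller keeps the value under its own name
print(leaked, latest)
```

The name assigned inside the function is not visible at module level, while the returned value can be stored outside and compared on the next cycle.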

from bs4 import BeautifulSoup
import requests
import time


sleeptime = 15
output = ""

def scrape():
    url = "http://www.pythonforbeginners.com"
    source_code = requests.get(url)
    plain_text = source_code.text
    # get_text() returns the page's text as a string that can be compared
    return BeautifulSoup(plain_text, 'html.parser').get_text()

while 1:
    new_output = scrape() 
    if output == new_output:
        print('Nothing Changed')
    else:
        print("Something Changed!")
        # change output to new output
        output = new_output
    time.sleep(sleeptime)
