
`return` exits the function after the first iteration of the loop

I know I am missing a really small concept here.

Here is what I am trying to do: return the titles of all files with the "*.html" extension in a directory.

However, the function I wrote returns only the first file's title. If I use `print` instead, it prints all of them.

def titles():
    for file_name in glob.glob(os.path.join(dir_path, "*.html")):
        with open(file_name) as html_file:
            soup = BeautifulSoup(html_file)
            return str(soup.title.get_text().strip())
titles()

`return` exits the function, giving you only the result of the first iteration. Once the function returns, control is passed back to the caller. It does not resume.
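To see the effect in isolation, here is a minimal sketch that does not touch the file system. The names `first_only` and `all_items` are made up for illustration and are not part of the question's code.

```python
def first_only(items):
    # Same pattern as in the question: return sits inside the loop,
    # so the function ends during the first pass.
    for item in items:
        return item.upper()


def all_items(items):
    # Collect every result and return once, after the loop has finished.
    results = []
    for item in items:
        results.append(item.upper())
    return results


words = ["a", "b", "c"]
print(first_only(words))  # A
print(all_items(words))   # ['A', 'B', 'C']
```

Note that `first_only([])` returns `None`, because the loop body never runs and the function falls off the end.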

As a solution, you have two options.

Option 1 (recommended for a large amount of data): Change `return` to `yield`. Using `yield` converts your function into a generator whose values you can loop over:

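import glob
import os
from bs4 import BeautifulSoup

# dir_path is the directory to search, defined elsewhere as in the question
def titles():
    for file_name in glob.glob(os.path.join(dir_path, "*.html")):
        with open(file_name) as html_file:
            # naming the parser explicitly avoids bs4's "no parser specified" warning
            soup = BeautifulSoup(html_file, "html.parser")

        yield soup.title.get_text().strip() # yield inside the loop, happens multiple times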

for s in titles():
    print(s)
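It may help to see that a generator does no work until it is asked for a value. The sketch below uses a hypothetical `shout` function in place of `titles()` so that it runs without any HTML files.

```python
def shout(words):
    for word in words:
        print("processing", word)
        yield word.upper()  # execution pauses here until the next value is requested


gen = shout(["a", "b"])  # nothing is printed yet: the body has not started
first = next(gen)        # runs the body up to the first yield
rest = list(gen)         # resumes after the yield and drains the remaining values
again = list(gen)        # a generator can be consumed only once

print(first, rest, again)
```

If you need the results more than once, either call the generator function again or store the values with `list(...)`.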

Option 2: Store all your output in a list and return the list at the end:

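import glob
import os
from bs4 import BeautifulSoup

# dir_path is the directory to search, defined elsewhere as in the question
def titles():
    data = []
    for file_name in glob.glob(os.path.join(dir_path, "*.html")):
        with open(file_name) as html_file:
            # naming the parser explicitly avoids bs4's "no parser specified" warning
            soup = BeautifulSoup(html_file, "html.parser")
        data.append(soup.title.get_text().strip())

    return data # return outside the loop, happens once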

print(titles())
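For anyone who wants to try the list approach without installing anything, here is a self-contained sketch. It is not the code above: it swaps BeautifulSoup for the standard library's `html.parser`, takes the directory as a parameter instead of a global `dir_path`, and creates two throwaway HTML files in a temporary directory. The `TitleParser` class and the file names are assumptions made for the example.

```python
import glob
import os
import tempfile
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text found inside <title> elements."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def titles(dir_path):
    data = []
    # sorted() only makes the order predictable for this demo
    for file_name in sorted(glob.glob(os.path.join(dir_path, "*.html"))):
        with open(file_name, encoding="utf-8") as html_file:
            parser = TitleParser()
            parser.feed(html_file.read())
        data.append(parser.title.strip())

    return data  # return outside the loop, happens once


with tempfile.TemporaryDirectory() as tmp:
    for name, title in [("a.html", "First"), ("b.html", "Second")]:
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as f:
            f.write(f"<html><head><title> {title} </title></head></html>")
    result = titles(tmp)

print(result)  # ['First', 'Second']
```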

You have two choices. Either add each result to a local data structure (say, a list) in the loop and return the list after the loop, or make this function a generator and yield each result in the loop (no return).

The return approach is fine for smaller data sets. The generator approach is friendlier, or even necessary, for larger data sets.
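A rough illustration of why size matters, using made-up functions `squares_as_list` and `squares_as_generator` rather than the HTML example: the list version has to build every element before returning, while the generator version produces values one at a time, so the caller can stop early.

```python
from itertools import islice


def squares_as_list(n):
    # Builds all n results in memory before returning anything.
    return [i * i for i in range(n)]


def squares_as_generator(n):
    # Produces one result at a time; nothing is stored.
    for i in range(n):
        yield i * i


# Taking three values from a huge generator is cheap, because only
# three iterations of the loop ever run.
first_three = list(islice(squares_as_generator(10**12), 3))
print(first_three)  # [0, 1, 4]

# The list version is fine when n is small.
print(squares_as_list(3))  # [0, 1, 4]
```

Calling `squares_as_list(10**12)` would try to allocate the whole list up front, which is the situation where the generator becomes necessary rather than just convenient.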


 