Same repeated output for loop in Python 3.6

I am scraping pages from 7 of 8 domains. I get the output I want, but for some reason the same output is generated 7 times instead of just once. The simplified code is here:

    def firstpage(pp):
        city = [0, 1, 2, 3, 4, 5, 6, 7]
        p1 = []
        pp = pd.DataFrame()
        
        for i in city:
            response = i
            
            if response > 0:
                p = ['a0', 'a1', 'a2', 'a3', 'a4', 'a5', 'a6', 'a7', 'a8', 'a9', 'a10', 'a11', 'a12', 'a13', 'a14', 'a15',
                     'a16', 'a17', 'a18', 'a19', 'a20', 'a21', 'a22', 'a23']
                for a in p:
                    page = str(a)
                    page = 'https://www.uno.com/' + str(i) + '/' + page
                    p1.append(page)
            else:
                print("error")
        
        pp = pd.DataFrame(p1)
        pp.columns = ['Links']
        pp.to_csv('Test.csv', sep=',')

        return pp 
    
    AllFirstPages = pd.DataFrame()
    %timeit firstpage(AllFirstPages)

I also tried placing the pp block right after p1.append(page).

The same thing happens: the output is correct, but the loop runs multiple times, which makes it inefficient.

The correct output is:

[image: output]

What am I doing wrong? Why does the loop run 6 more times and give the same output?

I am thinking of creating the pandas DataFrame outside the loop, but how do I do that inside the function?

Thanks!

I think you are getting confused by writing a function that takes no real input (you aren't using the "pp" parameter as an input in your function) and then trying to force a value into it from outside. Other than some strange design choices, your code works fine like this:

    def firstpage():
        city = [0, 1, 2, 3, 4, 5, 6, 7]
        p1 = []

        for i in city:
            response = i

            if response > 0:
                p = ['a0', 'a1', 'a2', 'a3', 'a4', 'a5', 'a6', 'a7', 'a8', 'a9', 'a10', 'a11', 'a12', 'a13', 'a14', 'a15',
                     'a16', 'a17', 'a18', 'a19', 'a20', 'a21', 'a22', 'a23']
                for a in p:
                    page = str(a)
                    page = 'https://www.uno.com/' + str(i) + '/' + page
                    p1.append(page)
            else:
                print("error")
        return p1

    print(firstpage())
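
One possible explanation for the repeats, which the answer above does not state: the question calls the function through IPython's `%timeit` magic, and timing tools deliberately run the same call several times (recent IPython versions default to 7 runs, each of which may loop more than once). The sketch below uses the standard-library `timeit` module to make that visible. The `call_count` counter and the `repeat=7, number=1` values are assumptions added for illustration, and the `print("error")` branch is left out for brevity.

```python
import timeit

call_count = 0  # hypothetical counter, only here to make repeated calls visible


def firstpage():
    """Build the list of page URLs for cities 1..7 (city 0 is skipped)."""
    global call_count
    call_count += 1
    pages = ['a' + str(n) for n in range(24)]  # 'a0' .. 'a23'
    links = []
    for city in range(8):
        if city > 0:
            for page in pages:
                links.append('https://www.uno.com/' + str(city) + '/' + page)
    return links


# Calling the function directly runs the loop exactly once.
links = firstpage()
print(len(links), call_count)

# A timing harness calls the function again on every repeat.
# repeat=7 and number=1 are assumptions chosen to mirror IPython's default of 7 runs.
timeit.repeat(firstpage, repeat=7, number=1)
print(call_count)
```

If a DataFrame is needed, it can be built once from the returned list after the function returns, for example `pd.DataFrame(links, columns=['Links'])`, and the CSV written once from there, without wrapping the call in a timing magic.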
