
Same repeated output for loop in Python 3.6

I am scraping pages of 7 domains out of 8. I get the output I want, but for some reason the same output is generated 7 times instead of just once. The simplified code is here:

    def firstpage(pp):
        city = [0, 1, 2, 3, 4, 5, 6, 7]
        p1 = []
        pp = pd.DataFrame()
        
        for i in city:
            response = i
            
            if response > 0:
                p = ['a0', 'a1', 'a2', 'a3', 'a4', 'a5', 'a6', 'a7', 'a8', 'a9', 'a10', 'a11', 'a12', 'a13', 'a14', 'a15', 
    'a16', 'a17', 'a18', 'a19', 'a20', 'a21', 'a22', 'a23']
                for a in p:
                    page = str(a)
                    page = 'https://www.uno.com/' + str(i) + '/' + page
                    p1.append(page)
            else:
                print("error")
        
        pp = pd.DataFrame(p1)
        pp.columns = ['Links']
        pp.to_csv('Test.csv', sep=',')

        return pp 
    
    AllFirstPages = pd.DataFrame()
    %timeit firstpage(AllFirstPages)

I also tried placing the pp block right after p1.append(page).

The same thing is happening: the output is correct but it is running through the loop multiple times, which makes it inefficient.

The correct output is

[Image: output]

What am I doing wrong? Why does the loop run 6 more times, giving the same output each time?

I am thinking of creating the pandas DataFrame outside the loop, but how do I do that in the function?

Thanks!

I think the confusion comes from writing a function that effectively has no input parameter (you aren't using the "pp" parameter as an input; it is overwritten inside the function), then trying to force the result out through it. Apart from some strange design choices, your code works fine like this:

    def firstpage():
        city = [0, 1, 2, 3, 4, 5, 6, 7]
        p1 = []

        for i in city:
            response = i

            if response > 0:
                p = ['a0', 'a1', 'a2', 'a3', 'a4', 'a5', 'a6', 'a7', 'a8', 'a9', 'a10', 'a11', 'a12', 'a13', 'a14', 'a15',
                     'a16', 'a17', 'a18', 'a19', 'a20', 'a21', 'a22', 'a23']
                for a in p:
                    page = str(a)
                    page = 'https://www.uno.com/' + str(i) + '/' + page
                    p1.append(page)
            else:
                print("error")
        return p1

    print(firstpage())
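
One more thing that may explain the repetition, though this is my reading of the question rather than something confirmed there: the function is called through `%timeit`, an IPython magic for benchmarking. It deliberately executes the statement many times (by default 7 runs, each of one or more loops), so any printing or file writing inside the function repeats as well. Below is a minimal sketch of that effect using the standard-library `timeit` module as a stand-in for the magic. The `calls` list is a hypothetical counter added only for illustration, and this compact version skips city 0 silently instead of printing "error".

```python
import timeit

calls = []  # hypothetical counter: one entry per execution of the function body


def firstpage():
    """Return the page links for cities 1..7 (city 0 is skipped silently here)."""
    calls.append(None)
    pages = ['a' + str(n) for n in range(24)]  # 'a0' .. 'a23'
    return ['https://www.uno.com/' + str(city) + '/' + page
            for city in range(1, 8)
            for page in pages]


# A single direct call runs the function body exactly once.
links = firstpage()
print(len(links), len(calls))  # 168 1

# A benchmark runs it repeatedly: here 7 rounds of 1 call each.
timings = timeit.repeat(firstpage, repeat=7, number=1)
print(len(calls))  # 8 (1 direct call + 7 benchmark calls)
```

If the goal is just to get the links once, call the function directly and keep `%timeit` for measuring speed only.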
