Incomplete download using requests in Python

Question

I'm following an online project that predicts the air quality index. To do that, we first need the data, which we download from a website. Here is the source code provided by the author:

import os
import time
import requests
import sys

def retrieve_html():
    for year in range(2013,2019):
        for month in range(1,13):
            if(month<10):
                url='http://en.tutiempo.net/climate/0{}-{}/ws-421820.html'.format(month
                                                                          ,year)
            else:
                url='http://en.tutiempo.net/climate/{}-{}/ws-421820.html'.format(month
                                                                          ,year)
            texts=requests.get(url)
            text_utf=texts.text.encode('utf=8')
            
            if not os.path.exists("Data/Html_Data/{}".format(year)):
                os.makedirs("Data/Html_Data/{}".format(year))
            with open("Data/Html_Data/{}/{}.html".format(year,month),"wb") as output:
                output.write(text_utf)
            
        sys.stdout.flush()
        
if __name__=="__main__":
    start_time=time.time()
    retrieve_html()
    stop_time=time.time()
    print("Time taken {}".format(stop_time-start_time))

This works perfectly fine. I then tried writing the same code on my own. Here is my code:

import os
import time
import requests
import sys


def retrieve_html():
    for year in range(2013,2019):
        for month in range(1,13):
            if(month<10):
                url='http://en.tutiempo.net/climate/0{}-{}/ws-421820.html'.format(month, year)
            else:
                url='http://en.tutiempo.net/climate/{}-{}/ws-421820.html'.format(month, year)
        
        texts=requests.get(url)
        text_utf=texts.text.encode("utf=8")
        
        if not os.path.exists("Data/Html_Data/{}".format(year)):
            os.makedirs("Data/Html_Data/{}".format(year))
        
        with open("Data/Html_Data/{}/{}.html".format(year,month),"wb") as output:
            output.write(text_utf)
            
    sys.stdout.flush()
        
if __name__=="__main__":
    start_time=time.time()
    retrieve_html()
    stop_time=time.time()
    print("Time taken: {}".format(stop_time-start_time))

But whenever I run this script, only the data for the 12th month of each year is downloaded; the other months are skipped. The author's code works perfectly fine, and as far as I can tell mine is exactly the same. This is driving me crazy. Can anyone please point out where I am going wrong?

Answer 1

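It's not exactly the same: the indentation differs. In your version the request and the file write are indented under the outer for loop rather than the inner one, so they run only once per year, after the inner loop has finished and month is 12.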
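To see the effect without touching the network, here is a minimal sketch. The function name months_processed and its flag are made up for illustration, and the year range is shortened to two years; it only records which (year, month) pairs the loop body would handle under each indentation.

```python
def months_processed(body_inside_inner_loop):
    """Return the (year, month) pairs for which the loop body runs."""
    processed = []
    for year in range(2013, 2015):
        for month in range(1, 13):
            if body_inside_inner_loop:
                # Body indented under "for month": runs 12 times per year.
                processed.append((year, month))
        if not body_inside_inner_loop:
            # Body indented under "for year": runs once per year, after the
            # inner loop has ended, so month still holds its last value, 12.
            processed.append((year, month))
    return processed


print(months_processed(False))      # [(2013, 12), (2014, 12)]
print(len(months_processed(True)))  # 24
```

The first call mirrors the layout in the question, the second mirrors the author's layout.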

Answer 2

You should indent this block one more level, so that it sits inside the inner for loop:

        texts=requests.get(url)
        text_utf=texts.text.encode("utf=8")
        
        if not os.path.exists("Data/Html_Data/{}".format(year)):
            os.makedirs("Data/Html_Data/{}".format(year))
        
        with open("Data/Html_Data/{}/{}.html".format(year,month),"wb") as output:
            output.write(text_utf)

Answer 3

The code is correct apart from the indentation. The following code should be inside the inner for loop:

texts=requests.get(url)
text_utf=texts.text.encode("utf=8")

if not os.path.exists("Data/Html_Data/{}".format(year)):
    os.makedirs("Data/Html_Data/{}".format(year))

with open("Data/Html_Data/{}/{}.html".format(year,month),"wb") as output:
    output.write(text_utf)

And the following call should be in the outer for loop:

sys.stdout.flush()
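Putting the pieces together, a corrected version might look like the sketch below. It is not the author's code: the helper names build_url and fetch_with_requests, the fetch parameter, and the base_dir and years parameters are assumptions added so the loop can be tried without hitting the website. It also swaps the if/else on month for {:02d} zero-padding and uses os.makedirs(..., exist_ok=True) in place of the explicit existence check.

```python
import os


def build_url(year, month):
    # {:02d} zero-pads the month, replacing the if/else on month < 10.
    return "http://en.tutiempo.net/climate/{:02d}-{}/ws-421820.html".format(month, year)


def retrieve_html(fetch, base_dir="Data/Html_Data", years=range(2013, 2019)):
    """Save one page per (year, month) under base_dir and return the paths.

    fetch is a hypothetical callable that takes a URL and returns bytes.
    """
    saved = []
    for year in years:
        year_dir = os.path.join(base_dir, str(year))
        os.makedirs(year_dir, exist_ok=True)  # no error if it already exists
        for month in range(1, 13):
            # Everything below is inside the inner loop, so it runs per month.
            data = fetch(build_url(year, month))
            path = os.path.join(year_dir, "{}.html".format(month))
            with open(path, "wb") as output:
                output.write(data)
            saved.append(path)
    return saved


def fetch_with_requests(url):
    # Imported here so the rest of the sketch works without the package.
    import requests
    return requests.get(url).text.encode("utf-8")
```

Calling retrieve_html(fetch_with_requests) should then write twelve files per year, as the author's script does.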

solution1
1 ACCEPTED 2020-07-14 13:18:54

solution2
1

solution3
1 2020-07-14 13:19:15
