
Python urlretrieve downloading corrupted images

I am downloading a list of images (all .jpg) from the web using this python script:

__author__ = 'alessio'

import urllib.request

fname = "inputs/skyscraper_light.txt"

with open(fname) as f:
    content = f.readlines()


for link in content:
    try:
        link_fname = link.split('/')[-1]
        urllib.request.urlretrieve(link, "outputs_new/" + link_fname)
        print("saved without errors " + link_fname)
    except:
        pass

In OS X Preview the images display just fine, but I can't open them with any image editor (Photoshop, for example, says "Could not complete your request because Photoshop does not recognize this type of file."), and when I try to attach them to a Word document, they don't even appear as picture files in the image-browsing dialog.

What am I doing wrong?

As J.F. Sebastian suggested in the comments, the issue was the trailing newline left in each filename: readlines() keeps the "\n" at the end of every line, so the saved files end up with a newline embedded in their names.

To make my script work, you need to replace

link_fname = link.split('/')[-1]

with

link_fname = link.strip().split('/')[-1]
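Putting it together, a corrected version of the script might look like the sketch below (the input file and output directory names are taken from the question; wrapping the logic in a function and catching OSError instead of a bare except are my own additions):

```python
import os
import urllib.request

def download_all(list_file, out_dir="outputs_new"):
    """Download every URL listed (one per line) in list_file into out_dir."""
    with open(list_file) as f:
        # strip() removes the trailing "\n" that line iteration keeps;
        # without it, the newline ends up inside the saved filename
        links = [line.strip() for line in f if line.strip()]

    os.makedirs(out_dir, exist_ok=True)

    for link in links:
        link_fname = link.split('/')[-1]
        try:
            urllib.request.urlretrieve(link, os.path.join(out_dir, link_fname))
            print("saved without errors", link_fname)
        except OSError as e:
            # report failures instead of silently swallowing them
            print("failed:", link, e)
```

Reporting the exception instead of `except: pass` also makes problems like this much easier to spot: a filename containing a stray "\n" shows up immediately in the printed output.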
