
Failed to download images via URL in Python

I found the address of the image in the source code of the web page.
The relevant markup is shown below:

<div class="fwr_page_box">
    <div class="fwr_page" id="PageContainer_0" style="width: 1200px; height: 1696px; margin-left: 815px;">
        <div id="Wrap_0" class="fwr_page_wrap border  fwr_hidden" style="width: 1200px; height: 1696px;"></div>
        <div class="loadingBg" id="loadingBg0" style="width:1200;height:1696;">
            <img alt="" src="http://162.105.134.188/store/z6MY4xILLZ4Adov3uF7aOQ11/P01_00001.jpg" id="ViewContainer_BG_0" class="border  fwr_page_bg_image">
        </div>
    </div>
</div>

I can then extract the URL ( http://162.105.134.188/store/z6MY4xILLZ4Adov3uF7aOQ11/P01_00001.jpg ) in the Chrome browser and download the image manually. The file is about 87 KB.

However, when I tried to batch-download the images with wget or Python's urllib:

import os

# Fetch P01_00000.jpg through P01_00116.jpg by shelling out to wget.
end_page = 117
for i in range(0, end_page, 1):
    os.system("wget http://162.105.134.188/store/z6MY4xILLZ4Adov3uF7aOQ11/P01_%s.jpg" % "{:05d}".format(i))

The files are downloaded, but each one is only 82 bytes and contains no image data.

The page is dynamic HTML: the images are loaded by JavaScript, and neither wget nor urllib executes JavaScript.
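Before switching tools, it may help to confirm what the tiny downloaded files actually contain. The following is a minimal sketch, not part of the original answer: `looks_like_jpeg` and `describe_payload` are hypothetical helper names, and the check relies only on the fact that JPEG files begin with the bytes FF D8 FF.

```python
def looks_like_jpeg(data: bytes) -> bool:
    """Return True if the payload starts with the JPEG signature FF D8 FF."""
    return data[:3] == b"\xff\xd8\xff"


def describe_payload(data: bytes, preview: int = 82) -> str:
    """Summarize a downloaded payload; show the leading bytes if it is not a JPEG."""
    if looks_like_jpeg(data):
        return "JPEG image, %d bytes" % len(data)
    # A short non-image body is often a text message from the server.
    return "not a JPEG (%d bytes): %r" % (len(data), data[:preview])
```

For example, `describe_payload(open("P01_00001.jpg", "rb").read())` would print the raw body of one of the small files (the file name here is assumed to be whatever wget saved), which may show what the server sent instead of the image.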

Use Selenium to drive a real Chrome browser, and extract the content from there.
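A rough sketch of that approach follows. It assumes Selenium 4 with a working Chrome driver is installed, and that `viewer_url` (not given in the question) is the address of the page that hosts the viewer. The CSS class `fwr_page_bg_image` comes from the markup in the question; `png_name_for` and `save_page_images` are hypothetical helper names. Note that this saves a screenshot of each rendered image as PNG rather than the original JPEG bytes, and an image that is not visible on the page may fail to capture.

```python
import os


def png_name_for(src: str) -> str:
    """Map an image URL such as .../P01_00001.jpg to the file name P01_00001.png."""
    base = src.rsplit("/", 1)[-1]
    return os.path.splitext(base)[0] + ".png"


def save_page_images(viewer_url: str, out_dir: str = "pages") -> list:
    """Open the viewer in Chrome and save each rendered page image as a PNG."""
    # Imported here so the helper above can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    os.makedirs(out_dir, exist_ok=True)
    saved = []
    driver = webdriver.Chrome()
    try:
        driver.implicitly_wait(10)  # give the page's JavaScript time to add the images
        driver.get(viewer_url)
        for img in driver.find_elements(By.CSS_SELECTOR, "img.fwr_page_bg_image"):
            src = img.get_attribute("src")
            if not src:
                continue
            path = os.path.join(out_dir, png_name_for(src))
            img.screenshot(path)  # captures the element as the browser rendered it
            saved.append(path)
    finally:
        driver.quit()
    return saved
```

Calling `save_page_images(viewer_url)` with the real viewer address would return the list of files written. If the viewer only loads pages as they scroll into view, some scrolling or paging logic would also be needed; that is not covered here.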


 