Downloading links with Python urllib2

Original post 2014-11-18 19:52:08 | Tags: python, html, download, mp3, urllib

I want to download an MP3 from a page, but what I get is just the HTML, not the MP3 itself. The code I'm using is from this answer: https://stackoverflow.com/a/16518224/2137668

Why can't I get the MP3? Here's an example URL for testing that gets downloaded as HTML: http://www5.zippyshare.com/d/77609120/61098/Cleavage%20-%20Prove%20%28Original%20Mix%29%20%5bquality-dance-music.com%5d.mp3

1 answer

When I try to open that URL in a web browser, or with wget, I get a 302 redirection to http://www5.zippyshare.com/v/77609120/file.html, which is of course an HTML page.
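To see this from Python rather than from wget, here is a minimal sketch that compares the URL you asked for with the URL you ended up at, and looks at the Content-Type header. It assumes Python 3's urllib.request (the question uses Python 2's urllib2, whose urlopen also offers geturl()); the function names describe_response and check_download are made up for illustration.

```python
import urllib.request


def describe_response(requested_url, final_url, content_type):
    """Summarise whether we were redirected and whether we got HTML."""
    # Content-Type may carry parameters, e.g. "text/html; charset=UTF-8".
    media_type = content_type.split(";")[0].strip().lower()
    return {
        "redirected": final_url != requested_url,
        "is_html": media_type == "text/html",
    }


def check_download(url):
    """Open the URL (following redirects) and report what came back."""
    with urllib.request.urlopen(url) as resp:
        return describe_response(
            url,
            resp.geturl(),  # final URL after any redirects
            resp.headers.get("Content-Type", ""),
        )
```

If check_download reports redirected and is_html as both true for a link that ends in .mp3, you are being sent to a container page, not the file.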

Many websites redirect you to such "container pages" (or just return them directly) when you browse to things like images, songs, and videos. This may be to improve your user experience, to make it harder for other sites to "deep-link" their content, or to make it harder for you to "steal" their content.

If it's one of the first two, the answer is often trivial: add a Referer header that points to the download page you got the link from (or, sometimes, to anything on the same site, even the same URL you're downloading).
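A minimal sketch of adding a Referer header, assuming Python 3's urllib.request (in Python 2, urllib2.Request accepts a headers argument in the same way). The helper names build_request and download are hypothetical, and there is no guarantee that this header alone is enough for any particular site.

```python
import urllib.request


def build_request(file_url, referer):
    """Build a request that claims to come from the given page."""
    return urllib.request.Request(file_url, headers={"Referer": referer})


def download(file_url, referer, dest_path, chunk_size=64 * 1024):
    """Fetch file_url with a Referer header and write the body to dest_path."""
    req = build_request(file_url, referer)
    with urllib.request.urlopen(req) as resp, open(dest_path, "wb") as out:
        # Copy in chunks so a large file is never held in memory at once.
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
```

You would call download(file_url, page_url, "song.mp3"), where page_url is the page the link appeared on. It is still worth checking the Content-Type of what comes back, because a site that ignores the header will hand you HTML again.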

If it's the third, they will usually put on a lot more protection than that. For example, they may require a cookie that you only get by sitting on the download page and waiting out a 30-second timer, and that is only valid for 30 minutes.
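For the cookie part alone, a sketch of an opener that remembers cookies between requests, assuming Python 3 (the Python 2 equivalents are cookielib and urllib2). The name make_cookie_opener is made up for illustration. This only carries cookies that the server sets in its HTTP responses; it does not run JavaScript, so it will not get you past a timer or a cookie set by a script.

```python
import http.cookiejar
import urllib.request


def make_cookie_opener():
    """Return an opener that stores and resends cookies, plus its jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar
```

The idea would be to call opener.open() on the download page first, so any cookies it sets land in the jar, and then call opener.open() on the file URL with the same opener.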

If you understand HTTP and JavaScript well enough, and don't care about violating their terms of service, you can usually reverse-engineer each of their protections and write yourself a download script. It will work until they change things up next month, so that's usually not worth doing.

Anyway, given that this site is named zippyshare, I'm guessing it's the third case. These kinds of sites make their money by showing you ads every time you download a file, and by prompting you to pay a monthly fee for direct/accelerated/whatever downloads, so they will put all kinds of hurdles in the way of downloading files directly without seeing those ads or paying that fee.

Answer 1 (accepted), 2014-11-18 20:01:20