How do I download PDF files using Python's requests/httpx module?

I'm making a program that downloads PDFs from the internet.

Here's an example of the code:

import httpx # <-- This also happens with the requests module


URL = "http://62.182.86.140/main/0/aee7239ffcf7871e1d6687ced1215e22/Markus%20Nix%20-%20Exploring%20Python-Entwickler%20%282005%29.djvu"
content = httpx.get(URL, timeout=20.0).content.decode("ascii")

with open(f"./example.pdf", "w") as f:
    f.write(str(content))

But when I write the response to a file, none of my PDF viewers (I tried Okular and Zathura) can open it.

When I download the same file with a program like wget, there are no problems.

When I compare the two files (one downloaded with Python, the other with wget), the Python one looks encoded throughout, and I can't figure out how to decode it (.decode() doesn't work).
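A likely cause of the "encoded" output can be reproduced without any network access. The sketch below uses a made-up byte string (`data`, an assumption standing in for a real download) to show that `str()` on bytes yields the Python repr rather than the payload, and that decoding arbitrary binary data as ASCII fails:

```python
# Hypothetical stand-in for a downloaded body: a PDF-style header
# followed by a few non-ASCII bytes, as real binary files contain.
data = b"%PDF-1.4\n\x00\xff\xfe"

# str() on a bytes object returns its repr (b'...'), with escapes such
# as \xff spelled out as text. Writing that to a file corrupts it.
as_text = str(data)

# Binary content is not valid ASCII in general, so decoding raises.
try:
    data.decode("ascii")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

print(as_text)
print(decoded_ok)
```

In both cases the bytes that reach the file differ from the bytes the server sent, which is consistent with a viewer rejecting the result.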

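The response body is binary data, so don't decode it: write r.content (bytes) to a file opened in binary mode ('wb'). Note that this URL points to a DjVu file rather than a PDF, so it is saved with a .djvu extension.

import httpx


def main(url):
    r = httpx.get(url, timeout=20)
    r.raise_for_status()  # fail on an HTTP error instead of saving an error page
    # 'wb' writes the raw bytes unchanged; r.content is bytes, not str
    with open('file.djvu', 'wb') as f:
        f.write(r.content)


main('http://62.182.86.140/main/0/aee7239ffcf7871e1d6687ced1215e22/Markus%20Nix%20-%20Exploring%20Python-Entwickler%20%282005%29.djvu')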
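For large files it may be preferable to stream the body to disk instead of holding it all in memory, and to check what kind of file actually arrived. The following is a minimal sketch under stated assumptions: `sniff_format`, `save_chunks`, and `download` are hypothetical helper names, and the signatures checked (`%PDF-` for PDF, `AT&TFORM` for DjVu) are the commonly documented leading bytes of those formats.

```python
from typing import Iterable


def sniff_format(head: bytes) -> str:
    """Guess the file type from its leading bytes (assumed signatures)."""
    if head.startswith(b"%PDF-"):
        return "pdf"
    if head.startswith(b"AT&TFORM"):
        return "djvu"
    return "unknown"


def save_chunks(chunks: Iterable[bytes], path: str) -> str:
    """Write byte chunks to path in binary mode; return the sniffed type."""
    head = b""
    with open(path, "wb") as f:
        for chunk in chunks:
            # Collect up to the first 8 bytes for the signature check.
            if len(head) < 8:
                head += chunk[: 8 - len(head)]
            f.write(chunk)
    return sniff_format(head)


def download(url: str, path: str) -> str:
    """Stream url to path with httpx and return the sniffed file type."""
    # Third-party dependency; imported here so the helpers above
    # can be used with only the standard library.
    import httpx

    with httpx.stream("GET", url, timeout=20.0) as r:
        r.raise_for_status()
        return save_chunks(r.iter_bytes(), path)
```

Calling `download(url, 'file.djvu')` with the URL above would then return a string that can be compared with the extension chosen for the file, which helps catch cases like a DjVu document being saved as `.pdf`.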
