import asyncio
import logging

import pyppeteer
from pyppeteer import launch

pyppeteer.DEBUG = True
for name in logging.root.manager.loggerDict:
    logging.getLogger(name).disabled = True


async def main():
    browser = await launch(headless=False)
    page = await browser.newPage()
    await page.setJavaScriptEnabled(True)
    response = await page.goto(
        'http://www.africau.edu/images/default/sample.pdf',
        time=3000, waitUntil=['domcontentloaded', 'load', 'networkidle0'])
    content = await response.buffer()
    print(content)
    await browser.close()


asyncio.get_event_loop().run_until_complete(main())
Expected output: the content of http://www.africau.edu/images/default/sample.pdf
Actual output: b'df48fcc4-a0b0-4e86-b52e-0ec012ee791e'
Environment: Python 3, Ubuntu Linux
I'd suggest using pyppdf. It is built on top of pyppeteer, the Python port of Puppeteer.
conda install -c defaults -c conda-forge pyppdf
OR
pip install pyppdf
It has a handy function, save_pdf:
def save_pdf(output_file: str=None, url: str=None, html: str=None, args_dict: Union[str, dict]=None, args_upd: Union[str, dict]=None, goto: str=None, dir_: str=None) -> bytes:
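Or you could simply render the page from pyppeteer itself. Note that page.pdf only works when the browser is launched in headless mode, and that both calls save a rendering of the page rather than the original file:
    await page.screenshot({'path': 'ss.png'})
    await page.pdf({'path': 'sample.pdf'})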
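For context, here is a minimal sketch of how the page.pdf call might sit inside a complete script. It assumes a headless launch and an ordinary HTML page; the helper name render_page_to_pdf, the https://example.com URL, and the example.pdf output path are placeholders for illustration, not part of the question. This prints the rendered page to a new PDF and does not retrieve an existing PDF file from the server.

```python
import asyncio


async def render_page_to_pdf(url: str, output_path: str) -> bytes:
    """Print the rendered page at `url` to a PDF file and return its bytes."""
    # Imported lazily so this module can be loaded without pyppeteer installed.
    from pyppeteer import launch

    # Chromium only supports PDF printing in headless mode.
    browser = await launch(headless=True)
    try:
        page = await browser.newPage()
        await page.goto(url, waitUntil='networkidle0')
        # page.pdf writes the file to 'path' and also returns the PDF bytes.
        return await page.pdf({'path': output_path, 'format': 'A4'})
    finally:
        # Always shut the browser down, even if navigation fails.
        await browser.close()


if __name__ == '__main__':
    # Placeholder URL and file name; replace with your own.
    asyncio.run(render_page_to_pdf('https://example.com', 'example.pdf'))
```

The try/finally is there so that a failed navigation does not leave a Chromium process running.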
I'm aware that you are asking for a solution using pyppeteer, but honestly this can be done far more easily with requests.
import requests


def main():
    r = requests.get("http://www.africau.edu/images/default/sample.pdf")
    with open("sample.pdf", "wb") as file:
        file.write(r.content)


if __name__ == "__main__":
    main()
That's all. The PDF will be saved in a file called sample.pdf.
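A slightly more defensive variant of the requests approach might look like the sketch below. It checks the HTTP status and confirms that the payload starts with the "%PDF-" signature that PDF files normally begin with, so an error page or a stray token (such as the b'df48fcc4-...' value from the question) is not silently saved as sample.pdf. The helper names looks_like_pdf and download_pdf and the 30-second timeout are assumptions made for illustration.

```python
PDF_MAGIC = b'%PDF-'


def looks_like_pdf(data: bytes) -> bool:
    """Return True if the payload starts with the PDF file signature."""
    return data.startswith(PDF_MAGIC)


def download_pdf(url: str, path: str, timeout: float = 30.0) -> int:
    """Download `url` to `path` and return the number of bytes written."""
    # Imported lazily so looks_like_pdf can be used without requests installed.
    import requests

    response = requests.get(url, timeout=timeout)
    # Raise on 4xx/5xx responses instead of saving an error page.
    response.raise_for_status()
    if not looks_like_pdf(response.content):
        raise ValueError(f'{url} did not return a PDF')
    with open(path, 'wb') as file:
        return file.write(response.content)


if __name__ == '__main__':
    download_pdf('http://www.africau.edu/images/default/sample.pdf', 'sample.pdf')
```

The signature check is only a sanity test; it does not validate that the rest of the file is a well-formed PDF.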
Many people have already answered, and I tried it myself as well.
To reply exactly to the question asked, "Is it possible to get pdf page using pyppeteer?", the answer is no.
You can try launching with headless = True, but it still does not work.
You can save an opened page as a PDF, but you can't fetch the requested PDF with pyppeteer directly and access its content from the response.