
Python-docx - insert picture into docx from URL

I am trying to grab an image hosted on a website (e.g. imgur) and add it to a docx.

This is my initial code (it is part of a function; I've stripped it down to the relevant parts):

from PIL import Image
from urllib.request import urlopen
from docx.shared import Cm

thisParagraph = document.sections[0].paragraphs[0]
run = thisParagraph.add_run()

# imgLink is a direct link to the image. Something like https://i.imgur.com/<name>.jpg
# online is a passed-in boolean that says whether the image link is from an image hosting site
# or from the local machine
if online:
    imgLinkData = urlopen(imgLink)
    img = Image.open(imgLinkData)
    width, height = img.size
else:
    img = Image.open(imgLink)
    width, height = img.size
    imgLinkData = imgLink

if (width > 250) or (height > 250):
    if height > width:
        run.add_picture(imgLinkData, width=Cm(3), height=Cm(4))
    else:
        run.add_picture(imgLinkData, width=Cm(4), height=Cm(3))
else:
    run.add_picture(imgLinkData)

For the most part, this works if imgLink points to a file on my local system (i.e. the image is stored on my PC).

But if I refer to a URL (online=True), I get various exceptions (depending on what I tried in order to fix it), ranging from io.UnsupportedOperation (seek) to TypeError (string argument expected, got 'bytes'). The cause is always the run.add_picture line.

The code, as it is now, throws the io.UnsupportedOperation exception.

Save the image to a file and then use the file path as the first argument to .add_picture() . This would be something roughly like:

img.save("my-image.jpg")
run.add_picture("my-image.jpg", width=Cm(3), height=Cm(4))

As an alternative, you could create an "in-memory" file ( io.BytesIO ) containing the image and use that. This second approach has the advantage of not requiring access to a filesystem.

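import io
# io.BytesIO needs the raw bytes, not the response object, so read them from a fresh response
image_stream = io.BytesIO(urlopen(imgLink).read())
run.add_picture(image_stream, width=Cm(3), height=Cm(4))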

The interface to Document.add_picture() expects a str path or a file-like object (open file or in-memory file) as its first argument: https://python-docx.readthedocs.io/en/latest/api/document.html#docx.document.Document.add_picture
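To illustrate why the in-memory file helps, here is a minimal sketch using only the standard library. NonSeekableStream and to_seekable_stream are hypothetical names invented for this example; the class merely stands in for an HTTP response body (readable, but not rewindable), and the payload is fake data rather than a real image. The io.UnsupportedOperation (seek) error in the question suggests that add_picture() needs to rewind whatever stream it is given, which a raw response cannot do.

```python
import io


class NonSeekableStream(io.RawIOBase):
    """Hypothetical stand-in for an HTTP response body: readable, not seekable."""

    def __init__(self, payload: bytes):
        self._buffer = io.BytesIO(payload)

    def readable(self) -> bool:
        return True

    def seekable(self) -> bool:
        return False

    def readinto(self, b) -> int:
        # Copy as many bytes as fit into the caller's buffer.
        data = self._buffer.read(len(b))
        b[: len(data)] = data
        return len(data)


def to_seekable_stream(source) -> io.BytesIO:
    """Read everything from `source` and return it as a rewindable in-memory file."""
    return io.BytesIO(source.read())


# Rewinding the raw stream fails, much like the error in the question.
try:
    NonSeekableStream(b"fake").seek(0)
except io.UnsupportedOperation as exc:
    print("raw stream cannot seek:", exc)

# Wrapping the bytes in io.BytesIO gives a stream that can be read and rewound.
fake_response = NonSeekableStream(b"\x89PNG fake image bytes")
image_stream = to_seekable_stream(fake_response)
image_stream.read()   # a first consumer (for example PIL) reads the data
image_stream.seek(0)  # rewinding works, so a second consumer can read it again
```

With a real response, the same image_stream could be handed first to Image.open() and then to run.add_picture(), since each consumer can start again from the beginning of the buffer.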

I think I may have solved the issue.

Based on this link, I made some slight modifications to my code.

I added:

import requests, io

Then I changed:

imgLinkData = urlopen(imgLink)

to

imgLinkData = io.BytesIO(requests.get(imgLink).content)

And this seems to have successfully generated the image in my docx document, though I'm not exactly sure why, aside from the fact that the urlopen returned

<class 'http.client.HTTPResponse'>

and the requests.get returned

<class 'requests.models.Response'>

and .content returned a

<class 'bytes'>

object.
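A likely explanation is that io.BytesIO needs the raw bytes of the image, and .content supplies exactly that, whereas the object returned by urlopen is a response stream that still has to be read. Under that assumption, urllib can be made to behave the same way by calling .read() on the response. The sketch below uses a hypothetical helper name, fetch_image_stream, and a data: URL in place of a real image link so that it runs without network access.

```python
import io
from urllib.request import urlopen


def fetch_image_stream(url: str) -> io.BytesIO:
    """Download `url` and return its body as a seekable in-memory file.

    Hypothetical helper: response.read() plays the role that .content
    plays in the requests version.
    """
    with urlopen(url) as response:
        return io.BytesIO(response.read())


# A data: URL stands in for something like https://i.imgur.com/<name>.jpg
demo_url = "data:application/octet-stream;base64,aGVsbG8="
imgLinkData = fetch_image_stream(demo_url)
print(type(imgLinkData))
```

The returned object could then be passed to Image.open() and to run.add_picture() in the same way as the io.BytesIO built from requests.get(imgLink).content.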

Further reading even seems to advise against using urllib.
