
Inline images in Gmail with Google Apps Script

I'm trying to take inline images from Gmail messages and display them using the HTML service in a web app. I'm using regex to grab the img tags from the raw content (which contains the base64-encoded images) and using that content to render the images. However, whenever the email has 4 or more image tags, the string "3D" is added after every "=" and the regex match returns null.

Example of an img tag from an email with 3 images:

<img src="cid:ii_142faccc53cb2211" alt="Inline image 3" width="564" height="510">

Example of an img tag from an email with 4 images:

<img src=3D"cid:ii_142face6aa5d8d86"= alt=3D"Inline image 2" width=3D"564" height=3D"317">

I have tried a few different regex patterns including:

<img(?:(?:.|\\n)*?)\\/?> and <img.*?>(.*?<\\/img>)? which both work for any email with 3 images but not for 4 or more images.
What is causing the "3D" to be added and how can I work around this problem? 3D is the hexadecimal ASCII code for "=", which I think may have something to do with it.
Thanks

EDIT: I think the issue causing the regex to fail is related to the encoding of the string. When I get the raw content of an email with 3 or fewer images, it has the following line of text above the HTML content:
Content-Type: text/html; charset=ISO-8859-1
As soon as there is a 4th image in the email, this appears instead:
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Does anyone have any experience of this and know how to get around it?

Try this. It's a hacky approach, but it may work. If you are getting the entire source as a string, try using JavaScript's split() on it with =3D as the separator. This breaks the string into an array of substrings at every =3D. Then use join() to put the pieces back together into one string, using just = as the separator.

// Split on the quoted-printable escape for "=" and rejoin with a plain "=".
var arr = string.split("=3D");
var newstring = arr.join("=");

I usually love regex, but I've been using this method lately to strip out repeating elements in long strings and have found it to be very efficient. The drawback is that it will also rewrite any =3D that appears elsewhere in the string, outside of your use case.
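
The split/join trick only handles the =3D escape. A slightly more general sketch, assuming the HTML part really is quoted-printable (as the Content-Transfer-Encoding header in the question suggests) and uses a single-byte charset such as ISO-8859-1, would also drop the soft line breaks (a trailing "=" at the end of a line, like the one after the src attribute in the question) and decode every =XX hex escape. decodeQuotedPrintable is a hypothetical helper name, not something built into Apps Script:

```javascript
// Hypothetical helper: decodes a quoted-printable string.
// Assumes a single-byte charset (e.g. ISO-8859-1), so each =XX is one character.
function decodeQuotedPrintable(input) {
  return input
    // Soft line breaks: "=" directly before a line ending means "line continues".
    .replace(/=\r?\n/g, "")
    // Hex escapes: "=3D" becomes "=", "=20" becomes " ", and so on.
    .replace(/=([0-9A-Fa-f]{2})/g, function (match, hex) {
      return String.fromCharCode(parseInt(hex, 16));
    });
}

// Sample modelled on the img tag from the question, with a soft line break added.
var raw = '<img src=3D"cid:ii_142face6aa5d8d86"=\r\n alt=3D"Inline image 2" width=3D"564" height=3D"317">';
var decoded = decodeQuotedPrintable(raw);
```

Removing the soft line breaks first matters, because otherwise the stray "=" left at the end of a line ends up inside the tag and can trip up later matching. For multi-byte charsets such as UTF-8 the decoded bytes would still need to be converted to text, which this sketch does not attempt.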

I managed to resolve this one by manually removing the unnecessary '=' signs using regex and then treating the rawContent as though it had never been encoded. It was a bit of a hack, and I'm still not sure why a 4th inline image causes the message to be encoded differently.
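
The exact regex used is not shown, but a cleanup along those lines might look something like the sketch below. It strips the trailing "=" soft line breaks and the =3D escapes, then matches the img tags on the cleaned string. extractImgTags is a hypothetical name, rawContent stands in for the raw message string, and other =XX escapes are deliberately left alone:

```javascript
// Hypothetical helper: undo the two quoted-printable artifacts seen in the
// question, then collect every <img ...> tag from the result.
function extractImgTags(rawContent) {
  var cleaned = rawContent
    .replace(/=\r?\n/g, "")   // trailing "=" soft line breaks
    .replace(/=3D/g, "=");    // encoded "=" signs
  // [^>]* also matches newlines, so tags wrapped across lines are still found.
  // match() returns null when nothing is found, hence the fallback to [].
  return cleaned.match(/<img\b[^>]*>/gi) || [];
}

var sample = 'text <img src=3D"cid:a"=\n alt=3D"one"> more <img src=3D"cid:b"> end';
var tags = extractImgTags(sample);
```

Returning an empty array instead of null keeps the calling code from failing on messages that contain no images.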
