Why does GMail break inline (CSS) styles for an email with minified markup/HTML using quoted-printable encoding?

Question

I had an email that was getting clipped by GMail. It was around 120 KB in size, and according to http://www.adestra.com/avoid-gmail-clipping-emails/ GMail starts clipping messages at 102 KB.

(Image: GMail clipping the email)

To reduce the email size, I tried both django-htmlmin and plain ol' re.sub(r'\n\s*(\S)', r'\1', email_html_content) (modifying markup with regex is a discussion for another day). Both techniques reduced the email size by more than 30%, but both broke the rendering in GMail. Inspecting the broken design with Dev Tools shows that some elements get no inline styles at all, apparently at random.

(Image: an element in GMail without any inline styles)

However, when I clicked on 'Show Original' to view the raw email, I could see the inline styles for that element.

(Image: inline styles visible in the raw email)

When checking the raw email, I saw that it is encoded in the quoted-printable format. This means that even though my entire email is just one line when minified, soft line breaks (a = at the end of a line in quoted-printable) are inserted automatically, as is visible in the picture above. Some of these soft line breaks appear in the middle of attribute values, but the email client seems to ignore them, and I don't think they are the cause of the broken rendering. Even my original email with unminified markup had such line breaks and rendered fine; according to my reading, it is the email spec that suggests a maximum line length of 78(?) characters.
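To illustrate the soft line breaks described above, here is a minimal sketch using Python's standard quopri module. The sample markup is made up for the illustration, and the mail library that actually produced the email may use a different encoder, so treat this as a demonstration of the encoding rather than of that exact pipeline.

```python
import quopri

# Hypothetical sample: one long minified line with inline styles, no line breaks.
html = b'<td style="color: red; padding: 0">cell</td>' * 10

# Encode to quoted-printable, then decode again.
encoded = quopri.encodestring(html)
decoded = quopri.decodestring(encoded)

# Length of the longest encoded line, including the trailing "=" soft break.
longest = max(len(line) for line in encoded.splitlines())

print(longest)          # length of the longest encoded line
print(decoded == html)  # soft breaks vanish again when decoding
print(encoded.decode('ascii'))
```

The printed output shows lines ending in a bare =, some of them in the middle of a style attribute, and literal = characters written as =3D. Decoding restores the original single line, which fits the observation that the soft breaks alone do not break rendering.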

Another pattern I saw in the raw email is that after a chunk of lines (each ending in a =), there is a paragraph delimited by a =0D character. The paragraphs are not all the same size, and I cannot find any source explaining why these characters are inserted into a one-line, minified email. The pattern can be seen in the image below:

(Image: random paragraphs delimited by the =0D character)

This character also appears in the middle of attribute values for certain tags, and I think this might be why the rendering breaks. I got the rendering working again by using re.sub(r'\n\s*(\S)', r'\n\1', email_body) instead of re.sub(r'\n\s*(\S)', r'\1', email_body), i.e. putting each tag (opening or closing) on a separate line instead of mashing everything into one line. This increased the size of the email but stopped the =0D characters from appearing mid-attribute. They now sit at the end of each line, and the email renders fine.

(Image: each tag on a separate line)

So, my question is, how do I minify my email HTML and still produce an unbroken rendering within email clients? What is causing the broken rendering and how may I go about fixing it?

Answer 1

I stumbled upon exactly the same problem, although I wasn't using any custom regex patterns for minifying html.

It turns out that some HTML parsers (including GMail's) limit the line length of the HTML they receive.

This article explains nicely what's going on. Just as you noticed, the HTML parsers have their own ways of splitting long lines and inserting line breaks, and these ways fail.

So the solution is to minify your HTML emails in a way that keeps every line no longer than a specific length.

I've successfully used this html-minifier (actually I used gulp-htmlmin which uses the "html-minifier") passing { maxLineLength: 996 } as an option.

The emails are no longer broken :)
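Since the question uses Python rather than gulp, the same idea can be sketched there. This is not how html-minifier implements maxLineLength; wrap_minified is a hypothetical helper that only breaks lines between adjacent tags (at "><"), and the 996 default simply mirrors the value used above.

```python
import re


def wrap_minified(html: str, max_len: int = 996) -> str:
    """Insert newlines between adjacent tags so lines stay within max_len."""
    # Zero-width split at every "><" boundary (needs Python 3.7+).
    pieces = re.split(r'(?<=>)(?=<)', html)
    lines, current = [], ''
    for piece in pieces:
        # Start a new line when adding this piece would exceed the limit.
        if current and len(current) + len(piece) > max_len:
            lines.append(current)
            current = piece
        else:
            current += piece
    if current:
        lines.append(current)
    return '\n'.join(lines)


# Hypothetical minified markup, 3900 characters on a single line.
minified = '<tr><td style="color: red">Hi</td></tr>' * 100
wrapped = wrap_minified(minified)
print(max(len(line) for line in wrapped.split('\n')))
```

A single piece longer than max_len (for example a very long run of text with no tags in it) is left as it is, so this sketch does not guarantee the limit in every case. Note also that a newline between two inline elements can render as a space.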

Answer 2

There are many things that don't work in one email client or another, and unfortunately this is much more common than in the browser world. Features go unsupported either as a security precaution or because email client developers were too lazy to implement them.

I recommend reading https://www.campaignmonitor.com/css/ as a starting point to ensure that your email is going to be rendered correctly.

Answer 3

I think I see what is causing your problems.

Your file has carriage return + line feed (CRLF) line endings.

By doing re.sub(r'\n\s*(\S)', r'\1', email_html_content) you remove the line feeds but leave the carriage returns intact, which in turn get encoded in quoted-printable as =0D. These characters are causing your problems. This is also why re.sub(r'\n\s*(\S)', r'\n\1', email_body) works: lone carriage returns cause the problems, while carriage return + line feed pairs work fine.

If I understand your intention correctly, which is to remove unnecessary white space, you should modify the code to strip the carriage returns as well:

re.sub(r'\r\n\s*(\S)', r'\1', email_html_content)

This should reduce the size and will not cause problems with interpreting the CSS.

With that said, wouldn't it be better to improve the regex to something like this:

re.sub(r'\s*\r\n\s*', r'', email_html_content)

It works like this: find any amount of white space, a carriage return + line feed pair, and any amount of white space, and remove all of it. The earlier version instead finds a carriage return + line feed pair, any amount of white space, and one non-white-space character, and replaces the match with that character.
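To make the difference concrete, here is a small sketch comparing the pattern from the question with the last pattern above. The sample markup is made up and assumes CRLF line endings with line breaks only between tags.

```python
import re

# Hypothetical sample with CRLF line endings, one tag per line.
html = '<table>\r\n  <tr>\r\n    <td style="color: red">Hi</td>\r\n  </tr>\r\n</table>'

# Pattern from the question: the \n is consumed, the \r before it is not.
naive = re.sub(r'\n\s*(\S)', r'\1', html)

# Pattern suggested above: the CRLF pair and the white space around it all go.
fixed = re.sub(r'\s*\r\n\s*', '', html)

print(repr(naive))  # lone '\r' characters remain between the tags
print(repr(fixed))  # neither '\r' nor '\n' remains
```

One caveat: both patterns join lines with nothing in between, so if a line break falls inside a tag (between the tag name and an attribute) or between two words of text, the two sides get glued together. The sketch only covers markup where the breaks sit between tags.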

3 answers

solution1
1 2015-04-09 13:52:06

solution2
0 2014-11-25 21:39:10

solution3
0 2014-11-26 12:05:29
