
HTML compression

Most web pages are filled with significant amounts of whitespace and other useless characters, which wastes bandwidth for both the client and the server. This is especially true of large pages containing complex table structures and CSS styles defined at the level. Preprocessing all your HTML files before publishing seems like good practice, as it saves a lot of bandwidth, and where I live, bandwidth ain't cheap.

It goes without saying that the optimisation should not affect the appearance of the page in any way (according to the HTML standard), or break any embedded JavaScript, backend ASP code, etc.

Some of the functions I'd like to perform are:

  • Remove all whitespace and carriage returns. The parser needs to be smart enough not to strip whitespace from inside string literals. Removing the space between HTML elements or attributes is mostly safe, but IIRC browsers render a single space between div or span tags, so those shouldn't be stripped.
  • Remove all comments from HTML and client-side scripts.
  • Remove redundant attribute values, e.g. <option selected="selected"> can be replaced with <option selected>.
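The three steps in this list could be sketched roughly as follows with Python's standard `html.parser` module. This is a minimal sketch under several assumptions, not a production minifier: the `Minifier` class, the `minify` helper, and the `PRESERVE` and `BOOLEAN` sets are names made up for illustration, the boolean-attribute list is deliberately partial, whitespace runs are collapsed to one space rather than deleted (so a rendered space between inline tags survives), and script contents are passed through untouched instead of being minified.

```python
import re
from html import escape
from html.parser import HTMLParser

# Elements whose text must be kept exactly as written.
PRESERVE = {"pre", "textarea", "script", "style"}
# Partial list of boolean attributes, where name="name" means the same as name.
BOOLEAN = {"selected", "checked", "disabled", "readonly", "multiple"}


class Minifier(HTMLParser):
    def __init__(self):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.preserve_depth = 0

    def _emit_start(self, tag, attrs):
        parts = [tag]
        for name, value in attrs:
            if value is None or (name in BOOLEAN and value.lower() == name):
                parts.append(name)  # selected="selected" -> selected
            else:
                parts.append(f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join(parts) + ">")

    def handle_starttag(self, tag, attrs):
        self._emit_start(tag, attrs)
        if tag in PRESERVE:
            self.preserve_depth += 1

    def handle_startendtag(self, tag, attrs):
        self._emit_start(tag, attrs)  # <br/> is written back as <br>

    def handle_endtag(self, tag):
        if tag in PRESERVE and self.preserve_depth:
            self.preserve_depth -= 1
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.preserve_depth:
            # Collapse each whitespace run to one space; never delete it outright.
            data = re.sub(r"\s+", " ", data)
            if data.startswith(" ") and self.out and self.out[-1].endswith(" "):
                data = data[1:]  # avoid a double space where a comment was dropped
        if data:
            self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

    # handle_comment is inherited as a no-op, so HTML comments are dropped.


def minify(html):
    parser = Minifier()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


print(minify('<select>\n  <!-- pick one -->\n  <option selected="selected">A</option>\n</select>'))
```

Dropping every comment also drops IE conditional comments, and rewriting attributes normalises their quoting, so a real tool would need options for both.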

As if this wasn't enough, I'd like to take it even further and compress the CSS styles too. Pages with large tables often contain huge amounts of code like the following: <td style="TdInnerStyleBlaBlaBla">. The page would be smaller if the style label was short, e.g. <td style="x">. To this end, it would be great to have a tool that could rename all your styles to identifiers made up of the fewest characters possible. If there are too many styles to represent with the set of allowable single-character identifiers, then it would be necessary to move to longer identifiers, reserving the shorter identifiers for the styles that are used the most.
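The renaming idea can be sketched as a frequency count followed by handing out identifiers from shortest to longest. This is only an illustration under assumptions: it assumes the style names live in double-quoted `class` attributes, it uses a plain regular expression rather than a real parser, and `short_names`, `build_rename_map`, and `apply_rename` are hypothetical helper names. The same mapping would also have to be applied to the stylesheet and to any script that refers to the names.

```python
import re
from collections import Counter
from itertools import count, product
from string import ascii_lowercase

CLASS_ATTR = re.compile(r'class="([^"]*)"')


def short_names():
    """Yield a, b, ..., z, aa, ab, ... in order of increasing length."""
    for length in count(1):
        for letters in product(ascii_lowercase, repeat=length):
            yield "".join(letters)


def build_rename_map(html):
    """Map each class name to a short identifier, most-used names first."""
    usage = Counter()
    for value in CLASS_ATTR.findall(html):
        usage.update(value.split())
    names = short_names()
    return {old: next(names) for old, _ in usage.most_common()}


def apply_rename(html, mapping):
    """Rewrite every class attribute using the mapping."""
    def swap(match):
        renamed = " ".join(mapping[name] for name in match.group(1).split())
        return f'class="{renamed}"'
    return CLASS_ATTR.sub(swap, html)


page = '<tr class="RowStyle"><td class="TdInner"></td><td class="TdInner"></td></tr>'
mapping = build_rename_map(page)
print(mapping)
print(apply_rename(page, mapping))
```

Only lowercase letters are used here so that every generated name is a valid CSS identifier; a larger alphabet would postpone the move to two-character names.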

In theory it should be quite easy to build a piece of software to do all this, as there are many XML parsers available to do the heavy lifting. Surely someone has already created a tool which can do all these things and is reliable enough to use on real-life projects. Does anyone here have experience with doing this?

The term you're probably after is 'minify' or 'minification'.

This is very similar to an existing conversation which you may find helpful:

https://stackoverflow.com/questions/728260/html-minification

Also, depending on the web server you use and the browser used to look at your site, it is likely that your server is already compressing data without you having to do anything:

http://en.wikipedia.org/wiki/HTTP_compression
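To get a feel for how much of the whitespace saving HTTP compression already captures, the rough sketch below gzips a synthetic table page before and after stripping the whitespace between tags. The page is invented and far more repetitive than a real one, and `sizes` is a hypothetical helper, so the numbers it prints are only indicative.

```python
import gzip
import re


def sizes(text):
    """Return (raw byte count, gzipped byte count) for a string."""
    raw = text.encode("utf-8")
    return len(raw), len(gzip.compress(raw))


# Synthetic, highly repetitive page (an assumption made for illustration).
row = '    <tr>\n        <td class="TdInner">cell</td>\n    </tr>\n'
page = "<table>\n" + row * 500 + "</table>\n"
# Crude whitespace stripping between tags, safe here because it is a table.
stripped = re.sub(r">\s+<", "><", page)

for label, text in (("original", page), ("whitespace stripped", stripped)):
    raw_size, gz_size = sizes(text)
    print(f"{label}: {raw_size} bytes raw, {gz_size} bytes gzipped")
```

On input like this, the gzipped original is already much smaller than the uncompressed stripped page, so minification mostly pays off on top of compression rather than instead of it.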

Your three points are actually called "Minimizing HTML/JS/CSS".

You can have a look at these:

I have also done some HTML/JS/CSS compression in my personal distributed crawler, which uses gzip, bzip2, or 7zip:

  • gzip = fastest, ~12-25% of the original file size
  • bzip2 = normal, ~10-20% of the original file size
  • 7zip = slow, ~7-15% of the original file size
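A comparison like this can be reproduced on your own pages with Python's standard library. In this sketch the `lzma` module stands in for 7zip, which is an assumption (7zip's default method belongs to the LZMA family, but the container format and settings differ), and `compression_ratios` is a hypothetical helper. Results depend heavily on the input, so they will not necessarily match the ranges above.

```python
import bz2
import gzip
import lzma


def compression_ratios(data: bytes) -> dict:
    """Return compressed size divided by original size for each codec."""
    codecs = {
        "gzip": gzip.compress,
        "bzip2": bz2.compress,
        "lzma": lzma.compress,  # stand-in for 7zip (assumption)
    }
    return {name: len(compress(data)) / len(data) for name, compress in codecs.items()}


# Replace this sample with the bytes of a real page for a meaningful result.
sample = b'<tr><td class="TdInner">row</td></tr>\n' * 1000
for name, ratio in compression_ratios(sample).items():
    print(f"{name}: {ratio:.1%} of original size")
```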
