
How to keep OWASP HTML sanitizer from limiting line length?

I have to import several hundred thousand very old HTML documents into a web application. The OWASP HTML Sanitizer works very well for this, and I was able to ensure that properly sanitized HTML is created. My only problem is that the HTML Sanitizer puts a hard limit on the maximum line length, to be exact 250 bytes per line. Unfortunately, this means that some words get split in the middle, and the split also shows up in the displayed HTML (marked with a caret):

This sentence here is perfectly ok. But in the next s entence there is an additional space in the word "sentence".

                                                     ^

How can I tell the sanitizer not to end the lines too soon?

As some of the lines in the original HTML are 800 bytes or more, it would also help if I could tell the sanitizer to insert line breaks only at whitespace.

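This is not an answer but a confession: the line truncation was caused by another part of my own code, which imposed a line-length limit on the output.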
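
For anyone whose own post-processing code really does need a line-length limit, here is a minimal sketch of wrapping that breaks only at whitespace, so no word is ever split. It is not part of the OWASP HTML Sanitizer and not the code from the question; the class name `LineWrap` and the method `wrapAtWhitespace` are hypothetical. It assumes that collapsing runs of whitespace is acceptable (so it should not be applied to `<pre>` content), and it counts Java `char`s rather than bytes.

```java
public final class LineWrap {
    private LineWrap() {}

    /**
     * Hypothetical helper: re-wraps text so that line breaks are inserted
     * only where whitespace already was. Runs of whitespace (including
     * existing newlines) are collapsed to a single space or newline.
     * A word longer than maxLen is kept whole on its own line.
     * Length is measured in Java chars (UTF-16 units), not bytes.
     */
    public static String wrapAtWhitespace(String text, int maxLen) {
        StringBuilder out = new StringBuilder(text.length());
        int lineLen = 0; // length of the line currently being built
        for (String word : text.split("\\s+")) {
            if (word.isEmpty()) {
                continue; // leading whitespace yields one empty token
            }
            if (lineLen > 0 && lineLen + 1 + word.length() > maxLen) {
                // The word does not fit: break here instead of inside it.
                out.append('\n');
                lineLen = 0;
            } else if (lineLen > 0) {
                out.append(' ');
                lineLen++;
            }
            out.append(word);
            lineLen += word.length();
        }
        return out.toString();
    }
}
```

Calling `LineWrap.wrapAtWhitespace(sanitizedHtml, 250)` on the sanitizer's output would then keep lines at or below 250 characters wherever the words allow it, without ever cutting a word such as "sentence" in two.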

