
lxml cleaner with a custom tag?

I want to use the lxml Cleaner to strip all HTML from user input, and then apply a regex to autolink certain tokens:

[ABC] -> <a href="bah bah bah">ABC</a>

What is the right way to do this without introducing XSS or similar vulnerabilities?

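Maybe Markdown with inline HTML disabled would be suitable? The Python markdown module is quite mature.

Check out the "safe mode" section in the docs for more info on stripping out inline HTML. Note that newer releases of Python-Markdown have dropped this option, so check the docs for the version you are using.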

Depending on what you want, something like py-wikimarkup may be more appropriate.

Using a custom regexp is probably not a great idea, because

  • you'll have to explain the rules to people who might already be familiar with markdown/WikiText
  • you'll have to provide a way to escape text, e.g. for people who really want to write a literal [ABC]
  • you'll have to fix any bugs, including security issues
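
If you do go with a custom regex anyway, the points above suggest roughly what it has to cover. The following is only a minimal sketch, not a vetted implementation: it uses the standard library rather than lxml, escapes all input first so that any markup becomes inert text, links only labels made of a conservative character set, and treats a backslash as the escape for a literal [ABC]. The names `autolink` and `BASE_URL`, the example.com URL, and the backslash convention are all assumptions made for illustration; the question only gives "bah bah bah" as the link target.

```python
import html
import re
from urllib.parse import quote

# Hypothetical link prefix; the original question does not say what the URL is.
BASE_URL = "https://example.com/wiki/"

# [ABC] not preceded by a backslash; the label is limited to a conservative character set.
LINK_RE = re.compile(r"(?<!\\)\[([A-Za-z0-9_ -]+)\]")
# \[ABC] is the (assumed) way to write a literal [ABC].
ESCAPED_RE = re.compile(r"\\(\[[A-Za-z0-9_ -]+\])")


def autolink(text: str) -> str:
    """Escape all HTML in text, then turn [ABC] tokens into links."""
    # Escape first, so nothing in the input can be interpreted as markup.
    safe = html.escape(text, quote=True)

    def make_link(match: re.Match) -> str:
        label = match.group(1)
        # Percent-encode the label for the URL, then escape the attribute value.
        href = html.escape(BASE_URL + quote(label, safe=""), quote=True)
        return f'<a href="{href}">{label}</a>'

    linked = LINK_RE.sub(make_link, safe)
    # Drop the backslash from escaped tokens, leaving the literal text.
    return ESCAPED_RE.sub(r"\1", linked)
```

The order matters here: because escaping happens before linking, the only HTML in the output is the anchor tags the function itself generates. If the regex ran first and the cleaner second, the cleaner would need to be configured to let those anchors through, which is harder to get right.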


 