Regex to split BBCode into pieces

I have this:

str = "some html code [img]......[/img] some html code [img]......[/img]"

and I want to get this:

["[img]......[/img]","[img]......[/img]"]

Please don't use BBCode. It's evil.

BBCode came to life when developers were too lazy to parse HTML correctly and decided to invent their own markup language. As with all products of laziness, the result is completely inconsistent, unstandardized, and widely adopted.

Try to use a more user-friendly markup language, like Markdown (that's what Stack Overflow uses) or Textile. Both of them have parsers for Ruby.


If you still don't want to heed my advice and choose to go with BBCode, don't reinvent the wheel: use a BBCode parser. To answer your question directly, there is the least desirable option: use regex.

/\[img\].*?\[\/img\]/

As seen on Rubular. Although I would use /\[img\](.*?)\[\/img\]/, so that it extracts the contents inside the img tags. Note that this is fairly fragile and will break if there are nested img tags. Hence the advice to use a parser.
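To make the difference between the two patterns concrete, here is a minimal sketch using String#scan. The file names a.png and b.png are placeholders invented for this illustration; they are not from the question.

```ruby
# Placeholder contents (a.png, b.png) are assumptions for illustration only.
str = "some html code [img]a.png[/img] some html code [img]b.png[/img]"

# Without a capture group, scan returns each full match, tags included.
whole_tags = str.scan(/\[img\].*?\[\/img\]/)

# With one capture group, scan returns an array of one-element arrays,
# so flatten leaves only the text between the tags.
contents = str.scan(/\[img\](.*?)\[\/img\]/).flatten

p whole_tags  # ["[img]a.png[/img]", "[img]b.png[/img]"]
p contents    # ["a.png", "b.png"]
```

Which form is preferable depends on whether the surrounding tags are needed downstream or only the contents.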

irb(main):001:0> str = "some html code [img]......[/img] some html \
code [img]......[/img]"
"some html code [img]......[/img] some html code [img]......[/img]"
irb(main):002:0> str.scan(/\[img\].*?\[\/img\]/)
["[img]......[/img]", "[img]......[/img]"]

Keep in mind that this is a very specific answer based on your exact question. Change str by, say, adding an image tag within an image tag, and all hell will break loose.
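A short sketch of that failure mode, assuming a made-up input with one img tag nested inside another: the lazy .*? stops at the first closing tag it finds, so the outer tag is cut short and its real closing tag is never matched.

```ruby
# Hypothetical nested input, invented to show the failure.
nested = "[img]outer [img]inner[/img] rest[/img]"

# The lazy quantifier ends the match at the first "[/img]",
# which belongs to the inner tag, not the outer one.
matches = nested.scan(/\[img\].*?\[\/img\]/)

p matches  # ["[img]outer [img]inner[/img]"]
```

The trailing " rest[/img]" is silently dropped, which is the kind of breakage a real parser would avoid.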

There is a Ruby BBCode parser at Google Code.

Don't use regex for this.

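str = "some html code [img]......[/img] some html code [img]......[/img]"
# Split on the closing tag, then strip everything up to and including the opening tag.
# This prints the tag contents rather than the full tags: ["......", "......"]
p str.split("[/img]").each { |x| x.sub!(/.*\[img\]/, "") }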
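Note that the snippet above still relies on a regex inside sub!, and any text after the last [/img] would be left in the result unchanged. For comparison, here is a sketch that avoids regex entirely by walking the string with String#index. The helper name img_tags and the trailing text in the sample input are assumptions made for this illustration, and like the regex versions it does not handle nested tags.

```ruby
# Hypothetical helper, not part of the original answer: collects every
# "[img]...[/img]" span using plain substring searches.
def img_tags(str, open_tag = "[img]", close_tag = "[/img]")
  tags = []
  pos = 0
  # Find the next opening tag at or after the current position.
  while (start = str.index(open_tag, pos))
    finish = str.index(close_tag, start + open_tag.length)
    break unless finish # unterminated tag: stop scanning
    pos = finish + close_tag.length
    tags << str[start...pos]
  end
  tags
end

str = "some html code [img]......[/img] some html code [img]......[/img] trailing text"
result = img_tags(str)
p result  # ["[img]......[/img]", "[img]......[/img]"]
```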
