
Regex to find and replace emoji names within colons

I'm trying to write a regex (for JavaScript's regex engine) that I can use to find and replace emoji names within colons in text, like in Slack or Discord, where you type :smiley-face: and it gets replaced when you submit the chat. I'm targeting text nodes only, so I don't need to worry about other HTML inside the text.

Is it possible to write a regex that satisfies all of the following rules? (The expected matches are listed after each arrow.)

:any-non-whitespace: → :any-non-whitespace:
:text1: sample2: → :text1:
:@(1@#$@SD: :s: → :@(1@#$@SD: and :s:
:nospace::inbetween: → :nospace::inbetween: (a single match, because there are 2 colons in the middle)
:nospace: middle :nospace: → both occurrences of :nospace:

I'm starting with something like this, but it's incomplete:

/:(?!:)\S+:/gim

I'm trying to think of all the special cases that might occur when doing this. Maybe I'm overthinking it.

There are a lot of Twitch emotes involved, so I can't use emoji Unicode characters. The regex will find matches and replace them with tags.

I suggest using

:[^:\s]*(?:::[^:\s]*)*:

See the regex demo. It is the same pattern as :(?:[^:\s]|::)*: , but a bit more efficient because the (?:..|...)* part is unrolled.
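As a quick check of the claim that the two patterns are equivalent, the sketch below runs both against the sample strings from the question and compares the matches. The helper name `allMatches` and the variable names are illustrative only; this compares the results, not the performance.

```javascript
// Sample strings taken from the question.
const samples = [
  ":any-non-whitespace:",
  ":text1: sample2:",
  ":@(1@#$@SD: :s:",
  ":nospace::inbetween:",
  ":nospace: middle :nospace:",
];

// The unrolled pattern suggested above and its alternation-based equivalent.
const unrolled = /:[^:\s]*(?:::[^:\s]*)*:/g;
const alternation = /:(?:[^:\s]|::)*:/g;

// Collect every full match of `regex` in `text` (illustrative helper).
function allMatches(regex, text) {
  return Array.from(text.matchAll(regex), (m) => m[0]);
}

for (const sample of samples) {
  const a = allMatches(unrolled, sample);
  const b = allMatches(alternation, sample);
  const same = JSON.stringify(a) === JSON.stringify(b);
  console.log(JSON.stringify(sample), "->", JSON.stringify(a), "same:", same);
}
```

Both patterns should report identical matches for every sample, with :nospace::inbetween: returned as a single match.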

Details

  • : - a colon
  • [^:\s]* - 0+ chars other than : and whitespace
  • (?: - start of a quantified non-capturing group:
    • :: - a double colon
    • [^:\s]* - 0+ chars other than : and whitespace
  • )* - end of the group, repeated 0 or more times (due to the * quantifier)
  • : - a colon.
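With the pattern broken down like this, the replacement step the question describes could be sketched roughly as follows. The emote table, its URLs, and the `splitEmotes` helper are hypothetical placeholders, not part of the original question or answer; the function only splits a string into segments and leaves building the actual tags to the caller.

```javascript
// Hypothetical emote table: names and URLs are placeholders for illustration.
const EMOTES = new Map([
  ["smiley-face", "https://example.com/emotes/smiley-face.png"],
  ["nospace::inbetween", "https://example.com/emotes/nospace-inbetween.png"],
]);

// The pattern suggested in this answer.
const EMOTE_PATTERN = /:[^:\s]*(?:::[^:\s]*)*:/g;

// Split `text` into plain-text and emote segments.
// Matches whose name is not in the table (including the empty name
// produced by a bare "::") are left untouched as plain text.
function splitEmotes(text) {
  const segments = [];
  let last = 0;
  for (const match of text.matchAll(EMOTE_PATTERN)) {
    const name = match[0].slice(1, -1); // strip the surrounding colons
    const url = EMOTES.get(name);
    if (url === undefined) continue;
    if (match.index > last) {
      segments.push({ type: "text", value: text.slice(last, match.index) });
    }
    segments.push({ type: "emote", name, url });
    last = match.index + match[0].length;
  }
  if (last < text.length) {
    segments.push({ type: "text", value: text.slice(last) });
  }
  return segments;
}

console.log(splitEmotes("hi :smiley-face: there"));
```

In a browser, a caller could turn each segment into a text node or an element (for example with `document.createTextNode` and `document.createElement`) and swap them in for the original text node. Note that because every quantifier in the pattern is `*`, a bare `::` also matches; the table lookup above is what filters it out.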

My first thought was

:(::|[^:\n])+:

It matches a pair of colons surrounding at least one item, where each item is either

  • two colons ( :: ), or
  • a character that is neither a colon nor a line feed.

But that's basically what Wiktor gave as a (slower) alternative (see the comments). I'll leave it here anyway since it works, unlike the other submitted answers ;)

See it here at regex101.
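For readers without access to the demo, here is a small sketch of how this pattern behaves on a few inputs. The helper name `matchesOf` and the third input string are made up for illustration.

```javascript
// The pattern from this answer, with the global flag so all matches are found.
const pattern = /:(::|[^:\n])+:/g;

// Collect every full match in `text` (illustrative helper).
function matchesOf(text) {
  return Array.from(text.matchAll(pattern), (m) => m[0]);
}

console.log(matchesOf(":nospace::inbetween:")); // [ ':nospace::inbetween:' ]
console.log(matchesOf(":text1: sample2:"));     // [ ':text1:' ]
console.log(matchesOf("a: b :c"));              // [ ': b :' ]
```

The last line shows that `[^:\n]` excludes only colons and line feeds, so spaces between two colons are accepted. Swapping it for `[^:\s]` would be one way to enforce the question's no-whitespace rule.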

Do you want something like this regex?

(:(?![\n])[()#$@-\w]+:)

See the demo, in which you can additionally insert disallowed characters into the character class of the (?![\n]) lookahead, and allowed characters into the character class [()#$@-\w].

Try this regex:

/(^|\s)+:([^\s\n\r])+:|^:[^\s\n\r]+/g
