
Regular expression with Unicode Characters

I need help with a regular expression. I need to catch the characters 🥇, ⚡, ❓ and others in a string. If the character is at the beginning of the line with no space after it, insert a space on its right; if it is at the end of the line with no space before it, insert a space on its left. If it is in the middle of the line without surrounding spaces, insert spaces on both sides. So far, all I can do is match the character itself: /[\x{000ff}-\x{fffff}]/u

Right, it's lengthy, but I interpreted your question as follows:

  • Keep leading/trailing spaces intact;
  • Keep consecutive emoticons intact.

Therefore, try:

(?:(?<![\x{000ff}-\x{fffff}\s]|^)(?=[\x{000ff}-\x{fffff}])|(?<=[\x{000ff}-\x{fffff}])(?!$|[\x{000ff}-\x{fffff}\s]))

See an online demo.

It's lengthy, but it's basically a non-capturing group that holds two alternatives. Both are zero-width, so replacing each match with a single space inserts the missing spaces:

  • Match a position that is not preceded by an emoticon, the start of the line or whitespace, but is followed by an emoticon;
  • Match a position that is not followed by an emoticon, the end of the line or whitespace, but is preceded by an emoticon.
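To make the two alternatives concrete, here is a minimal sketch in Python. The post does not say which language is in use (the `\x{...}` syntax with the `/u` flag suggests a PCRE flavour), so treat this as an adaptation, not the original answer's code. The names `EMOJI`, `BOUNDARY` and `pad_emoji` are made up for illustration. Python's `re` module only accepts fixed-width lookbehinds, so `(?<![...]|^)` is split into two consecutive lookbehinds, and multiline mode is assumed so that `^` and `$` apply per line.

```python
import re

# Same code point range as in the question (U+00FF to U+FFFFF), written with
# Python escapes. It is much broader than emoji alone; it is kept as given.
EMOJI = r"\u00ff-\U000fffff"

# Python's re needs fixed-width lookbehinds, so the PCRE form (?<![...]|^)
# becomes two consecutive negative lookbehinds: (?<![...])(?<!^).
BOUNDARY = re.compile(
    rf"(?<![{EMOJI}\s])(?<!^)(?=[{EMOJI}])"  # position just before an emoticon run
    rf"|(?<=[{EMOJI}])(?!$|[{EMOJI}\s])",    # position just after an emoticon run
    re.MULTILINE,                             # assumption: ^ and $ work per line
)


def pad_emoji(text: str) -> str:
    """Insert a space at every zero-width position matched by BOUNDARY."""
    return BOUNDARY.sub(" ", text)


if __name__ == "__main__":
    for sample in ("🥇abc", "abc❓", "abc⚡def", "a🥇⚡b", " 🥇 ok"):
        print(repr(sample), "->", repr(pad_emoji(sample)))
```

Because every match is an empty position, the substitution never removes or changes existing characters; it only adds a space where one is missing, which is why existing leading or trailing spaces and runs of consecutive emoticons stay intact.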
