
More elegant (shorter) solution for this regex pattern

I have spent three days working out a single regex that matches anything between single or double quotes, including escaped single or double quotes inside the source string, so that the matched text can be replaced without touching the enclosing quotes themselves, and I think I have succeeded. It works on both multi-line and single-line input. In other words, this regex can be used to alter or sanitize 'text' or "text" strings in any source code (e.g. the result of file_get_contents('some_class.php')) and leave everything else untouched, assuming that code comments have already been removed beforehand.

Here is the regex wrapped in single quotes:

'@"[^"\\\\]*(?:\\\\.[^"\\\\]*)*"|\'[^\'\\\\]*(?:\\\\.[^\'\\\\]*)*\'@msu'

And here is the same regex wrapped in double quotes:

"@\"[^\"\\\\]*(?:\\\\.[^\"\\\\]*)*\"|'[^'\\\\]*(?:\\\\.[^'\\\\]*)*'@msu"
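A quick way to confirm that the two literals above describe the same pattern is to compare them after PHP has resolved the string escapes. This is only a sanity-check sketch; the variable names are illustrative and not part of the original post.

```php
<?php
// Both literals are copied verbatim from the question.
$single = '@"[^"\\\\]*(?:\\\\.[^"\\\\]*)*"|\'[^\'\\\\]*(?:\\\\.[^\'\\\\]*)*\'@msu';
$double = "@\"[^\"\\\\]*(?:\\\\.[^\"\\\\]*)*\"|'[^'\\\\]*(?:\\\\.[^'\\\\]*)*'@msu";

// After PHP resolves the string escapes, both variables should hold the same text.
var_dump($single === $double);

// Print the pattern as the PCRE engine actually receives it.
echo $single, PHP_EOL;
```

The printed form has half as many backslashes as either literal, which is usually the easier version to reason about.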

It matches source code like this perfectly:

// Very nasty php array 
$Damn = [
  'a' => "' lorem ipsum '",
  'b' => '"\" ipsu\'m lorem  ',
  'c' => " \' YabadabaDooya \" ",
  'd\"' => ' 
     f"
     o\'"o  
                 \'bar" ',
  'e' => "'",
  "f" => '"'
];
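The replacement step described in the question (change what is inside the quotes, leave the quotes alone) could be sketched with preg_replace_callback as follows. The helper name maskStrings, the placeholder X and the shortened sample source are assumptions made for illustration; only the pattern comes from the post.

```php
<?php
// The question's pattern, written as a single-quoted PHP literal.
$pattern = '@"[^"\\\\]*(?:\\\\.[^"\\\\]*)*"|\'[^\'\\\\]*(?:\\\\.[^\'\\\\]*)*\'@msu';

// Sample source held in a nowdoc so quotes and backslashes stay literal.
$source = <<<'SRC'
$x = ['a' => "' lorem ipsum '", 'e' => "'", "f" => '"'];
SRC;

// Hypothetical helper: keep the enclosing quotes, replace only the content.
function maskStrings(string $pattern, string $source): string
{
    return preg_replace_callback(
        $pattern,
        function (array $m): string {
            $quote = $m[0][0];            // first byte of the match is the quote
            return $quote . 'X' . $quote; // same quote on both sides
        },
        $source
    );
}

echo maskStrings($pattern, $source), PHP_EOL;
```

Because the pattern has no capture groups, the callback reads the quote character from the first byte of the whole match rather than from a group.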

Since this works as I expect and I am not a PCRE guru (don't ask how much 'pain' I've had in the past three days before arriving at this solution), it would be superb if someone who knows how could help shrink the above regex into a more elegant, shorter form. I assume that the | (or) in the middle of the pattern could be moved to the beginning and used just once. I tried God only knows what to accomplish that, but had no luck.

So, the general question is: what would a shorter variant of the above pattern look like?

If you add negative look-behinds for a backslash before the quotes, the pattern will skip over the escaped quotes.

$re = '/((?<![\\\\])["\'])([\s\S]*?)((?<![\\\\])\1)/';
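To see what the look-behind version captures, it can be run with preg_match_all on a small input. The sample strings below are illustrative and not taken from the original post; the second one is a case worth checking before relying on the pattern.

```php
<?php
// Pattern from the answer: group 1 = opening quote, 2 = content, 3 = closing quote.
$re = '/((?<![\\\\])["\'])([\s\S]*?)((?<![\\\\])\1)/';

// Illustrative input with escaped quotes inside both kinds of literal.
$src = <<<'SRC'
a = "x \" y"; b = 'it\'s';
SRC;

preg_match_all($re, $src, $m);
print_r($m[2]); // contents without the enclosing quotes

// A literal that ends in an escaped backslash, followed by a second literal.
$edge = <<<'SRC'
'a\\' . 'b'
SRC;

preg_match_all($re, $edge, $e);
print_r($e[0]); // whole matches, to compare against the two literals in $edge
```

On the second input the look-behind sees a backslash directly before the closing quote and treats that quote as escaped, so the match runs on to the next quote. The pattern in the question consumes backslash pairs explicitly and does not have this problem, which may matter if the sources contain such literals.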


I would like to thank Mr. Wahyu Kristianto, who proposed a much more elegant and smarter solution than mine.

Here is his regex:

(["'])((?:\\\\\\1|(?:(?!\\1)).)*)(\\1)
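A minimal sketch of how this regex might be used from PHP. The post gives neither delimiters nor flags, so the / delimiters and the s flag (which lets the dot match newlines, for multi-line strings) are assumptions, as is the sample input.

```php
<?php
// The regex above as a single-quoted PHP literal; the single quote inside the
// character class has to be escaped. Delimiters and the "s" flag are assumed.
$re = '/(["\'])((?:\\\\\\1|(?:(?!\\1)).)*)(\\1)/s';

// Illustrative input, kept literal in a nowdoc.
$src = <<<'SRC'
a = "x \" y"; b = 'it\'s'; c = "'";
SRC;

preg_match_all($re, $src, $m, PREG_SET_ORDER);
foreach ($m as $match) {
    // $match[1] = quote character, $match[2] = content between the quotes
    printf("%s -> [%s]\n", $match[1], $match[2]);
}
```

The backreference \1 is what allows a single alternative to serve both quote styles: whichever quote opened the string is the only one that can close it or need escaping inside it.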

And it is the perfect one.

Exactly the thing that I was looking for. With additional regex options, it can be quite optimized and insanely performant. :)

Not only that: by adding a single backtick to the first character group, the regex will match backtick-quoted strings as well as single-quoted and double-quoted ones, and that change is needed in only one place.
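The backtick extension might look like this. As before, the delimiters, the s flag and the sample input are assumptions for the sketch; the only change to the regex itself is the extra backtick in the first character class.

```php
<?php
// Same regex with a backtick added to the first character class only.
// Delimiters and the "s" flag are assumptions, not part of the original post.
$re = '/(["\'`])((?:\\\\\\1|(?:(?!\\1)).)*)(\\1)/s';

// Illustrative input with one literal of each quote style.
$src = <<<'SRC'
a = `ls \`x\``; b = 'q'; c = "r";
SRC;

preg_match_all($re, $src, $m);
print_r($m[2]); // contents of the three quoted strings
```

Nothing else needs to change because the rest of the regex refers to the opening quote only through \1.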

I think it can't get more decent and cleaner than this. Maybe I am wrong, but I doubt it.

Mr. Wahyu, you're AWESOME. :)))
