Negated character class does not match for JavaScript RegExp with unicode flag?

/[^|]$/u.test('𭵴') returns true, but /\|[^|]$/u.test('|𭵴') returns false.

(𭵴 is a Unicode SIP character, U+2DD74, which is represented as the surrogate pair \uD877\uDD74 in JavaScript.)
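To make the surrogate-pair point concrete, here is a small sketch that inspects such a character both as UTF-16 code units and as code points. It assumes the character in question is U+2DD74, which is inferred from the text above rather than confirmed; any other supplementary-plane character would behave the same way with different values.

```javascript
// Assumption: the character is U+2DD74 (a CJK ideograph in the SIP).
const ch = String.fromCodePoint(0x2DD74);

// UTF-16 code units, as seen by string indexing and by non-`u` regexes.
const codeUnits = Array.from({ length: ch.length }, (_, i) =>
  ch.charCodeAt(i).toString(16)
);

// Code points, as seen by string iteration and by `u`-mode regexes.
const codePoints = Array.from(ch, (c) => c.codePointAt(0).toString(16));

console.log(ch.length);   // 2
console.log(codeUnits);   // [ 'd877', 'dd74' ]
console.log(codePoints);  // [ '2dd74' ]
```

The string has length 2 even though it holds one character, which is why it matters whether a character class consumes a code unit or a code point.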

I expect [^|] to match any single Unicode character except a literal |, and thus it should match 𭵴. For the same reason, \| followed by [^|] should match | followed by 𭵴.

However, real-world JavaScript evidently does not behave this way (tested in Firefox 103 and Chromium 103). Can anyone explain why?

Unless the Unicode u flag is combined with the i flag (which enables Unicode case folding), the [^|] negated character class only matches what /[\0-\x7B\x7D-\uFFFF]/ matches (see "Unicode-aware regular expressions in ES2015"), and, as you can see, this does not match astral plane characters.
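The difference between matching code units and matching code points can be checked directly. The sketch below is only an illustration: it assumes the character is U+2DD74, and it shows how a negated class without the u flag consumes a single UTF-16 code unit, so an astral character needs two of them. The last line repeats the u-mode result reported in the question.

```javascript
// Assumption: the character is U+2DD74; any astral character would do.
const astral = String.fromCodePoint(0x2DD74);

// Without `u`, [^|] consumes exactly one UTF-16 code unit,
// so it cannot cover the whole two-unit string on its own.
const oneUnit = /^[^|]$/.test(astral);

// Two class matches in a row do cover it (one per surrogate).
const twoUnits = /^[^|]{2}$/.test(astral);

// The explicit BMP range from the answer behaves the same way.
const bmpRangeTwice = /^[\0-\x7B\x7D-\uFFFF]{2}$/.test(astral);

// With `u`, the pattern from the question is reported to match.
const withU = /[^|]$/u.test(astral);

console.log(oneUnit);        // false
console.log(twoUnits);       // true
console.log(bmpRangeTwice);  // true
console.log(withU);          // true, as reported in the question
```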

If you can use ES6, you can use:

 console.log(/\|[^|]$/ui.test("|𭵴"))
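For code that must run on engines where the u-only pattern misbehaves, one option is to compare the flag combinations side by side and fall back to a check that does not use a regex at all. This is a hedged sketch, not part of the original answer: the helper name endsWithPipeThenNonPipe is hypothetical, and the input assumes the character is U+2DD74.

```javascript
// Assumption: the character is U+2DD74.
const input = "|" + String.fromCodePoint(0x2DD74);

const results = {
  // Reported as false on Firefox 103 and Chromium 103; engine dependent.
  u: /\|[^|]$/u.test(input),
  // The workaround from the answer: add the `i` flag.
  ui: /\|[^|]$/ui.test(input),
};

// Hypothetical helper: the same check done on code points, without a regex.
// Array.from on a string iterates by code point, so a surrogate pair
// counts as a single element.
function endsWithPipeThenNonPipe(str) {
  const cps = Array.from(str);
  return (
    cps.length >= 2 &&
    cps[cps.length - 2] === "|" &&
    cps[cps.length - 1] !== "|"
  );
}

console.log(results);
console.log(endsWithPipeThenNonPipe(input));
```

The helper sidesteps the regex engine entirely, which trades brevity for behavior that does not depend on the browser version.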

According to the developers in a recent report, this appears to be a bug in Chromium's V8 JavaScript engine. It has been fixed, and the fix will be included in Chrome 106 onwards.

Firefox appears to base its RegExp implementation on some of V8's source code and will wait for the upstream fix. (ref)
