Negated character class does not match for JavaScript RegExp with unicode flag?
/[^|]$/u.test('') returns true, but /\|[^|]$/u.test('|') returns false.
(The character being tested is a Unicode SIP character, which JavaScript represents as a surrogate pair.)
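To make the surrogate pair point concrete, here is a minimal sketch. It assumes U+20000, the first code point of the Supplementary Ideographic Plane, as a stand-in; it is not necessarily the character used in the question.

```javascript
// Assumption: U+20000 is used only as an example SIP character.
const sipChar = '\u{20000}';

// A string's length counts UTF-16 code units, so one astral character has length 2.
console.log(sipChar.length);                          // 2

// codePointAt reads the whole code point starting at index 0.
console.log(sipChar.codePointAt(0).toString(16));     // "20000"

// charCodeAt exposes the two surrogate halves individually.
console.log(sipChar.charCodeAt(0).toString(16));      // "d840" (high surrogate)
console.log(sipChar.charCodeAt(1).toString(16));      // "dc00" (low surrogate)

// String iteration is code-point aware, so spreading yields one element.
console.log([...sipChar].length);                     // 1
```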
I expect [^|] to match any single Unicode character except a literal |, and thus to match the SIP character. For the same reason, \| plus [^|] should match | followed by the SIP character.
However, real-world JavaScript evidently does not behave this way (tested in Firefox 103 and Chromium 103). Can anyone explain why?
Without the Unicode u flag combined with the i flag (which enables Unicode case folding), the [^|] negated character class only matches what /[\0-\x7B\x7D-\uFFFF]/ matches (see "Unicode-aware regular expressions in ES2015"), and as you can see, this does not match astral plane characters.
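The difference between matching a code unit and matching a code point can be sketched as follows. This is only an illustration: U+20000 is an assumed stand-in for the astral character, and the variable names are hypothetical.

```javascript
// Assumption: U+20000 stands in for the astral character from the question.
const astral = '\u{20000}';

// Without the u flag, [^|] consumes a single UTF-16 code unit.
const plainMatch = astral.match(/[^|]/);
console.log(plainMatch[0].length);        // 1 (only the high surrogate)

// One class cannot span both code units of the pair...
console.log(/^[^|]$/.test(astral));       // false

// ...but two classes can, one per surrogate half.
console.log(/^[^|][^|]$/.test(astral));   // true

// With the u flag, the specification has [^|] consume a whole code point,
// so a conforming engine returns the full two-code-unit character here.
const unicodeMatch = astral.match(/[^|]/u);
console.log(unicodeMatch[0] === astral);
```

The first three results depend only on the non-Unicode mode, so they should not vary between engine versions.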
If you can use ES6, you can use:
console.log( /\|[^|]$/ui.test("|") )
According to the developers in the recent report, this appears to be a bug in Chromium's V8 JavaScript engine. It has been fixed, and the fix will be included from Chrome 106 onwards.
Firefox seems to implement its RegExp using some of V8's source code, and will be waiting for the upstream fix. (ref)
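If you want to see whether the engine you are running still shows this behavior, a rough check along these lines may help. The helper name matchesAfterPipe and the choice of U+20000 as test character are assumptions made for illustration only.

```javascript
// Assumption: U+20000 is used as an example astral character.
const testChar = '\u{20000}';

// Hypothetical helper: runs the pattern from the question against '|' + ch.
function matchesAfterPipe(ch) {
  return /\|[^|]$/u.test('|' + ch);
}

// A BMP character as a control case.
console.log(matchesAfterPipe('a'));

// The astral case: true is the result the question expects;
// false suggests the engine still has the reported behavior.
const astralResult = matchesAfterPipe(testChar);
console.log(astralResult ? 'matches as expected' : 'engine appears affected');
```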