
JavaScript regex match characters inside quotes and not in character set

I have a string I would like to split using the #, ., [], or {} characters, as in CSS. The desired functionality is:

- Input: "div#foo[bar='value'].baz{text}"
- Output: ["div", "#foo", "[bar='value'", ".baz", "{text"]

This is easy enough with this RegEx: input.match(/([#.\[{]|^.*?)[^#.\[{\]}]*/g)

However, this doesn't ignore syntax characters inside quotes, as I would like it to. (For example, "div[bar='value.baz']" should ignore the . inside the quotes.)
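To make the problem concrete, here is a small check of what the pattern above returns for that input. The variable names are only illustrative.

```javascript
// The pattern from the question, applied to an input with a dot inside quotes.
const questionPattern = /([#.\[{]|^.*?)[^#.\[{\]}]*/g;
const result = "div[bar='value.baz']".match(questionPattern);

console.log(result);
// → ["div", "[bar='value", ".baz'"]
// The quoted dot is treated as a class separator, which is the unwanted split.
```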

How can I make the second part of my RegEx (the [^#.\[{\]}]* portion) match not only the negated character set, but also any character within quotes? In other words, how can I incorporate the RegEx (\"|').+?\1 into my current one?

Edit: I've figured out a regex that works decently, but it can't handle escaped quotes inside quotes (for example: "stuff here \" quote "). If someone knows how to do that, it would be extremely helpful:

str.match(/([#.\[{]|^.*?)((['"]).*?\3|[^.#\[\]{\}])*/g);

var tokens = myCssString.match(/\/\*[\s\S]*?\*\/|"(?:[^"\\]|\\[\s\S])*"|'(?:[^'\\]|\\[\s\S])*'|[\{\}:;\(\)\[\]./#]|\s+|[^\s\{\}:;\(\)\[\]./'"#]+/g);

Given your string, it produces

div
#
foo
[
bar=
'value.foo'
]
.
baz
{
text
}

The RegExp above is loosely based on the CSS 2.1 lexical grammar.
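The escape-aware string branches of that tokenizer can also be folded into the splitting pattern from the question, which is one way to handle escaped quotes while keeping the original output shape. The following is a minimal sketch, not a full CSS parser; splitSelector is a hypothetical helper name, and the sketch assumes that an unterminated quote may simply be skipped.

```javascript
// Hypothetical helper: splits a selector-like string on #, ., [ and {,
// while treating quoted strings (including escaped quotes) as opaque.
function splitSelector(input) {
  const pattern = new RegExp(
    "(?:[#.\\[{]|^)" +                 // a leading syntax character, or the start of input
    "(?:" +
      "\"(?:[^\"\\\\]|\\\\[\\s\\S])*\"" + // double-quoted string, backslash escapes allowed
      "|'(?:[^'\\\\]|\\\\[\\s\\S])*'" +   // single-quoted string, backslash escapes allowed
      "|[^#.\\[\\]{}\"']" +               // any single character that is not syntax or a quote
    ")*",
    "g"
  );
  // Drop empty matches, which can occur at the start of unusual input.
  return (input.match(pattern) || []).filter((part) => part !== "");
}

console.log(splitSelector("div#foo[bar='value.baz'].baz{text}"));
// → ["div", "#foo", "[bar='value.baz'", ".baz", "{text"]

console.log(splitSelector('a[b="x\\"y.z"].c'));
// → ["a", '[b="x\\"y.z"', ".c"]
```

The pattern is built from a string here only so that each branch can carry a comment; written as a literal it would be /(?:[#.\[{]|^)(?:"(?:[^"\\]|\\[\s\S])*"|'(?:[^'\\]|\\[\s\S])*'|[^#.\[\]{}"'])*/g.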

var str = "div#foo[bar='value.baz'].baz{text}";
str.match(/(^|[\.#[\]{}])(([^'\.#[\]{}]+)('[^']*')?)+/g)
// [ 'div', '#foo', '[bar=\'value.baz\'', '.baz', '{text' ]

Firstly, and I can't stress this enough: you shouldn't use regexps to parse CSS. You should use a real parser, for instance http://glazman.org/JSCSSP/ or similar. Many people have built them, so there is no need for you to reinvent the wheel.

That said, to solve your current problem, do this:

var str = "div#foo[bar='value.foo'].baz{text}";

str.match(/([#.\[{]|^.*?)(?:[^#\[{\]}]*|\.*)/g);

//["div", "#foo", "[bar='value.foo'", ".baz", "{text"]
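One thing that may be worth checking before relying on this pattern: its negated set no longer excludes the dot, so an unquoted class that follows a tag name directly does not appear to be split off. A quick check, where the input "div.baz#qux" is an assumed example rather than one from the question:

```javascript
// Pattern from the snippet above; its negated character set does not exclude ".".
const dotPattern = /([#.\[{]|^.*?)(?:[^#\[{\]}]*|\.*)/g;
const parts = "div.baz#qux".match(dotPattern);

console.log(parts);
// → ["div.baz", "#qux"]
// ".baz" stays attached to "div" because the dot is consumed by the negated set.
```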

Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you republish this content, please credit this site's URL or the original address.
