Finding regular expression literals in a string of JavaScript code
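I am doing some crude parsing of JavaScript code, with JavaScript. I'll spare you the details of why I need to do this, but suffice it to say that I don't want to integrate a huge chunk of library code: it is unnecessary for my purposes, and it is important that I keep this very lightweight and relatively simple. So please don't suggest that I use JsLint or anything like that. If the answer is more code than you can paste into your answer, it is probably more than I want.
My code currently does a good job of detecting quoted sections and comments, and then matching braces, brackets and parentheses (making sure not to be confused by the quotes and comments, or by escapes within quotes). This is all I need it to do, and it does it well... with one exception:
It can be confused by regular expression literals. So I'm hoping for some help with detecting regular expression literals in a string of JavaScript, so that I can handle them appropriately.
Something like this: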
function getRegExpLiterals (stringOfJavascriptCode) {
  var output = [];
  // todo!
  return output;
}
var jsString = "var regexp1 = /abcd/g, regexp1 = /efg/;";
console.log (getRegExpLiterals (jsString));
// should print (0-based indices into jsString):
// [{startIndex: 14, length: 7}, {startIndex: 33, length: 5}]
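es5-lexer is a JS lexer that uses a very accurate heuristic to distinguish regular expressions in JS code from division expressions. It also provides a token-level transformation that you can use to make sure the resulting program will be interpreted the same way by a full JS parser as by the lexer.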
The bit that determines whether a / starts a regular expression is in guess_is_regexp.js, and the tests start at line 401 of scanner_test.js.
var REGEXP_PRECEDER_TOKEN_RE = new RegExp(
  "^(?:"  // Match the whole tokens below
    + "break"
    + "|case"
    + "|continue"
    + "|delete"
    + "|do"
    + "|else"
    + "|finally"
    + "|in"
    + "|instanceof"
    + "|return"
    + "|throw"
    + "|try"
    + "|typeof"
    + "|void"
    // Binary operators which cannot be followed by a division operator.
    + "|[+]"  // Match + but not ++.  += is handled below.
    + "|-"    // Match - but not --.  -= is handled below.
    + "|[.]"  // Match . but not a number with a trailing decimal.
    + "|[/]"  // Match /, but not a regexp.  /= is handled below.
    + "|,"    // Second binary operand cannot start a division.
    + "|[*]"  // Ditto binary operand.
  + ")$"
  // Or match a token that ends with one of the characters below to match
  // a variety of punctuation tokens.
  // Some of the single char tokens could go above, but putting them below
  // allows closure-compiler's regex optimizer to do a better job.
  // The right column explains why the terminal character to the left can only
  // precede a regexp.
  + "|["
    + "!"  // !           prefix operator operand cannot start with a division
    + "%"  // %           second binary operand cannot start with a division
    + "&"  // &, &&       ditto binary operand
    + "("  // (           expression cannot start with a division
    + ":"  // :           property value, labelled statement, and operand of ?:
           //             cannot start with a division
    + ";"  // ;           statement & for condition cannot start with division
    + "<"  // <, <<, <<   ditto binary operand
    // !=, !==, %=, &&=, &=, *=, +=, -=, /=, <<=, <=, =, ==, ===, >=, >>=, >>>=,
    // ^=, |=, ||=
    // All are binary operands (assignment ops or comparisons) whose right
    // operand cannot start with a division operator
    + "="
    + ">"  // >, >>, >>>  ditto binary operand
    + "?"  // ?           expression in ?: cannot start with a division operator
    + "["  // [           first array value & key expression cannot start with
           //             a division
    + "^"  // ^           ditto binary operand
    + "{"  // {           statement in block and object property key cannot start
           //             with a division
    + "|"  // |, ||       ditto binary operand
    + "}"  // }           PROBLEMATIC: could be an object literal divided or
           //             a block.  More likely to be start of a statement after
           //             a block which cannot start with a /.
    + "~"  // ~           ditto binary operand
  + "]$"
  // The exclusion of ++ and -- from the above is also problematic.
  // Both are prefix and postfix operators.
  // Given that there is rarely a good reason to increment a regular expression
  // and good reason to have a post-increment operator as the left operand of
  // a division (x++ / y) this pattern treats ++ and -- as division preceders.
);
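To connect the heuristic back to the getRegExpLiterals function from the question, here is a minimal sketch of how a preceder test like the one above could drive a small scanner. It is not es5-lexer's API: PRECEDER_RE is a condensed copy of the pattern above written as a literal, and TOKEN_RE and the scanning loop are assumptions made for this illustration. It ignores template literals, and it shares the heuristic's known blind spots (a regexp right after ")" as in "if (x) /re/.test(y)" is read as a division).

```javascript
// Condensed stand-in for the preceder heuristic above (an assumption for this
// sketch, not es5-lexer's actual API).
var PRECEDER_RE = /^(?:break|case|continue|delete|do|else|finally|in|instanceof|return|throw|try|typeof|void|[+]|-|[.]|[\/]|,|[*])$|[!%&(:;<=>?\[^{|}~]$/;

// Hypothetical tokenizer: numbers, identifiers, ++ and -- are kept whole;
// anything else is consumed one character at a time.
var TOKEN_RE = /\d[\w.]*|[A-Za-z_$][\w$]*|\+\+|--|[\s\S]/g;

function getRegExpLiterals(code) {
  var output = [];
  var lastToken = ""; // last significant token (whitespace and comments skipped)
  var i = 0, n = code.length;
  while (i < n) {
    var ch = code.charAt(i), next = code.charAt(i + 1);
    if (/\s/.test(ch)) { i++; continue; }
    if (ch === "/" && next === "/") {            // line comment
      while (i < n && code.charAt(i) !== "\n") i++;
      continue;
    }
    if (ch === "/" && next === "*") {            // block comment
      var end = code.indexOf("*/", i + 2);
      i = end < 0 ? n : end + 2;
      continue;
    }
    if (ch === '"' || ch === "'") {              // string literal
      var j = i + 1;
      while (j < n && code.charAt(j) !== ch) {
        if (code.charAt(j) === "\\") j++;        // skip the escaped character
        j++;
      }
      lastToken = code.slice(i, j + 1);
      i = j + 1;
      continue;
    }
    if (ch === "/" && (lastToken === "" || PRECEDER_RE.test(lastToken))) {
      // Regexp literal: scan to the closing "/" that is outside a [...] class.
      var k = i + 1, inClass = false;
      while (k < n) {
        var c = code.charAt(k);
        if (c === "\\") { k += 2; continue; }    // skip the escaped character
        if (c === "[") inClass = true;
        else if (c === "]") inClass = false;
        else if (c === "/" && !inClass) break;
        k++;
      }
      k = Math.min(k + 1, n);                    // step past the closing "/"
      while (k < n && /[A-Za-z]/.test(code.charAt(k))) k++; // flags
      output.push({ startIndex: i, length: k - i });
      lastToken = code.slice(i, k);
      i = k;
      continue;
    }
    // Any other token; the [\s\S] fallback guarantees a match starting at i.
    TOKEN_RE.lastIndex = i;
    lastToken = TOKEN_RE.exec(code)[0];
    i += lastToken.length;
  }
  return output;
}

console.log(getRegExpLiterals("var regexp1 = /abcd/g, regexp1 = /efg/;"));
// [ { startIndex: 14, length: 7 }, { startIndex: 33, length: 5 } ]
```

The indices are 0-based offsets into the input string. Keeping only the last significant token is what makes this cheap: the decision for each "/" needs no parse tree, just one regexp test against the token before it.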