Convert a JavaScript RegEx into JSON format

I'm currently developing a Safari Extension which will make use of the new webkit-content-blocker feature available in Safari 9. The rules of such blockers need to be written in JSON.

The background script of my soon-to-be extension generates such JSON rules. The issue I have is that I cannot properly format a regex, whose role is to filter URLs, so that it is JSON compatible.

Say I need to block all images whose URL contains either "banana", "orange", or "apple". My regex would be something like:

var urlFilter = /banana|orange|apple/g;

Now the blocker's rule in JSON, missing the URL filtering part:

"action": {
    "type": "block"
},
"trigger": {
    "url-filter": <JSON regex here>,
    "resource-type": ["image"],
    "load-type": ["third-party"]
}

[UPDATED]

How can I rewrite my regex to be JSON compatible, knowing that alternations are not supported?

The Regular expression format

Triggers support filtering the URLs of each resource based on regular expression.

The following features are supported:

  • Matching any character with “.”.
  • Matching ranges with the range syntax [ab].
  • Quantifying expressions with “?”, “+” and “*”.
  • Groups with parenthesis.

It is possible to use the beginning of line (“^”) and end of line (“$”) marker but they are restricted to be the first and last character of the expression. For example, a pattern like “^bar$” is perfectly valid, while “(foo)?^bar$” causes a syntax error.
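
The restrictions quoted above can be approximated with a rough pre-check before a pattern is written into a rule. This is only a sketch: `findUnsupported` is a hypothetical helper, it covers just the alternation and anchor-position restrictions, and it assumes that "^" inside square brackets should not be treated as an anchor. It is not Safari's actual parser.

```javascript
// Hypothetical pre-check that approximates the documented restrictions.
// It reports "|" (alternation) and misplaced "^" / "$" anchors.
function findUnsupported(pattern) {
    var problems = [];
    var inClass = false; // true while scanning inside [...]
    for (var i = 0; i < pattern.length; i++) {
        var ch = pattern[i];
        if (ch === '\\') { i++; continue; }          // skip the escaped character
        if (inClass) { if (ch === ']') inClass = false; continue; }
        if (ch === '[') { inClass = true; continue; }
        if (ch === '|') problems.push('alternation at index ' + i);
        if (ch === '^' && i !== 0) problems.push('"^" not at start, index ' + i);
        if (ch === '$' && i !== pattern.length - 1) problems.push('"$" not at end, index ' + i);
    }
    return problems;
}

console.log(findUnsupported('^bar$'));               // no problems reported
console.log(findUnsupported('(foo)?^bar$'));         // misplaced "^"
console.log(findUnsupported('banana|orange|apple')); // two alternations
```

An empty array only means that none of these particular problems were found; Safari may still reject the pattern for other reasons.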

[UPDATED BIS]

Given the strict CSP policy implemented by Safari and the lack of support for alternations, I finally converted my original regex into an array and then dynamically generated the JSON rules via a loop.

// Split the original alternation into one plain pattern per keyword.
var regex = 'banana|orange|apple',
    filters = regex.split('|'),
    json_rules = [];

var Blocker = {
    // Build one rule per filter, then hand the serialized list to Safari.
    build: function () {
        filters.forEach(function (filter) {
            var rule = {
                action: {
                    'type': 'block'
                },
                trigger: {
                    'url-filter': filter,
                    'resource-type': ['image'],
                    'load-type': ['third-party']
                }
            };
            json_rules.push(rule);
        });

        Blocker.set(JSON.stringify(json_rules));
    },
    init: function () {
        Blocker.build();
    },
    // Pass the JSON rule list to the extension's content blocker.
    set: function (rule) {
        safari.extension.setContentBlocker(rule);
    }
};

According to the documentation you linked, the values of the filters are treated as regular expressions (for example, they show "url-filter": "evil-tracker\\.js" and "url-filter": ".*").

The documentation also says that url-filter is case-insensitive, so you don't have to worry about the i flag you might otherwise want to use. But if you wanted a case-sensitive one, you'd add "url-filter-is-case-sensitive": true.
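
As a sketch of where that flag would sit, assuming it belongs in the trigger object next to url-filter (the filter value "Banana" is only illustrative):

```javascript
// Illustrative rule: with the flag set, the filter is matched case-sensitively.
var caseSensitiveRule = {
    action: { type: 'block' },
    trigger: {
        'url-filter': 'Banana',
        'url-filter-is-case-sensitive': true,
        'resource-type': ['image'],
        'load-type': ['third-party']
    }
};

// Content-blocker rule lists are JSON arrays of rule objects.
var caseSensitiveJson = JSON.stringify([caseSensitiveRule], null, 2);
console.log(caseSensitiveJson);
```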

That being the case, you just put your regular expression in quotes, being sure to escape any characters that need to be escaped within a string literal (for instance, note how they used two backslashes in their "evil-tracker\\.js" string, in order for the regex to be evil-tracker\.js).
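
If the rules are generated from a script rather than written by hand, the escaping does not have to be done manually. A minimal sketch, assuming the pattern starts life as a JavaScript regex literal: `source` gives the regex text with a single backslash, and `JSON.stringify` doubles it when serializing.

```javascript
// The regex as you would write it in JavaScript.
var pattern = /evil-tracker\.js/;

var escapedRule = {
    action: { type: 'block' },
    trigger: { 'url-filter': pattern.source } // regex text, without slashes or flags
};

var json = JSON.stringify([escapedRule]);

console.log(pattern.source); // evil-tracker\.js
console.log(json);           // [{"action":{"type":"block"},"trigger":{"url-filter":"evil-tracker\\.js"}}]
```

Flags such as g or i are not part of `source`, so they are dropped here.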

However: the problem with your expression is that they don't support alternations. Again, from the documentation you linked:

The format is a strict subset of JavaScript regular expressions. Syntactically, everything supported by JavaScript is reserved but only a subset will be accepted by the parser. An unsupported expression results in a parse error.

The following features are supported:

  • Matching any character with “.”.
  • Matching ranges with the range syntax [ab].
  • Quantifying expressions with “?”, “+” and “*”.
  • Groups with parenthesis.

It is possible to use the beginning of line (“^”) and end of line (“$”) marker but they are restricted to be the first and last character of the expression. For example, a pattern like “^bar$” is perfectly valid, while “(foo)?^bar$” causes a syntax error.

Note that they don't accept | (alternation).

That tells me you'll need three rules: one for banana, one for orange, and one for apple.
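
A minimal sketch of generating those three rules, assuming the keywords are plain literal words. `escapeLiteral` and `rulesFromWords` are hypothetical helper names, not part of any Safari API. The escaping step only matters when a word contains regex metacharacters such as "."; whether Safari's parser accepts every escaped character is not confirmed here.

```javascript
// Hypothetical helper: backslash-escape regex metacharacters so that a
// word is matched literally.
function escapeLiteral(word) {
    return word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build one content-blocker rule per word.
function rulesFromWords(words) {
    return words.map(function (word) {
        return {
            action: { type: 'block' },
            trigger: {
                'url-filter': escapeLiteral(word),
                'resource-type': ['image'],
                'load-type': ['third-party']
            }
        };
    });
}

var rules = rulesFromWords(['banana', 'orange', 'apple']);
console.log(JSON.stringify(rules, null, 2));
```

Keeping the keywords in an array from the start avoids splitting a regex string on "|", which would misbehave if a keyword ever contained an escaped pipe.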
