How can I split this regular expression up to be more readable

Question

and still keep it in the object literal:

url:       /:\/{0,3}(www\.)?([0-9.\-A-Za-z]{1,253})([\x00-\x7F]{1,2000})$/,

In addition, how can I simplify it?

It is just a mess in its current state. I'm not worried about accuracy right now.

Here is my attempt, based on Crockford's book:

makeRegex: function () {
    var parse_url = /^(?:([A-Za-z]+):)?(\/{0,3})
                    ([0-9.\-A-Za-z]+)
                    (?::(\d+))
                    ?(?:\/([^?#]*))
                    ?(?:\?([^#]*))
                    ?(?:#(.*))?$/; 
},

Answer 1

Regular expressions are notoriously unreadable. In JavaScript they don't tolerate extra whitespace and they have no comment syntax. Your only possible solution is to construct a string and then turn that into a regular expression.

Here are the steps I went through:

Target regular expression:

var regex=/:\/{0,3}(www\.)?([0-9.\-A-Za-z]{1,253})([\x00-\x7F]{1,2000})$/;

Use RegExp to construct the expression from a string:

var parse_url = RegExp(':/{0,3}(www\\.)?([0-9.\\-A-Za-z]{1,253})([\\x00-\\x7F]{1,2000})$');

Remember:

the / delimiters at the beginning and end of the expression are gone; they belong only to a regex literal, so a / inside the pattern no longer needs escaping
the \ characters in the string are doubled, because the string literal has its own interpretation of backslashes
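As a quick check of the two points above, here is a minimal sketch that builds the same pattern both ways and runs each against a sample input. The sample URL and the variable names `literal`, `fromString`, and `wrong` are made up for illustration.

```javascript
// The same pattern written twice: once as a literal, once from a string.
var literal = /:\/{0,3}(www\.)?([0-9.\-A-Za-z]{1,253})([\x00-\x7F]{1,2000})$/;
var fromString = new RegExp(':/{0,3}(www\\.)?([0-9.\\-A-Za-z]{1,253})([\\x00-\\x7F]{1,2000})$');

// Hypothetical sample input, chosen only for illustration.
var sample = 'http://www.example.com/path';

var a = literal.exec(sample);
var b = fromString.exec(sample);
console.log(a[1], a[2], a[3]);
console.log(b[1], b[2], b[3]); // same three captures as the literal

// What happens when a backslash is NOT doubled: the string literal
// swallows it, so the regex sees an unescaped "." (any character).
var wrong = new RegExp('www\.');
console.log(wrong.source);       // www.
console.log(wrong.test('wwwx')); // true, because "." matched "x"
```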

Break the string up by adding '+' at strategic points:

var parse_url = RegExp(':/{0,3}(www\\.)?'+'([0-9.\\-A-Za-z]{1,253})'+'([\\x00-\\x7F]{1,2000})$');

var parse_url = RegExp(':/{0,3}(www\\.)?'+
    '([0-9.\\-A-Za-z]{1,253})'+
    '([\\x00-\\x7F]{1,2000})$');

It's not a very good solution, but that's all you can do with a regular expression.
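One modest benefit of the split form is that each piece can carry its own comment. A sketch of the same idea using an array joined into one string follows; the comments describe my reading of each piece, and the sample inputs are hypothetical.

```javascript
// Each array element is one piece of the pattern, with a comment beside it.
var parse_url = new RegExp([
    ':/{0,3}',                   // a colon followed by up to three slashes
    '(www\\.)?',                 // optional "www." prefix
    '([0-9.\\-A-Za-z]{1,253})',  // host name: digits, letters, dots, hyphens
    '([\\x00-\\x7F]{1,2000})$'   // the rest, ASCII only, up to the end of input
].join(''));

// Hypothetical sample input, chosen only for illustration.
var parts = parse_url.exec('http://www.example.com/path');
console.log(parts[2]); // example.com
```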

Modern JavaScript does support multi-line strings in the form of template literals, but that probably won't help much here.
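For completeness, here is a sketch of what a template-literal version could look like, assuming an ES2015+ environment. It uses String.raw so backslashes need no doubling, then strips all whitespace before compiling. This only works when the pattern itself contains no literal whitespace, backticks, or `${` sequences, and whether it reads better than the concatenated form is a judgment call.

```javascript
// String.raw keeps backslashes exactly as typed, so nothing is doubled.
var source = String.raw`
    :/{0,3}
    (www\.)?
    ([0-9.\-A-Za-z]{1,253})
    ([\x00-\x7F]{1,2000})$
`;

// Remove the line breaks and indentation that were added for layout only.
var parse_url = new RegExp(source.replace(/\s+/g, ''));

// Hypothetical sample input, chosen only for illustration.
var parts = parse_url.exec('http://www.example.com/path');
console.log(parts[2]); // example.com
```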

Answer 2

I suggest breaking a regular expression into parts and assigning each part to a well-named variable, with a comment if necessary. Here is an example, which is meant to demonstrate the principle rather than to validate URLs correctly, since a URL-matching regex is hard to write (https://mathiasbynens.be/demo/url-regex):

var protocol = '(?:https?|ftp)'; // Protocol can be "http", "https" or "ftp"
var domain = '([A-Za-z0-9.]+)'; // Alphanumeric characters separated by periods
var path = '(?:[A-Za-z0-9./]+)'; // Alphanumeric characters, . or /
var regexp = new RegExp(protocol + '://' + domain + '/' + path);

Now the regular expression is broken into smaller, more easily understood mini-expressions, and the overall expression is a lot easier to read.
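To tie this back to the question, the composed expression can still live in an object literal. The sketch below assumes the named parts are declared before the object; the object name `patterns` and the sample URL are hypothetical.

```javascript
// Named building blocks, declared ahead of the object literal.
var protocol = '(?:https?|ftp)';  // "http", "https" or "ftp"
var domain = '([A-Za-z0-9.]+)';   // captured: letters, digits, periods
var path = '(?:[A-Za-z0-9./]+)';  // letters, digits, "." or "/"

// The object literal only assembles the parts.
var patterns = {
    url: new RegExp(protocol + '://' + domain + '/' + path)
};

// Hypothetical sample input, chosen only for illustration.
var match = patterns.url.exec('https://example.com/docs/index.html');
console.log(match[1]); // example.com
```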
