JavaScript regex parsing array-syntax strings almost working

So I'm parsing strings (from a URL) that have an array-like syntax, such as:

variable[foo]
variable[foo][bar]

I need EACH of the indexes (in square brackets) to be its own capturing group, and I need it to work with one OR MORE indexes. My regex ALMOST works, but it only captures the FINAL index, not the preceding ones, so it works perfectly only with a single index.

Here you can see my best attempt; when you hover over the second example, you'll see that group_4 becomes captured group #2 and the rest are lost. I need the captured groups to match the example names.
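For reference, here is a minimal reproduction of the behavior described above. The input string is only an illustrative example, and `repeatedMatch` is a name made up for this sketch. Because the capturing group sits inside a repeated `(?:...)+`, JavaScript keeps only the value from its last iteration.

```javascript
// Same pattern as arrayRegex in the function below, without the flags.
const arrayRegexDemo = /([\w]+)(?:(?:\[|%5B)([\w]+)(?:]|%5D))+/;

// Illustrative input with three bracketed indexes.
const repeatedMatch = arrayRegexDemo.exec('filters[group_2][group_3][group_4]');

console.log(repeatedMatch[1]); // the variable name
console.log(repeatedMatch[2]); // only the last index; earlier ones are overwritten
```

The match array has just three entries (full match, name, last index), so `group_2` and `group_3` are not available anywhere in the result.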

Just for good measure, here you can see my whole solution for parsing the regex results into actual JavaScript objects.

getUrlParams: function() {
        let query = decodeURIComponent(window.location.search);

        let paramRegex = /[&?]([\w[\]\-%]+)=([\w[\]\-%/,\s]+)(?=&|$)/igm;
        let arrayRegex = /([\w]+)(?:(?:\[|%5B)([\w]+)(?:]|%5D))+/igm;

        let params = {};

        let match = paramRegex.exec(query);
        while (match !== null) {
            if (match && match[1]) {

                let array = arrayRegex.exec(match[1]);
                while(array !== null) {
                    if (array && array[1] && array[2]) {
                        console.log("ARRAY: ", array);
                        let deepParam = {};
                        deepParam[array[2]] = match[2];
                        if (array[1] in params) {
                            $.extend(params[array[1]], deepParam);
                        } else {
                            params[array[1]] = deepParam;
                        }
                    } else {
                        params[match[1]] = match[2];
                    }

                    array = arrayRegex.exec(match[1]);
                }
            }
            match = paramRegex.exec(query);
        }
        return params;
    },

This code works great with only one index, but once the regex captures multiple indexes, this code will have to handle that too.

Any help is much appreciated.

UPDATE:

Here is my final function solution, based on bowheart's very elegant code.

getUrlParams: function() {
    let query = decodeURIComponent(window.location.search);
    let paramRegex = /[&?]([\w[\]\-%]+)=([\w[\]\-%/,\s]+)(?=&|$)/igm;

    let params = {};

    let match = paramRegex.exec(query);
    while (match !== null) {
        if (match && match[1] && match[2]) {
            let key = match[1];
            let val = match[2];
            let arrayKeys = key.split(/\[|]/g).filter(node => node);
            populateObject(params, arrayKeys, val);

        }
        match = paramRegex.exec(query);
    }

    return params;

    function populateObject(obj, keys, val) {
        if (keys.length === 1) return obj[keys[0]] = (isNaN(+val) ? val : +val);
        let nextKey = keys.shift();
        if (!obj[nextKey]) obj[nextKey] = isNaN(+keys[0]) ? {} : [];

        populateObject(obj[nextKey], keys, val);
    }
},
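To try the function above outside a browser, here is a sketch of a standalone variant. `parseQuery` is a hypothetical name, and taking the query string as an argument instead of reading `window.location.search` is an assumption made for illustration. The regex and the recursion are otherwise the same.

```javascript
// Hypothetical standalone variant: the query string is passed in explicitly.
function parseQuery(search) {
  const query = decodeURIComponent(search);
  const paramRegex = /[&?]([\w[\]\-%]+)=([\w[\]\-%/,\s]+)(?=&|$)/g;
  const params = {};

  let match;
  while ((match = paramRegex.exec(query)) !== null) {
    // "a[b][c]" -> ["a", "b", "c"]
    const keys = match[1].split(/\[|]/g).filter(node => node);
    populateObject(params, keys, match[2]);
  }
  return params;
}

// Walks/creates the nested path and stores the value at the last key.
// Numeric-looking values are converted to numbers, as in the function above.
function populateObject(obj, keys, val) {
  if (keys.length === 1) {
    obj[keys[0]] = isNaN(+val) ? val : +val;
    return;
  }
  const nextKey = keys.shift();
  if (!obj[nextKey]) obj[nextKey] = isNaN(+keys[0]) ? {} : [];
  populateObject(obj[nextKey], keys, val);
}

const parsedExample = parseQuery(
  '?filters[name]=sa&filters[group_2][group_3][group_4]=4&order_bys[0][field]=name'
);
console.log(JSON.stringify(parsedExample, null, 2));
```

In this example `filters.group_2.group_3.group_4` ends up as the number 4, and `order_bys` becomes an array because its first index is numeric.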

Try this regex:

(?:[\?|\&]([\w]+))|((?:\[|%5B)(\w+)(?:]|%5D))

It captures each group value as an independent match.
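A minimal sketch of how that regex could be used, assuming the global flag is added and the matches are collected with `String.prototype.matchAll`. The query string and the `tokens` array are illustrative only. The parameter name lands in group 1 and each bracketed index in group 3.

```javascript
// The regex from above, with the g flag so every match is returned.
const tokenRegex = /(?:[\?|\&]([\w]+))|((?:\[|%5B)(\w+)(?:]|%5D))/g;

// Illustrative query string.
const sampleQuery = '?filters[group_2][group_3][group_4]=4';

const tokens = [];
for (const m of sampleQuery.matchAll(tokenRegex)) {
  // Group 1 is set for the parameter name, group 3 for a bracketed index.
  tokens.push(m[1] !== undefined ? m[1] : m[3]);
}

console.log(tokens);
```

Note that this pattern does not capture the value after `=`, so the values still have to be extracted separately.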

What on earth gave you the idea to accomplish all this with two massive regular expressions? Just... don't do that. You'll probably live longer. You will need regex to some degree, but always keep it as short as possible.

Here's a solution, if you're interested. You'll notice it's shorter, much easier to read, and accomplishes all the requirements:

// Recursively populates nested objects/arrays.
function populateObj(obj, keys, val) {
  if (keys.length === 1) return obj[keys[0]] = val

  let nextKey = keys.shift()
  if (!obj[nextKey]) obj[nextKey] = isNaN(+keys[0]) ? {} : []

  populateObj(obj[nextKey], keys, val)
}

let params = {}
let search = '?filters[name]=sa&filters[group_2][group_3][group_4]=4&order_bys[0][field]=name&order_bys=desc'

search.slice(1).split('&').forEach(pair => {
  let [key, val] = pair.split('=')
  key = key.split(/\[|]/g).filter(node => node)

  populateObj(params, key, val)
})

// Just for display:
document.body.innerHTML = JSON.stringify(params, null, ' &nbsp;').replace(/\n/g, '<br>')

The basic algorithm is:

  • Split the GET params on '&', then split each param into a key-val pair on '='.

  • Use a regex to split each key on its square brackets to get all nodes for nested arrays/objects.

  • Recursively traverse an object, creating child objects/arrays when necessary, and assign the given value to the last node.

    • Create an array if the next key is numeric. Otherwise, create an object.

(Note from your regexr snippet that the order_bys[0][field]=name and order_bys=desc params are incompatible: one indicates that order_bys is a zero-indexed array and the other that it's a string. Not sure where you got that data...)
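To make that incompatibility concrete, here is a small sketch that feeds the two conflicting params through the same recursive helper, in the order they appear in the query string. The `conflicting` and `wasArray` names are made up for this illustration, and the helper is restated with explicit returns so the block runs on its own.

```javascript
// Same logic as populateObj above, restated so this block is self-contained.
function populateObj(obj, keys, val) {
  if (keys.length === 1) {
    obj[keys[0]] = val;
    return;
  }
  const nextKey = keys.shift();
  // Numeric next key -> array, otherwise -> object.
  if (!obj[nextKey]) obj[nextKey] = isNaN(+keys[0]) ? {} : [];
  populateObj(obj[nextKey], keys, val);
}

const conflicting = {};

// order_bys[0][field]=name creates an array holding one object.
populateObj(conflicting, ['order_bys', '0', 'field'], 'name');
const wasArray = Array.isArray(conflicting.order_bys);

// order_bys=desc then replaces that array with a plain string.
populateObj(conflicting, ['order_bys'], 'desc');

console.log(wasArray, conflicting.order_bys);
```

With the params in this order, the later plain-string assignment silently discards the nested structure built by the earlier one.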

Split on square brackets and filter out empty strings:

"variable[foo][bar]".split(/\]|\[/).filter(s => !!s)
> [ "variable", "foo", "bar" ]
