JavaScript Regular Expression to Split an Ampersand Delimited String

I've been at this for hours, and I'm hitting a dead end. I've read up on regular expressions all over the place, but I'm still having trouble matching anything more complex than basic patterns.

So, my problem is this:

I need to split an "&"-delimited string into a list of objects, but I also need to account for values that contain an ampersand.

Please let me know if you can provide any help.

var subjectA = 'myTestKey=this is my test data & such&myOtherKey=this is the other value';

Update:

Alright, to begin with, thanks for the awesome, thoughtful responses. To give a little background on why I'm doing this: I'm creating a cookie utility in JavaScript that is a little more intelligent and supports keys à la ASP.

With that being said, I'm finding that the following RegExp /([^&=\s]+)=(([^&]*)(&[^&=\s]*)*)(&|$)/g does 99% of what I need it to. I changed the RegExp suggested by the contributors below to also ignore empty spaces. This allowed me to turn the string above into the following collection:

[
    [myTestKey, this is my test data & such],
    [myOtherKey, this is the other value]
]

It even works in some more extreme examples, allowing me to turn a string like:

var subjectB = 'thisstuff===myv=alue me==& other things=&thatstuff=my other value too';

Into:

[
    [thisstuff, ==myv=alue me==& other things=],
    [thatstuff, my other value too]
]

However, when you take a string like:

var subjectC = 'me===regexs are hard for &me&you=&you=nah, not really you\'re just a n00b';

Everything gets out of whack again. I understand why this is happening as a result of the regular expression above (kudos for a very awesome explanation), but I'm (obviously) not comfortable enough with regular expressions to figure out a workaround.

As far as importance goes, I need this cookie utility to be able to read and write cookies that can be understood by ASP and ASP.NET, and vice versa. From playing with the example above, I'm thinking that we've taken it as far as we can, but if I'm wrong, any additional input would be greatly appreciated.

tl;dr - Almost there, but is it possible to account for outliers like subjectC?

var subjectC = 'me===regexs are hard for &me&you=&you=nah, not really you\'re just a n00b';

Actual output:

[
    [me, ==regexs are hard for &me],
    [you, ],
    [you, nah, not really you\'re just a n00b]
]

Versus expected output:

[
    [me, ==regexs are hard for &me&you=],
    [you, nah, not really you\'re just a n00b]
]

Thanks again for all of your help. Also, I'm actually getting better with RegExp... Crazy.

If your keys cannot contain ampersands, then it's possible:

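// The input string, here subjectA from the question
var subject = 'myTestKey=this is my test data & such&myOtherKey=this is the other value';
var myregexp = /([^&=]+)=(.*?)(?=&[^&=]+=|$)/g;
var match = myregexp.exec(subject);
while (match != null) {
    var key = match[1];   // text before the first "="
    var value = match[2]; // everything up to the next "&key=" or the end of the string
    // Do something with key and value
    match = myregexp.exec(subject);
}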

Explanation:

(        # Match and capture in group number 1:
 [^&=]+  # One or more characters except ampersands or equals signs
)        # End of group 1
=        # Match an equals sign
(        # Match and capture in group number 2:
 .*?     # Any number of characters (as few as possible)
)        # End of group 2
(?=      # Assert that the following can be matched here:
 &       # Either an ampersand,
 [^&=]+  # followed by a key (as above),
 =       # followed by an equals sign
|        # or
 $       # the end of the string.
)        # End of lookahead.

This may not be the most efficient way to do this (because of the lookahead assertion that needs to be checked several times during each match), but it's rather straightforward.
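As a rough sketch, the loop above could be packaged into a function that collects the pairs into an array. The function name parsePairs is made up for illustration and is not part of the answer; the regular expression is the one shown above.

```javascript
// Hypothetical helper: collects [key, value] pairs using the lookahead regex above.
function parsePairs(subject) {
    // A fresh regex per call, so lastIndex from a previous call cannot leak in
    var pairRegex = /([^&=]+)=(.*?)(?=&[^&=]+=|$)/g;
    var pairs = [];
    var match;
    while ((match = pairRegex.exec(subject)) !== null) {
        pairs.push([match[1], match[2]]);
    }
    return pairs;
}

var subjectA = 'myTestKey=this is my test data & such&myOtherKey=this is the other value';
console.log(parsePairs(subjectA));
```

Note that a value containing "&something=" is still read as the start of a new pair, because nothing in the string distinguishes that case from a real delimiter.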

I need to split an "&"-delimited string into a list of objects, but I also need to account for values that contain an ampersand.

You can't.

Any data format that allows a character to appear both as a special character and as data needs a rule (usually a different way to express the character as data) to allow the two to be differentiated.

  • HTML has & and &amp;
  • URIs have & and %26
  • CSV has " and ""
  • Most programming languages have " and \"

Your string doesn't have any rules to determine if an & is a delimiter or an ampersand, so you can't write code that can tell the difference.
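To illustrate what such a rule could look like, here is a minimal sketch that borrows the URI convention from the list above: keys and values are percent-encoded when written, so a literal & becomes %26 and can no longer be mistaken for a delimiter. The function names serializePairs and parseEscapedPairs are hypothetical, and the sketch assumes you control both the code that writes the string and the code that reads it; whether another consumer (such as ASP) decodes this format is not covered here.

```javascript
// Hypothetical writer: escapes "&" and "=" (among others) inside keys and values.
function serializePairs(pairs) {
    return pairs.map(function (pair) {
        return encodeURIComponent(pair[0]) + '=' + encodeURIComponent(pair[1]);
    }).join('&');
}

// Hypothetical reader: every remaining "&" is a delimiter, and the first "="
// in each part separates key from value.
function parseEscapedPairs(text) {
    if (text === '') {
        return [];
    }
    return text.split('&').map(function (part) {
        var eq = part.indexOf('=');
        var key = eq === -1 ? part : part.slice(0, eq);
        var value = eq === -1 ? '' : part.slice(eq + 1);
        return [decodeURIComponent(key), decodeURIComponent(value)];
    });
}

var encoded = serializePairs([
    ['myTestKey', 'this is my test data & such'],
    ['myOtherKey', 'this is the other value']
]);
console.log(encoded);
console.log(parseEscapedPairs(encoded));
```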

True, rules to differentiate are recommended, and true, a RegExp pattern may fail if a key contains the ampersand (or the equals!) symbol, but it can be done with plain JavaScript. You just have to think in terms of key-value pairs, and live with the fact that there might not be a RegExp pattern that solves the problem: you will have to split the string into an array, loop through the elements, and merge them if necessary:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
    <head>
        <style id="styleTag" type="text/css">
        </style>
        <script type="text/javascript">
        window.onload = function()
        {
            // test data
            var s = "myTestKey=this is my test data & such&myOtherKey=this is the other value&aThirdKey=Hello=Hi&How are you&FourthKey=that's it!";

            // the split is on the ampersand symbol!
            var a = s.split(/&/);

            // loop through &-separated values; we skip the 1st element
            // because we may need to address the previous (i-1) element
            // in our loop (you are REALLY out of luck if a[0] is not a
            // key=value pair!)
            for (var i = 1; i < a.length; i++)
            {
                // the absence of the equal symbol indicates that this element is
                // part of the value of the previous key=value pair, so merge them
                if (a[i].indexOf('=') == -1)
                {
                    a.splice(i - 1, 2, a[i - 1] + '&' + a[i]);
                    // the array is now one element shorter, so step back to
                    // avoid skipping the element that moved into position i
                    i--;
                }
            }

            document.getElementById('Data').innerHTML = s;
            document.getElementById('Result').innerHTML = a.join('<br/>');
        }
        </script>
    </head>
    <body>
        <h1>Hello, world.</h1>
        <p>Test string:</p>
        <p id=Data></p>
        <p>Split/Splice Result:</p>
        <p id=Result></p>
    </body>
</html>

The output:

Hello, world.

Test string:

myTestKey=this is my test data & such&myOtherKey=this is the other value&aThirdKey=Hello=Hi&How are you&FourthKey=that's it!

Split/Splice Result:

myTestKey=this is my test data & such
myOtherKey=this is the other value
aThirdKey=Hello=Hi&How are you
FourthKey=that's it!
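The same split-and-merge idea can be sketched without the surrounding HTML page. The function name splitAndMerge is an assumption made for illustration. Instead of splicing the array in place, this variant builds a second array, which also handles several consecutive fragments that lack an equals sign.

```javascript
// Hypothetical helper: splits on "&" and glues any fragment without "="
// back onto the preceding key=value entry.
function splitAndMerge(s) {
    var parts = s.split('&');
    var merged = [];
    for (var i = 0; i < parts.length; i++) {
        if (parts[i].indexOf('=') === -1 && merged.length > 0) {
            // No "=" here, so this fragment belongs to the previous value
            merged[merged.length - 1] += '&' + parts[i];
        } else {
            merged.push(parts[i]);
        }
    }
    return merged;
}

var s = "myTestKey=this is my test data & such&myOtherKey=this is the other value&aThirdKey=Hello=Hi&How are you&FourthKey=that's it!";
console.log(splitAndMerge(s).join('\n'));
```

As with the page above, a fragment that contains an equals sign is always treated as a new pair, so a value such as "x&y=z" is still split in two.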

"myTestKey=this is my test data & such&myOtherKey=this is the other value".split(/&?([a-z]+)=/gi)

This returns:

["", "myTestKey", "this is my test data & such", "myOtherKey", "this is the other value"]

But if the value "this is my test data & such" also contained an = sign, as in "this is my test data &such= something else", you're out of luck.
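Because the split pattern has a capturing group, the keys end up interleaved with the values in one flat array. A small sketch of pairing them up follows; the function name pairsFromSplit is hypothetical, and it keeps the assumption of the pattern above that keys consist of letters only.

```javascript
// Hypothetical helper: turns the flat split result into [key, value] pairs.
function pairsFromSplit(subject) {
    var parts = subject.split(/&?([a-z]+)=/gi);
    var pairs = [];
    // parts[0] is whatever precedes the first key; keys then sit at odd indexes
    for (var i = 1; i + 1 < parts.length; i += 2) {
        pairs.push([parts[i], parts[i + 1]]);
    }
    return pairs;
}

console.log(pairsFromSplit('myTestKey=this is my test data & such&myOtherKey=this is the other value'));
```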

I suggest you use

.split(/(?:=|&(?=[^&]*=))/);

Check this demo.
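For a quick check without the demo, the sketch below runs that pattern on the first test string from the question and on a short made-up string whose value contains an equals sign. The variable names are assumptions for illustration. The pattern splits on every = as well as on each & that is followed by an = before the next &, so the second case no longer lines up as key/value pairs.

```javascript
var splitter = /(?:=|&(?=[^&]*=))/;

// Keys and values alternate when no value contains "="
var flat = 'myTestKey=this is my test data & such&myOtherKey=this is the other value'.split(splitter);
console.log(flat);

// Made-up input: the value "b=c" contains "=", so it is split as well
var uneven = 'a=b=c&d=e'.split(splitter);
console.log(uneven);
```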
