Regex - ignore certain characters in quotes

I tried searching for an answer to this, but I couldn't find anything helpful for this situation. It's possible that I'm not searching for the correct terms.

I'm having trouble with this regex. Consider this string:

$str = "(1, 2, 'test (foo) bar'), (3, 4, '(hello,world)')";

I want to end up with a multidimensional array, like this:

$arr = array(
   array(1, 2, 'test (foo) bar'),
   array(3, 4, '(hello,world)')
);

I figure I could run a regex to split it into separate strings like "(1, 2, 'test (foo) bar')" and "(3, 4, '(hello,world)')", and then run a regex on each of those to split by comma. The problem, as you can see, is that the data has parentheses and commas inside various strings, and I'd like to ignore those.

So far I have this, which does the first part the way I wanted, except that it breaks if there are parentheses in the data.

preg_match_all('/\((.*?)\),?/', $str, $matches);

It gives me this:

Array
(
    [0] => Array
        (
            [0] => (1, 2, 'test (foo)
            [1] => (3, 4, '(hello,world)
        )

    [1] => Array
        (
            [0] => 1, 2, 'test (foo
            [1] => 3, 4, '(hello,world
        )

)

It truncates the data, naturally. What can I do to ignore the parentheses that are in quotes? If I can ignore them, then in the next step, when I split each of these matches, I'll be able to ignore the commas too.

Thanks!

Try this pattern:

$pattern = '/((?:.*?),(?:.*?),(?:.*?)),(.*)/';

This produces the following output:

Array
(
    [0] => Array
        (
            [0] => (1, 2, 'test (foo) bar'), (3, 4, '(hello,world)')
        )

    [1] => Array
        (
            [0] => (1, 2, 'test (foo) bar')
        )

    [2] => Array
        (
            [0] =>  (3, 4, '(hello,world)')
        )

)
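
For reference, here is a minimal sketch of how that pattern might be applied. It assumes the input always holds exactly two groups and that the first group has no commas inside its quoted string, since the lazy `.*?` segments only count commas and know nothing about quotes. The variable names `$first` and `$second` are illustrative, not from the original answer.

```php
<?php
$str = "(1, 2, 'test (foo) bar'), (3, 4, '(hello,world)')";
$pattern = '/((?:.*?),(?:.*?),(?:.*?)),(.*)/';

$first = $second = null;

// preg_match returns 1 when the pattern matches; groups 1 and 2 hold the two tuples.
if (preg_match($pattern, $str, $m) === 1) {
    $first  = trim($m[1]);  // text up to the third comma
    $second = trim($m[2]);  // everything after it (leading space trimmed)
}
```

With three or more groups, or with a comma inside the first quoted string, the split point would land in the wrong place, so this is only suitable when the shape of the input is fixed.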

In general, you cannot do that with regexes. But in this case you can try this expression:

\(([^']*?'.*?')\),?
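
A related idea, sketched below, is to let the pattern treat each quoted string as a single unit, so that parentheses and commas inside quotes never end a tuple or a field. This is an illustrative variant rather than the expression above: the helper name `parseTuples` is hypothetical, and it assumes every field is either an integer or a single-quoted string with no escaped quotes inside it.

```php
<?php
// Hypothetical helper: turns "(1, 2, 'a'), (3, 4, 'b')" into nested arrays.
// Assumption: fields are integers or single-quoted strings without escaped quotes.
function parseTuples(string $str): array
{
    $rows = [];

    // A tuple is "(", then any run of quoted strings or characters that are
    // not parentheses or quotes, then ")". Quoted strings are consumed whole.
    preg_match_all("/\\(((?:'[^']*'|[^()'])*)\\)/", $str, $tuples);

    foreach ($tuples[1] as $body) {
        // A field is either a quoted string (group 1) or an integer (group 2).
        preg_match_all("/'([^']*)'|(-?\\d+)/", $body, $fields, PREG_SET_ORDER);

        $row = [];
        foreach ($fields as $f) {
            // Group 2 is absent or empty when the quoted alternative matched.
            $row[] = ($f[2] ?? '') !== '' ? (int) $f[2] : $f[1];
        }
        $rows[] = $row;
    }

    return $rows;
}

$arr = parseTuples("(1, 2, 'test (foo) bar'), (3, 4, '(hello,world)')");
```

If the data can contain escaped quotes or nested tuples, a small hand-written scanner or a real parser is likely a safer choice than extending the pattern.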

([0-9]+), ('([A-Za-z0-9(), ]+)')?

This appears to do what you want.

$matches Array:
(
[0] => Array
    (
        [0] => 1, 
        [1] => 2, 'test (foo) bar'
        [2] => 3, 
        [3] => 4, '(hello,world)'
    )

[1] => Array
    (
        [0] => 1
        [1] => 2
        [2] => 3
        [3] => 4
    )

[2] => Array
    (
        [0] => 
        [1] => 'test (foo) bar'
        [2] => 
        [3] => '(hello,world)'
    )

[3] => Array
    (
        [0] => 
        [1] => test (foo) bar
        [2] => 
        [3] => (hello,world)
    )
)

Is this closer?
