Regex negative lookbehind in Python

Question

I'm trying to split a string on commas that are not inside brackets (i.e. the string contains items separated by commas, but it also contains commas within brackets that I don't want to split on). Like so:

A='[1, "A"], [2, "B"], [3, "C"], [4, "D"], [5, "E"], [6, "F"], [7, "G"], [8, "H"], [9, "I"], [10, "J"], [100, "JJ"]'

Which should result in:

['[1, "A"]', ' [2, "B"]', ' [3, "C"]', ' [4, "D"]', ' [5, "E"]', ' [6, "F"]', ' [7, "G"]', ' [8, "H"]', ' [9, "I"]', ' [10, "J"]', '[100, "JJ"]']

I tried using a negative lookbehind like this:

B=re.split(r'(?<![[][\d]),',A)

However, this does not work when the number within the brackets has more than one digit, as in the case of [10, "J"]. Any help would be greatly appreciated!
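To see the failure concretely, here is a minimal sketch on a shortened sample string (the two-item `sample` is an assumption made for brevity, and the bracket is written as `\[` rather than `[[]`). The lookbehind only inspects the two characters immediately before each comma, so `[1` is protected but `10` is not:

```python
import re

# Shortened sample (assumption: two items are enough to show the problem).
sample = '[1, "A"], [10, "J"]'

# Same idea as the attempt above: refuse to split when the comma is
# directly preceded by "[" plus exactly one digit.
parts = re.split(r'(?<!\[\d),', sample)

# The comma after "[10" is preceded by "10", not "[" + digit,
# so the lookbehind does not block the split there.
print(parts)  # ['[1, "A"]', ' [10', ' "J"]']
```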

Answer 1

It looks like "split on any comma that is preceded by a ]" could work. For good measure I added \s* to eat up the spaces before the next item.

import re

A = '[1, "A"], [2, "B"], [3, "C"], [4, "D"], [5, "E"], [6, "F"], [7, "G"], [8, "H"], [9, "I"], [10, "J"], [100, "JJ"]'

re.split(r"(?<=]),\s*", A)

gives

['[1, "A"]', '[2, "B"]', '[3, "C"]', '[4, "D"]', '[5, "E"]', '[6, "F"]', '[7, "G"]', '[8, "H"]', '[9, "I"]', '[10, "J"]', '[100, "JJ"]']

Answer 2

You can try this:

import re

A='[1, "A"], [2, "B"], [3, "C"], [4, "D"], [5, "E"], [6, "F"], [7, "G"], [8, "H"], [9, "I"], [10, "J"], [100, "JJ"]'
# Raw string so that \] and \s reach the regex engine unchanged
data = re.split(r'(?<=\]),\s', A)
print(data)

Output:

['[1, "A"]', '[2, "B"]', '[3, "C"]', '[4, "D"]', '[5, "E"]', '[6, "F"]', '[7, "G"]', '[8, "H"]', '[9, "I"]', '[10, "J"]', '[100, "JJ"]']
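One difference from the `\s*` variant is worth checking: `\s` requires exactly one whitespace character after the comma. The sketch below uses a hypothetical input with uneven spacing (not from the question) to compare the two patterns:

```python
import re

# Hypothetical input with inconsistent spacing after the separating commas.
uneven = '[1, "A"],[2, "B"],  [3, "C"]'

# Exactly one whitespace character must follow the comma.
one_space = re.split(r'(?<=\]),\s', uneven)

# Zero or more whitespace characters may follow the comma.
any_space = re.split(r'(?<=\]),\s*', uneven)

print(one_space)  # ['[1, "A"],[2, "B"]', ' [3, "C"]']
print(any_space)  # ['[1, "A"]', '[2, "B"]', '[3, "C"]']
```

For the string in the question, where every separator is a comma plus a single space, both patterns give the same result.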

Answer 3

If using split is not a requirement, findall can also be used with a very simple expression:

In [27]: re.findall(r'\[.+?\]', A)
Out[27]:
['[1, "A"]', '[2, "B"]', '[3, "C"]', '[4, "D"]', '[5, "E"]', '[6, "F"]', '[7, "G"]', '[8, "H"]', '[9, "I"]', '[10, "J"]', '[100, "JJ"]']
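The `?` after `.+` matters here. As a small sketch on a shortened sample (an assumption for brevity), the lazy quantifier stops at the first closing bracket, while the greedy form runs to the last one:

```python
import re

# Shortened sample (assumption: three items are enough to show the effect).
sample = '[1, "A"], [2, "B"], [3, "C"]'

# Lazy: each match ends at the nearest "]".
lazy = re.findall(r'\[.+?\]', sample)

# Greedy: a single match stretching from the first "[" to the last "]".
greedy = re.findall(r'\[.+\]', sample)

print(lazy)    # ['[1, "A"]', '[2, "B"]', '[3, "C"]']
print(greedy)  # ['[1, "A"], [2, "B"], [3, "C"]']
```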

Answer 4

Try this regex and get each item from group 1:

(\[\d+,\s*\"\w+\"\])

You can see the result at this link:

https://regex101.com/r/K5XV6F/1
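In Python, this approach might look like the sketch below. It assumes the pattern is written with single backslashes in a raw string and is applied with `re.findall`; the shortened `A` and the name `item_pattern` are illustrative only. Note that this pattern is stricter than the others: it only matches items made of an integer followed by a double-quoted word.

```python
import re

# Shortened version of the string from the question (assumption for brevity).
A = '[1, "A"], [2, "B"], [10, "J"], [100, "JJ"]'

# One capturing group around the whole item, so findall returns group 1.
item_pattern = re.compile(r'(\[\d+,\s*"\w+"\])')

items = item_pattern.findall(A)
print(items)  # ['[1, "A"]', '[2, "B"]', '[10, "J"]', '[100, "JJ"]']

# The same items via explicit access to group 1 of each match.
same_items = [m.group(1) for m in item_pattern.finditer(A)]
```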

Answer 5

Using the newer regex module you can use

\[[^][]*\](*SKIP)(*FAIL) # discard anything in square brackets
|                        # or
,\s*                     # match , and whitespaces, eventually

In Python this looks like

import regex as re

A='[1, "A"], [2, "B"], [3, "C"], [4, "D"], [5, "E"], [6, "F"], [7, "G"], [8, "H"], [9, "I"], [10, "J"], [100, "JJ"]'

rx = re.compile(r'\[[^][]*\](*SKIP)(*FAIL)|,\s*')

print(rx.split(A))
# ['[1, "A"]', '[2, "B"]', '[3, "C"]', '[4, "D"]', '[5, "E"]', '[6, "F"]', '[7, "G"]', '[8, "H"]', '[9, "I"]', '[10, "J"]', '[100, "JJ"]']

See a demo on regex101.com.
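The `regex` module is a third-party package, and `(*SKIP)(*FAIL)` is not supported by the standard library's `re`. If installing it is not an option, a comparable effect can be sketched with a negative lookahead instead. This is a different technique from the one above, and it assumes the brackets are balanced and never nested; the shortened `A` is also an assumption for brevity.

```python
import re  # standard library module, not the third-party regex package

# Shortened version of the string from the question (assumption for brevity).
A = '[1, "A"], [2, "B"], [10, "J"], [100, "JJ"]'

# Split on a comma (plus trailing whitespace) unless the text that follows
# reaches a "]" before any other bracket, which would mean the comma
# sits inside a bracketed item.
parts = re.split(r',\s*(?![^\[\]]*\])', A)

print(parts)  # ['[1, "A"]', '[2, "B"]', '[10, "J"]', '[100, "JJ"]']
```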

Answer 1
1 2017-11-27 17:30:44

Answer 2
1 2017-11-27 17:31:05

Answer 3
0 2017-11-27 17:27:28

Answer 4
0 2017-11-27 17:56:45

Answer 5
0 2017-11-27 18:35:23