Regex lookahead and lookbehind multiple times in Python

Question

I have the input formatted as below (txt1):

txt1 = "[('1','Hello is 1)people 2)animals'), ('People are 1) hello 2) animals'), ('a')]"

I want to extract it in the following format:

[['1','Hello is 1)people 2)animals'],['People are 1) hello 2) animals'],['a']]

So, basically, I want the information within the parentheses, but I haven't been able to get it. I also used lookahead and lookbehind to avoid splitting on the numbered markers '1)' and '2)', which happened earlier when I used a simple statement of re.split('[\(\)\[\]]

I have been trying the findall function first to check what I am getting.

r = re.findall(r'\((?=\').*(?<=\')\)(?=\,)', txt1)

I have been getting:

["('1','Hello is 1)people 2)animals'), ('People are 1) hello 2) animals')"]

It seems like it is ignoring the middle parenthesis. What can I do to get the result that I need?

Thank you.

Note:

For the split function, which I intend to use to get the desired output, I am getting this:

r = re.split(r'\((?=\').*(?<=\')\)(?=\,)', txt1)

['[', ", ('a')]"]
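A likely explanation for the single long match is that `.*` is greedy: it runs as far as it can and only backs up to the last `')` that is followed by a comma, so the parenthesis in the middle is swallowed. The sketch below compares the greedy pattern with a lazy variant. The widened lookahead `(?=[,\]])` is an addition that is not in the original pattern; it is assumed here so that the final group, which is followed by `]` rather than a comma, can also match.

```python
import re

txt1 = "[('1','Hello is 1)people 2)animals'), ('People are 1) hello 2) animals'), ('a')]"

# Greedy: .* extends as far as possible, so the first two groups come back as one match.
greedy = re.findall(r"\((?=').*(?<=')\)(?=,)", txt1)

# Lazy: .*? stops at the first ')' that is preceded by a quote
# and followed by ',' or ']' (the ']' case is an assumption added here).
lazy = re.findall(r"\((?=').*?(?<=')\)(?=[,\]])", txt1)

print(greedy)
print(lazy)
```

The lazy version returns each parenthesized group separately, still including the surrounding parentheses and quotes.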

Answer 1

Why regex?

import ast
[list(x) if isinstance(x, tuple) else [x] for x in ast.literal_eval(txt1)]
# => [['1', 'Hello is 1)people 2)animals'], ['People are 1) hello 2) animals'], ['a']]

If you insist on regular expressions, this should work unless the strings contain escaped quotes:

[re.findall(r"'[^']*'", x) for x in re.findall(r"\(('[^']*'(?:,\s*'[^']*')*)\)", txt1)]
# => [["'1'", "'Hello is 1)people 2)animals'"], ["'People are 1) hello 2) animals'"], ["'a'"]]
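The regex result above keeps the quote characters inside each item. If the items should come back without them, one option is to put a capturing group inside the quotes of the inner pattern. This is a minimal sketch under the same assumption of no escaped quotes; the names `group_pattern` and `item_pattern` are illustrative only.

```python
import re

txt1 = "[('1','Hello is 1)people 2)animals'), ('People are 1) hello 2) animals'), ('a')]"

# Outer pattern: one parenthesized group of comma-separated quoted strings.
group_pattern = re.compile(r"\(('[^']*'(?:,\s*'[^']*')*)\)")
# Inner pattern: the capturing group returns only the text between the quotes.
item_pattern = re.compile(r"'([^']*)'")

result = [item_pattern.findall(group) for group in group_pattern.findall(txt1)]
print(result)
```

Because each pattern has exactly one capturing group, `findall` returns the captured text rather than the whole match, which is what drops the parentheses and the quotes.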

Answer 2

Another solution without having to use regex:

txt1 = "[('1','Hello is 1)people 2)animals'), ('People are 1) hello 2) animals'), ('a')]"
replace_pairs = {
    "('": "'",
    "'), ": '#',
    '[': '',
    ']': '',
    "'": '',
}
for k, v in replace_pairs.items():
    txt1 = txt1.replace(k, v)

txt1 = txt1[:-1].split('#')  # the last char is a parenthesis
print([i.split(',') for i in txt1])

Output:

[['1', 'Hello is 1)people 2)animals'], ['People are 1) hello 2) animals'], ['a']]

Note: This may not work if the input is more complicated than what you've shown here.
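To make that caveat concrete, here is a small sketch that wraps the replacement approach in a helper (the function name `parse_by_replacement` and the `tricky` input are made up for illustration) and runs it on a string that contains a comma. It also relies on dictionaries preserving insertion order, which is guaranteed from Python 3.7 onward.

```python
import ast

def parse_by_replacement(text):
    # Same replacement steps as above; the order of the pairs matters.
    replace_pairs = {"('": "'", "'), ": "#", "[": "", "]": "", "'": ""}
    for old, new in replace_pairs.items():
        text = text.replace(old, new)
    # Drop the trailing ')' and split groups on '#', then items on ','.
    return [part.split(",") for part in text[:-1].split("#")]

# Hypothetical input with a comma inside one of the strings.
tricky = "[('1','Hello, world'), ('a')]"

by_replacement = parse_by_replacement(tricky)
by_literal_eval = [list(x) if isinstance(x, tuple) else [x]
                   for x in ast.literal_eval(tricky)]

print(by_replacement)   # the comma inside the string splits the item in two
print(by_literal_eval)  # the string stays intact
```

A `#` inside the data would cause a similar problem, since it is used as the group separator.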

Answer 3

You could try the pattern \(([^(]+)\)

Explanation:

\( - match ( literally

(...) - capturing group

[^(]+ - match one or more characters other than (

\) - match ) literally

Then use the replacement pattern [\1], which puts the first capturing group (backreference \1) inside square brackets.
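In Python, this substitution could be sketched with `re.sub` as below. The result of the substitution is still a string, so the final `ast.literal_eval` step (borrowed from Answer 1, not part of this answer) is an assumption about how one might turn it into actual nested lists.

```python
import ast
import re

txt1 = "[('1','Hello is 1)people 2)animals'), ('People are 1) hello 2) animals'), ('a')]"

# Replace each "( ... )" group with "[ ... ]". [^(]+ is greedy, so it runs up to the
# next '(' and then backtracks to the last ')' before it, skipping '1)' and '2)'.
bracketed = re.sub(r"\(([^(]+)\)", r"[\1]", txt1)
print(bracketed)

# Assumption: the bracketed string is a valid Python literal, so it can be parsed.
nested = ast.literal_eval(bracketed)
print(nested)
```

This relies on the strings themselves never containing an opening parenthesis; a `(` inside the data would end the character class early and break the grouping.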


3 answers

Answer 1
0 2019-10-24 04:56:27

Answer 2
0 2019-10-24 05:00:16

Answer 3
0 2019-10-24 06:19:07
