Find values using regex (includes brackets)

This is my first time using regex and I have run into some issues that I hope you can help me with. Here is an example of the data:

chartData.push({
date: newDate,
visits: 9710,
color: "#016b92",
description: "9710"
});
var newDate = new Date();
newDate.setFullYear(
2007,
10,
1 );

What I want to retrieve is the date, which is inside the last pair of brackets, and the corresponding description. I have no idea how to do this with a single regex, so I decided to split the task in two.

First part:

I retrieve the value after description: using the following pattern: [\n\r].*description:\s*([^\n\r]*) The result includes the quotes ("9710"), but that is acceptable and no changes are required.
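For reference, here is a minimal Python sketch of this first part, run against the sample data above. The second pattern is an assumed variant (not from the question) that moves the quotes outside the capture group; the variable names are illustrative only.

```python
import re

# Sample text copied from the question.
text = """chartData.push({
date: newDate,
visits: 9710,
color: "#016b92",
description: "9710"
});
var newDate = new Date();
newDate.setFullYear(
2007,
10,
1 );"""

# The question's pattern: capture the rest of the line after "description:".
with_quotes = re.search(r'[\n\r].*description:\s*([^\n\r]*)', text).group(1)

# Assumed variant: match the quotes literally so only the inner value is captured.
without_quotes = re.search(r'description:\s*"([^"]*)"', text).group(1)

print(with_quotes)     # "9710"
print(without_quotes)  # 9710
```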

Second part:

Here it gets tricky. I want to retrieve the values in brackets after the text newDate.setFullYear. Unfortunately, all I have managed so far is to get the values inside every pair of brackets. For that I used the pattern \(([^)]*)\) which picks up all 3 pairs of brackets in the example:

"{
date: newDate,
visits: 9710,
color: "#016b92",
description: "9710"
}",
"()",
"2007,
10,
1 "

What I am missing is an AND operator for regex, which would let me build a pattern that retrieves the data in brackets only after a specific text.

I could, of course, pick every 3rd result, but unfortunately that doesn't work for the whole dataset.

Does anyone know how to solve the second part?

Thanks in advance.

You can use the following expression:

import re
res = re.search(r'description: "([^"]+)".*?newDate\.setFullYear\((.*?)\);', text, re.DOTALL)

This returns a regex match object with two groups, which you can fetch using:

res.groups()

The result is then:

('9710', '\n2007,\n10,\n1 ')

You can of course parse these groups in any way you want. For example:

date = res.groups()[1]
[s.strip() for s in date.split(",")]

==> 
['2007', '10', '1']
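If the goal is an actual date, the split values can be converted further. The sketch below assumes the three numbers are the arguments of JavaScript's Date.setFullYear(year, monthIndex, day), where the month index is zero-based, so 10 stands for November. It also assumes the values are in range (JavaScript would silently roll over out-of-range values, which this sketch does not emulate).

```python
import datetime

# Group 2 as returned by the expression above.
raw = '\n2007,\n10,\n1 '

# int() ignores surrounding whitespace, including the newlines.
year, month_index, day = (int(part) for part in raw.split(','))

# Assumption: month_index follows the JavaScript convention (0 = January),
# so add 1 to get the 1-based month that datetime.date expects.
parsed = datetime.date(year, month_index + 1, day)

print(parsed)  # 2007-11-01
```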

The AND that you are referring to is not really an operator. The pattern matches characters from left to right, so after capturing the value in group 1 you can match everything that comes before the values you want to capture in group 2.
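To illustrate the left-to-right idea on the second part of the question alone: putting the literal text directly in front of the capture group acts as the "AND". This is only a small sketch on the question's sample data; the variable names are illustrative.

```python
import re

# Sample text copied from the question.
text = """chartData.push({
date: newDate,
visits: 9710,
color: "#016b92",
description: "9710"
});
var newDate = new Date();
newDate.setFullYear(
2007,
10,
1 );"""

# The question's pattern: any "(" up to the next ")" matches, three times here.
all_parens = re.findall(r'\(([^)]*)\)', text)

# Literal text first, then the group: only the brackets after
# newDate.setFullYear are captured.
after_call = re.findall(r'newDate\.setFullYear\(([^()]*)\)', text)

print(len(all_parens))  # 3
print(after_call)       # ['\n2007,\n10,\n1 ']
```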

What you could do is repeatedly match all following lines that do not start with newDate.setFullYear(

Then, when you do encounter that text, match it and capture in group 2 all characters except parentheses.

\r?\ndescription: "([^"]+)"(?:\r?\n(?!newDate\.setFullYear\().*)*\r?\nnewDate\.setFullYear\(([^()]+)\);

Regex demo | Python demo

Example code

import re

regex = r"\r?\ndescription: \"([^\"]+)\"(?:\r?\n(?!newDate\.setFullYear\().*)*\r?\nnewDate\.setFullYear\(([^()]+)\);"

test_str = ("chartData.push({\n"
    "date: newDate,\n"
    "visits: 9710,\n"
    "color: \"#016b92\",\n"
    "description: \"9710\"\n"
    "});\n"
    "var newDate = new Date();\n"
    "newDate.setFullYear(\n"
    "2007,\n"
    "10,\n"
    "1 );")

print(re.findall(regex, test_str))

Output

[('9710', '\n2007,\n10,\n1 ')]
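Since the question mentions a whole dataset, here is a sketch of the same pattern applied to more than one record with re.finditer. The second record (8200 and 2007, 11, 1) is made up for illustration, and the sketch assumes every record has the same unindented layout as the sample.

```python
import re

# Same pattern as above, split across lines for readability.
PATTERN = re.compile(
    r'\r?\ndescription: "([^"]+)"'
    r'(?:\r?\n(?!newDate\.setFullYear\().*)*'
    r'\r?\nnewDate\.setFullYear\(([^()]+)\);'
)

# First record is from the question; the second one is hypothetical.
data = """chartData.push({
date: newDate,
visits: 9710,
color: "#016b92",
description: "9710"
});
var newDate = new Date();
newDate.setFullYear(
2007,
10,
1 );
chartData.push({
date: newDate,
visits: 8200,
color: "#016b92",
description: "8200"
});
var newDate = new Date();
newDate.setFullYear(
2007,
11,
1 );"""

# One (description, [year, month, day]) pair per record.
rows = [
    (m.group(1), [int(n) for n in m.group(2).split(',')])
    for m in PATTERN.finditer(data)
]

print(rows)  # [('9710', [2007, 10, 1]), ('8200', [2007, 11, 1])]
```

Because the negative lookahead stops the line-by-line repetition at the first newDate.setFullYear( line, each description is paired with the nearest following call rather than the last one in the text.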

Another option, using the Python regex module from PyPI, is to get group 1 and the separate digits in group 2. It relies on the \G anchor, which the standard re module does not support:

(?:\r?\ndescription: "([^"]+)"(?:\r?\n(?!newDate\.setFullYear\().*)*\r?\nnewDate\.setFullYear\(|\G)\r?\n(\d+),?(?=[^()]*\);)

Regex demo

import re

test = r"""
    chartData.push({
        date: 'newDate',
        visits: 9710,
        color: "#016b92",
        description: "9710"
    })
    var newDate = new Date()
    newDate.setFullYear(
        2007,
        10,
        1);"""

m = re.search(r".*newDate\.setFullYear(\(\n.*\n.*\n.*\));", test, re.DOTALL)


print(m.group(1).rstrip("\n").replace("\n", "").replace(" ", ""))

The result:

(2007,10,1)
