Python regular expression to match a string

Question

I want to parse a string such as:

package: name='jp.tjkapp.droid1lwp' versionCode='2' versionName='1.1'
uses-permission:'android.permission.WRITE_APN_SETTINGS'
uses-permission:'android.permission.RECEIVE_BOOT_COMPLETED'
uses-permission:'android.permission.ACCESS_NETWORK_STATE'

I want to get:

string1: jp.tjkapp.droid1lwp

string2: 1.1

Because there are multiple uses-permission lines, I want to get the permissions as a list containing WRITE_APN_SETTINGS, RECEIVE_BOOT_COMPLETED and ACCESS_NETWORK_STATE.

Could you help me write the Python regular expressions to get the strings I want? Thanks.

Answer 1

Assuming the block you provided is one long string, stored here in a variable called input_string:

import re

# input_string holds the whole block from the question as a single string
name = re.search(r"(?<=name\=\')[\w\.]+?(?=\')", input_string).group(0)
versionName = re.search(r"(?<=versionName\=\')\d+?\.\d+?(?=\')", input_string).group(0)
permissions = re.findall(r'(?<=android\.permission\.)[A-Z_]+(?=\')', input_string)

Explanation:

name

(?<=name\=\') : a lookbehind. It checks the text immediately before the main match, so only strings preceded by name=' are returned. The \ in front of = and ' escapes them so that they are read as literal characters rather than regex syntax; neither is special at this position, so the escapes are harmless but optional. name=' itself is not part of the result; we only know that every result we get is preceded by it.
[\w\.]+? : this is the main string we are searching for. \w matches any alphanumeric character or underscore. \. is an escaped period, so the regex reads it as a literal . and not as the "any character" wildcard that an unescaped period normally stands for (inside [] an unescaped period would already be literal). Putting these in [] builds a character class: any one of the listed characters is accepted, so here any alphanumeric character, _ , or . . The + afterwards means at least one of the previous thing, that is, one or more characters from [\w\.] . Finally, the ? means don't be greedy: the regex takes the shortest run that still lets the rest of the pattern match, whereas a plain + would take as many characters from [\w\.] as it could. In this pattern both forms return the same text, because [\w\.] can never match the closing ' .
(?=\') : a lookahead. It checks the text immediately after the main match, so only strings followed by ' are returned. The \ is again an escape: the regex reads \' as a literal ' , and in a pattern delimited by single quotes, such as the permissions pattern, it also keeps Python from ending the string literal early. This final ' is not returned with our results; we only know that in the original string it followed every result we get.
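As a way to try this end to end, here is a minimal sketch run against the sample input from the question. Note that it swaps the lookarounds for capturing groups, which is an alternative technique and not what the answer above uses. The helper name parse_package_info and the dict keys are made up for illustration.

```python
import re

# Sample input copied from the question.
input_string = """package: name='jp.tjkapp.droid1lwp' versionCode='2' versionName='1.1'
uses-permission:'android.permission.WRITE_APN_SETTINGS'
uses-permission:'android.permission.RECEIVE_BOOT_COMPLETED'
uses-permission:'android.permission.ACCESS_NETWORK_STATE'
"""


def parse_package_info(text):
    """Hypothetical helper: pull name, versionName and permissions out of text."""
    # The parentheses capture only the value between the quotes.
    # \b keeps "name=" from matching in the middle of a longer word.
    name_match = re.search(r"\bname='([^']+)'", text)
    version_match = re.search(r"\bversionName='([^']+)'", text)
    # With a single group, findall returns the captured parts as a list.
    permissions = re.findall(r"android\.permission\.([A-Z_]+)'", text)
    return {
        "name": name_match.group(1) if name_match else None,
        "versionName": version_match.group(1) if version_match else None,
        "permissions": permissions,
    }


info = parse_package_info(input_string)
print(info["name"])         # jp.tjkapp.droid1lwp
print(info["versionName"])  # 1.1
print(info["permissions"])
```

Checking the match object before calling group avoids an AttributeError when a field is missing, which the one-line re.search(...).group(0) form would raise.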

Answer 2

You can do this without regex by reading the file content line by line.

>>> def split_string(s):
...     if s.startswith('package'):
...         # keep the value after each '=' and drop the surrounding quotes
...         return [i.split('=')[1].strip("'") for i in s.split() if "=" in i]
...     elif s.startswith('uses-permission'):
...         # keep the text after the last '.' and drop the closing quote
...         return s.split('.')[-1].strip("'")
...
>>> split_string("package: name='jp.tjkapp.droid1lwp' versionCode='2' versionName='1.1'")
['jp.tjkapp.droid1lwp', '2', '1.1']
>>> split_string("uses-permission:'android.permission.WRITE_APN_SETTINGS'")
'WRITE_APN_SETTINGS'
>>> split_string("uses-permission:'android.permission.RECEIVE_BOOT_COMPLETED'")
'RECEIVE_BOOT_COMPLETED'
>>> split_string("uses-permission:'android.permission.ACCESS_NETWORK_STATE'")
'ACCESS_NETWORK_STATE'
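The function above handles one line at a time, so a caller still has to loop over the input and gather the permissions into a list. A minimal sketch of that loop, still without regex, could look like this. collect_fields and its return shape are hypothetical, and the sketch assumes every value is wrapped in single quotes as in the question.

```python
def collect_fields(lines):
    """Hypothetical helper: gather package fields and permissions from lines."""
    package = {}
    permissions = []
    for raw in lines:
        line = raw.strip()  # drop the trailing newline, if any
        if line.startswith("package"):
            # Each token after "package:" looks like key='value'.
            for token in line.split():
                if "=" in token:
                    key, value = token.split("=", 1)
                    package[key] = value.strip("'")
        elif line.startswith("uses-permission"):
            # Keep only the part after the last dot, minus the closing quote.
            permissions.append(line.split(".")[-1].strip("'"))
    return package, permissions


sample = [
    "package: name='jp.tjkapp.droid1lwp' versionCode='2' versionName='1.1'",
    "uses-permission:'android.permission.WRITE_APN_SETTINGS'",
    "uses-permission:'android.permission.RECEIVE_BOOT_COMPLETED'",
    "uses-permission:'android.permission.ACCESS_NETWORK_STATE'",
]
package, permissions = collect_fields(sample)
print(package["name"], package["versionName"])
print(permissions)
```

Because the function only iterates over its argument, it should accept an open file object as well as a list of strings.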

Answer 3

Here is some example code:

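#!/usr/bin/env python
# Read test.txt line by line and pull two values off the "package" line.
with open("test.txt", "r") as input_file:
    for line in input_file:
        if line.startswith("package"):
            words = line.split()
            # words[1] is name='...', words[3] is versionName='...'
            string1 = words[1].split("=")[1].replace("'", "")
            string2 = words[3].split("=")[1].replace("'", "")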

The test.txt file contains the input data you mentioned earlier.
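Indexing words[1] and words[3] relies on the fields always appearing in the same order. One alternative, offered here as a sketch and not as part of the answer, is to turn the package line into a dict and look values up by key. It uses the standard-library shlex.split, which removes the single quotes while splitting on whitespace; package_fields is a hypothetical helper name.

```python
import shlex


def package_fields(line):
    """Hypothetical helper: map each key='value' pair on a package line to a dict."""
    fields = {}
    # shlex.split tokenizes like a POSIX shell, so name='x' becomes name=x.
    for token in shlex.split(line):
        if "=" in token:
            key, value = token.split("=", 1)
            fields[key] = value
    return fields


line = "package: name='jp.tjkapp.droid1lwp' versionCode='2' versionName='1.1'"
fields = package_fields(line)
string1 = fields["name"]
string2 = fields["versionName"]
print(string1, string2)
```

Looking values up by key means a reordered or extra field on the package line would not change which values end up in string1 and string2.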

3 answers

Answer 1
1 accepted 2012-10-16 06:09:47

Answer 2
0 2012-10-16 06:19:40

Answer 3
0 2012-10-16 21:13:33
