Searching a file with regular expressions in Python

Question

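I have a file with many lines. Each line starts with {"id": followed by an ID number in quotes (e.g. {"id": "106"). I am trying to use a regular expression to search the whole document line by line and print the lines that match any of 5 different id values. To do this, I created a list with the ids, and I want to loop over the file and match only the lines that start with {"id": "(id number from list)". I'm really confused about how to do this. Here is what I have so far: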

f= "bdata.txt"    
statids = ["85", "106", "140", "172" , "337"] 
x= re.findall('{"id":', statids, 'f')
for line in open(file):
            print(x)

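The error I keep getting is: TypeError: unsupported operand type(s) for &: 'str' and 'int'

I need the whole line to match so that I can split it and put it into a class.

Any suggestions? Thanks for your time.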

Answer 1

You can use the regular expression ^\{"id": "(\d+)" to retrieve the id from a line; the value of group 1 gives you the id. You can then check whether that id is present in statids.

Demo:

import re

statids = ["85", "106", "140", "172", "337"]

with open("bdata.txt") as file:
    for line in file:
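        # Raw string so the backslashes reach the regex engine unchanged
        search = re.search(r'^\{"id": "(\d+)"', line)
        if search:
            line_id = search.group(1)  # the digits between the quotes
            if line_id in statids:
                print(line.rstrip())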

For the following sample content in the file:

{"id": "100" hello
{"id": "106" world
{"id": "2" hi
{"id": "85" bye
{"id": "10" ok
{"id": "140" good
{"id": "165" fine
{"id": "172" great
{"id": "337" morning
{"id": "16" evening

The output will be:

{"id": "106" world
{"id": "85" bye
{"id": "140" good
{"id": "172" great
{"id": "337" morning
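
The same idea can be wrapped in a small function so that it can be tried without a file on disk. This is only a sketch: the name filter_lines and the sample lines are made up for illustration, and the pattern is the one from the answer above. The ids are put into a set so that each membership check is cheap.

```python
import re

# Pattern from the answer above; compiled once and reused for every line.
ID_PATTERN = re.compile(r'^\{"id": "(\d+)"')


def filter_lines(lines, wanted_ids):
    """Return the lines whose leading id is one of wanted_ids.

    filter_lines is a hypothetical helper name, not part of any library.
    """
    wanted = set(wanted_ids)
    matched = []
    for line in lines:
        m = ID_PATTERN.search(line)
        # group(1) holds the full run of digits between the quotes
        if m and m.group(1) in wanted:
            matched.append(line.rstrip("\n"))
    return matched


# Made-up sample lines; a file object opened with open() works the same way.
sample = [
    '{"id": "100" hello\n',
    '{"id": "106" world\n',
    '{"id": "85" bye\n',
    '{"id": "1850" no\n',
]
result = filter_lines(sample, ["85", "106", "140", "172", "337"])
print(result)
```

Because (\d+) captures the whole number, a line with id "1850" is not mistaken for id "85".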

Answer 2

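The problem here is the way you are calling re.findall. According to the documentation, you must pass the regular expression as the first argument and the string to match against as the second. The third positional argument is flags, so passing the string 'f' there is what triggers the TypeError. In your case, I think you should do something like this:

import re

# f and statids are the variables defined in the question
pattern = re.compile(r'^\{"id": "(?:' + "|".join(statids) + r')"')
with open(f) as file:
    for line in file:
        match = pattern.search(line)
        if match:
            print(line.rstrip())

In a regular expression, the pipe operator "|" means alternation: joining all the ids with | between them produces a pattern that matches any one of the ids. The surrounding ^\{"id": " and closing quote make sure the id is matched as a whole value at the start of the line. re.findall returns a plain list of strings rather than a match object, so re.search is the better fit here: it returns a match object when the line starts with one of the ids and None otherwise, and only the matching lines are printed.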
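
To see why the alternation needs delimiters around it, here is a small sketch comparing a bare "|"-joined pattern with one bracketed by the quotes. The test line with id "185" is invented for illustration; re.escape is used only as a precaution in case an id ever contains a regex metacharacter.

```python
import re

statids = ["85", "106", "140", "172", "337"]
line = '{"id": "185" almost'  # made-up line whose id is NOT in statids

# Bare alternation: "85" is found inside "185", which is a false positive.
loose = re.compile("|".join(statids))
loose_hits = loose.findall(line)

# Delimited alternation: the id must be the entire quoted value.
strict = re.compile(
    r'^\{"id": "(' + "|".join(map(re.escape, statids)) + r')"'
)
strict_hits = strict.findall(line)
good_hits = strict.findall('{"id": "85" bye')

# findall returns a list of strings (the captured group), not a match object.
print(loose_hits, strict_hits, good_hits)
```

Note that findall gives back a list, so calling .group() on its result would fail; use re.search or re.match when a match object is needed.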

Solution 1
0 2021-10-19 21:08:33

Solution 2
0 2021-10-19 21:17:39
