String.split ignore content inside square brackets with regex

I have a chat log that looks like this:

12-09-18 00:31:40   @966 [playerwithoutspaces] to TEAM: Hello all
12-09-18 00:32:11   @966 [playerswith[inname] to ALL:   Helloall
12-09-18 00:30:15   @966 [player name with spaces] to ALL:  Hello all]

I'm trying to get the date, time, id, name, to, chat and content fields with re.split("""[\s\t](?![^[]*\])""", line, 6), but it doesn't quite work. The problem is that if the content contains [ or ], the line is not split properly.

So the result is:

['12-09-18', '00:30:15', '@966', '[player name with spaces] to ALL:\tHello all]', '']

When it should be:

['12-09-18', '00:30:15', '@966', '[player name with spaces]', 'to', 'ALL:', '\tHello all]']

I tried fiddling around with matching ] only a certain number of times, but that didn't work.

I forgot to mention that the content is preceded by either a tab (\t) or other whitespace (\s), so it varies.

Here is the code as requested:

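import re

# Pattern from the question, as a raw string; it misbehaves when content contains [ or ].
SPLIT_RE = re.compile(r"[\s\t](?![^[]*\])")

file = open("chatlog.txt", encoding="ANSI")
...
async def main():
    for line in file.readlines():
        await handle_chatlog_line(line)

async def handle_chatlog_line(line):
    print(SPLIT_RE.split(line, 6))
    # Raises ValueError when the split yields fewer than 7 items.
    date, time, ingame_client_id, client_name, irrelevant, chat, content = SPLIT_RE.split(line, 6)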

It crashes on the 3rd line of the chat log: the regex is incorrect, so split does not produce enough items to unpack.
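To see why the third line breaks, it may help to run the pattern on two lines side by side. The negative lookahead (?![^[]*\]) rejects any whitespace that is followed, with no [ in between, by a ]. When the message itself ends in ], every whitespace after the opening [ of the name meets that condition, so nothing after the name is split. The sample lines below are assumptions: they use single-character separators and a tab before the content, and they omit the trailing newline.

```python
import re

# Pattern from the question, written as a raw string (\s already covers \t).
PATTERN = re.compile(r"[\s\t](?![^[]*\])")

# Assumed sample lines: single separators, no trailing newline.
ok_line = "12-09-18 00:31:40 @966 [playerwithoutspaces] to TEAM: Hello all"
bad_line = "12-09-18 00:30:15 @966 [player name with spaces] to ALL:\tHello all]"

ok_fields = PATTERN.split(ok_line, 6)
bad_fields = PATTERN.split(bad_line, 6)

print(len(ok_fields), ok_fields)    # 7 fields, unpacking works
print(len(bad_fields), bad_fields)  # 4 fields, unpacking into 7 names fails
```

With only 4 items, the 7-name tuple assignment in handle_chatlog_line raises ValueError.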

I realized that split is not a good fit in this case, so I ended up using re.match:

match = re.match(r"(\d\d-\d\d-\d\d \d\d:\d\d:\d\d)\s+(@\d+) \[(.+)\] to (TEAM|ALL):\s+(.+)", line)
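A minimal sketch of how the re.match approach could be wrapped into a parser follows. It is a variant of the pattern above, not the exact one: it captures date and time separately and allows \s+ between all fields, which is an assumption about the log format. The names LINE_RE and parse_chatlog_line are hypothetical, and the sample lines are reconstructed from the log excerpt with assumed tab positions.

```python
import re

# Variant of the re.match pattern; \s+ between fields is an assumption.
LINE_RE = re.compile(
    r"(\d\d-\d\d-\d\d)\s+(\d\d:\d\d:\d\d)\s+(@\d+)\s+"
    r"\[(.+)\]\s+to\s+(TEAM|ALL):\s+(.+)"
)

def parse_chatlog_line(line):
    """Return (date, time, client_id, name, chat, content), or None if no match."""
    match = LINE_RE.match(line)
    return match.groups() if match else None

# Assumed sample lines based on the chat log excerpt.
sample_lines = [
    "12-09-18 00:31:40\t@966 [playerwithoutspaces] to TEAM: Hello all",
    "12-09-18 00:32:11\t@966 [playerswith[inname] to ALL:\tHelloall",
    "12-09-18 00:30:15\t@966 [player name with spaces] to ALL:\tHello all]",
]

parsed = [parse_chatlog_line(line) for line in sample_lines]
for fields in parsed:
    print(fields)
```

Because (.+) inside the brackets is greedy, the name ends at the last "] to TEAM:" or "] to ALL:" on the line. This handles [ in names and ] in messages, but a message that itself contains such a sequence would shift the boundary.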
