I have a chatlog that is as follows:
12-09-18 00:31:40 @966 [playerwithoutspaces] to TEAM: Hello all
12-09-18 00:32:11 @966 [playerswith[inname] to ALL: Helloall
12-09-18 00:30:15 @966 [player name with spaces] to ALL: Hello all]
I'm trying to get date, time, id, name, to, chat and content with re.split("""[\s\t](?![^[]*\])""", line, 6)
But it doesn't quite work. The problem is that if the content contains [ or ], the line isn't split properly.
So the result is:
['12-09-18', '00:30:15', '@966', '[player name with spaces] to ALL:\tHello all]', '']
When it should be:
['12-09-18', '00:30:15', '@966', '[player name with spaces]', 'to', 'ALL:', '\tHello all]']
I tried fiddling around with matching ] only a certain number of times, but that didn't work.
I forgot to mention that the content is preceded by either a tab (\t) or other whitespace (\s), so the separator varies.
Here is the code as requested:
file = open("chatlog.txt", encoding="ANSI")
...
async def main():
    for line in file.readlines():
        await handle_chatlog_line(line)

async def handle_chatlog_line(line):
    print(re.split("""[\s\t](?![^[]*\])""", line, 6))
    date, time, ingame_client_id, client_name, irrelevant, chat, content = re.split("""[\s\t](?![^[]*\])""", line, 6)
It crashes on the third line of the chat log: the regex is incorrect, so split does not produce enough items to unpack.
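A minimal reproduction of that failure, as a sketch. It assumes the third log line has a tab before the content and ends with a newline (as lines read from a file do). The names PATTERN, good_line and bad_line are only illustrative. The negative lookahead rejects any whitespace that is followed by non-[ characters and then a ], so once the content ends in ], every separator after the player name is skipped.

```python
import re

# The pattern from the question, written as a raw string.
PATTERN = r"[\s\t](?![^[]*\])"

# First sample line: no brackets in the content, so the split works.
good_line = "12-09-18 00:31:40 @966 [playerwithoutspaces] to TEAM: Hello all\n"
# Third sample line: the content ends with "]" (tab separator assumed).
bad_line = "12-09-18 00:30:15 @966 [player name with spaces] to ALL:\tHello all]\n"

good_parts = re.split(PATTERN, good_line, maxsplit=6)
bad_parts = re.split(PATTERN, bad_line, maxsplit=6)

print(len(good_parts), good_parts)
print(len(bad_parts), bad_parts)

try:
    # Seven names on the left, but fewer items on the right.
    date, time, client_id, name, _, chat, content = bad_parts
except ValueError as exc:
    print("unpacking failed:", exc)
```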
I realized that splitting isn't appropriate in this case, so I ended up using re.match:
match = re.match(r"(\d\d-\d\d-\d\d \d\d:\d\d:\d\d)\s+(@\d+) \[(.+)\] to (TEAM|ALL):\s+(.+)", line)
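For illustration, here is a sketch of the same idea run against the three sample lines. The named groups, the CHAT_LINE constant and the parse_chat_line helper are assumptions added for readability, not part of the original code. Unlike the one-liner above, this version captures date and time as separate groups.

```python
import re

# Hypothetical compiled pattern; same structure as the re.match call above,
# but with named groups and date/time captured separately.
CHAT_LINE = re.compile(
    r"(?P<date>\d\d-\d\d-\d\d) (?P<time>\d\d:\d\d:\d\d)\s+"
    r"(?P<client_id>@\d+) \[(?P<name>.+)\] to (?P<chat>TEAM|ALL):\s+"
    r"(?P<content>.+)"
)

def parse_chat_line(line):
    """Return a dict of the captured fields, or None if the line does not match."""
    match = CHAT_LINE.match(line)
    return match.groupdict() if match else None

sample_lines = [
    "12-09-18 00:31:40 @966 [playerwithoutspaces] to TEAM: Hello all\n",
    "12-09-18 00:32:11 @966 [playerswith[inname] to ALL: Helloall\n",
    # A tab before the content is assumed for this line.
    "12-09-18 00:30:15 @966 [player name with spaces] to ALL:\tHello all]\n",
]

for line in sample_lines:
    print(parse_chat_line(line))
```

Because the name group is greedy and must be followed by "] to TEAM:" or "] to ALL:", brackets inside the name or the content do not confuse it on these samples. A message that itself contains text such as "] to ALL:" would still be parsed incorrectly.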