How can I parse Linux terminal colour codes?

Question

I'm trying to display coloured text from a file in a curses Python app. The file contains 24-bit colour escape codes such as: ^[[48;2;255;255;255m^[[38;2;255;255;255mA. I need to loop through the file and generate a list of tuples, each of the form ((background: r, g, b), (foreground: r, g, b), char).

Here is the code I have tried so far. It tries to find every ESC byte (decimal 27, 0x1B) and parse the sequence that follows it. However, it does not seem to work properly, and it is not particularly safe, scalable, or maintainable. How can I parse the colour codes in a file like this? Is there a library? If there is no library, how can I improve this code?

def read_until(char, data, pos):
    """read from a bytes-like object starting at `pos` until the target character is found"""
    start = pos
    while data[pos] != char:
        pos += 1
    return data[start:pos], pos

def makeData(data):
    assert isinstance(data, bytes)

    out = []
    pos = 0
    bg = (0, 0, 0)
    fg = (255, 255, 255)

    while pos < len(data):
        if data[pos] == 27: #ESC (0x1B), start of the escape sequence
            pos += 2 #+2 to skip ESC and the `[` char
            if data[pos:pos + 2] == b"0m": #reset sequence, just skip over it
                pos += 2
                continue
            first_num, pos = read_until(ord(";"), data, pos)
            second_num, pos = read_until(ord(";"), data, pos + 1) #+ 1 for the `;` char

            r, pos = read_until(ord(";"), data, pos + 1)
            g, pos = read_until(ord(";"), data, pos + 1)
            b, pos = read_until(ord("m"), data, pos + 1)
            r = int(r)
            g = int(g)
            b = int(b)
            pos += 1 #skip last `m` char
            if first_num == b"48": #48 sets the background
                bg = (r, g, b)
            elif first_num == b"38": #38 sets the foreground
                fg = (r, g, b)
        else:
            out.append((bg, fg, chr(data[pos]))) #(background, foreground, char) using the current colours
            pos += 1
    return out

with open("file.txt", "rb") as f:
    print("".join([i[2] for i in makeData(f.read())])) #print without colour codes, to test (i[2] is already a str)

Answer 1

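You can use a regular expression to parse the data, provided every character is preceded by a background sequence and then a foreground sequence, as in your example: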

import re
with open("file.txt") as f:
    result = re.findall(r'\x1b\[48;2;(\d+);(\d+);(\d+)m\x1b\[38;2;(\d+);(\d+);(\d+)m(.)', f.read())
    print(result)
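
If the file does not always follow that exact pattern (for example, a colour is set once and then applies to several characters, or a reset sequence appears), a small stateful tokenizer may be easier to maintain. The following is only a sketch under some assumptions: the name parse_colours is hypothetical, the default colours are copied from the question, a reset (code 0) is assumed to restore those defaults, and only the 24-bit forms 38;2;r;g;b (foreground) and 48;2;r;g;b (background) are interpreted. Every other SGR code is skipped.

```python
import re

# One SGR escape sequence (ESC [ params m), or any other single character.
TOKEN = re.compile(r'\x1b\[([0-9;]*)m|(.)', re.DOTALL)

DEFAULT_BG = (0, 0, 0)        # assumption: defaults taken from the question
DEFAULT_FG = (255, 255, 255)


def parse_colours(text):
    """Return a list of ((bg r, g, b), (fg r, g, b), char) tuples."""
    out = []
    bg, fg = DEFAULT_BG, DEFAULT_FG
    for match in TOKEN.finditer(text):
        params, char = match.groups()
        if char is not None:
            # Ordinary character: emit it with the colours currently in effect.
            out.append((bg, fg, char))
            continue
        # An empty parameter counts as 0, so ESC[m behaves like ESC[0m.
        codes = [int(p) if p else 0 for p in params.split(';')]
        i = 0
        while i < len(codes):
            code = codes[i]
            if code in (38, 48) and codes[i + 1:i + 2] == [2] and len(codes) >= i + 5:
                rgb = tuple(codes[i + 2:i + 5])
                if code == 38:
                    fg = rgb  # 38 selects the foreground
                else:
                    bg = rgb  # 48 selects the background
                i += 5
            elif code in (38, 48) and codes[i + 1:i + 2] == [5]:
                i += 3  # 256-colour form: skipped in this sketch
            else:
                if code == 0:
                    bg, fg = DEFAULT_BG, DEFAULT_FG  # assumed reset behaviour
                i += 1  # any other code (bold, underline, ...) is ignored
    return out


sample = "\x1b[48;2;0;0;255m\x1b[38;2;255;255;0mA\x1b[0mB"
print(parse_colours(sample))
# [((0, 0, 255), (255, 255, 0), 'A'), ((0, 0, 0), (255, 255, 255), 'B')]
```

To use it on the file, read the file in text mode as above and pass the contents to parse_colours. Newlines come back as ordinary characters, so filter them out first if curses should not draw them.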
