Get a UTF-16 string length from memory in Python

Question

I need to read a UTF-16 encoded string stored in memory from a Python script for LLDB. According to the LLDB documentation I can use ReadMemory(address, length, error), but that requires knowing the string's length in advance. Without the correct length, Python's decode function fails when it reaches a character it cannot decode (even with the 'ignore' option), and the process stops:

UnicodeEncodeError: 'ascii' codec can't encode character u'\u018e' in position 12: ordinal not in range(128)

Can anyone suggest a way of achieving this, using either plain Python or LLDB's Python API? I don't have the original string's length.

Thanks.

Answer 1

Is the string 0-terminated? If so, you could read 2 bytes at a time until you encounter 0x0000, at which point you know you have the complete string.

If you do this, give yourself a limit (e.g. "I will give up after reading, say, 1 MB of data") in case you run into corrupted memory.
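
A minimal sketch of that loop, under some assumptions: the string is little-endian UTF-16 and ends with a 0x0000 code unit. The names `read_utf16_cstring`, `read_memory` and `fake_read` are hypothetical and only serve the illustration. The reader is passed in as a callable so the example runs without a debugger attached.

```python
def read_utf16_cstring(read_memory, address, max_bytes=1024 * 1024):
    """Read a 0x0000-terminated UTF-16 string, 2 bytes at a time.

    read_memory(address, size) is a hypothetical callable returning bytes.
    Little-endian byte order is an assumption, not something the question states.
    """
    raw = bytearray()
    while len(raw) < max_bytes:
        unit = read_memory(address + len(raw), 2)
        if unit is None or len(unit) < 2:
            raise ValueError("memory read failed at offset %d" % len(raw))
        if bytes(unit) == b"\x00\x00":
            # Terminator found: len(raw) is the string length in bytes.
            return bytes(raw).decode("utf-16-le", errors="replace")
        raw += unit
    # Safety limit reached: probably corrupted memory or a wrong address.
    raise ValueError("no terminator found within %d bytes" % max_bytes)


# Stand-in for debugger memory so the sketch runs without LLDB.
_fake_memory = "h\u018ello".encode("utf-16-le") + b"\x00\x00"


def fake_read(address, size):
    return _fake_memory[address:address + size]


text = read_utf16_cstring(fake_read, 0)
```

Inside an LLDB script, `read_memory` could be a small wrapper around `process.ReadMemory(address, size, error)` that creates an `lldb.SBError` and checks `error.Success()` before returning the bytes. Note also that the error quoted in the question is a UnicodeEncodeError, which Python raises when encoding text (here to ASCII), so it may come from printing the result rather than from decoding it.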

solution1
2 2016-02-19 02:13:39