Python: How to split string but preserve the non-alphanumeric characters

Question

I ran into a problem when dealing with this:

Sample string - \"H\00E6tta\"

*\00E6 is a Unicode code point, and my script is able to understand it even though it is not in the usual form \æ, so please do not worry about that part.

I would expect after split something like:

['', '"H', "00E6tta", '"'] - the first empty element is expected, since there is nothing before the first '\' when I split

I did this:

sub_glyph = glyph.split("\\")

However this is the result I got:

['', 'H', '00E6tta', '']

Any clue? I need the " so that I can convert it to Unicode, but it has gone missing. I am confused: I split on '\\', so why is the " gone? I can't find a useful guide online and need help.

Thanks

Answer 1

Use a raw string (prepending string with r makes it a raw string) and split it:

s = r'\"H\00E6tta\"'

print(s.split('\\'))
# ['', '"H', '00E6tta', '"']

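Note: In a raw string, backslashes are kept as literal characters, so r'\"H\00E6tta\"' is the same string as the ordinary literal '\\"H\\00E6tta\\"' (use repr(s) to see this). Without the r prefix, Python treats \" and \00 as escape sequences, so the string contains no backslash to split on. Keeping the backslashes is what makes the split possible.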
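To make the note above concrete, here is a minimal sketch comparing the raw literal with the same text written as an ordinary literal. The variable names (raw, plain, raw_parts, plain_parts) are only illustrative, and the last step assumes that the first four characters of '00E6tta' are meant as a hexadecimal code point, which the question implies but does not spell out.

```python
# Raw literal: every backslash stays in the string as a real character.
raw = r'\"H\00E6tta\"'

# Ordinary literal: \" becomes a plain quote and \00 becomes a NUL character,
# so no backslash survives in the resulting string.
plain = '\"H\00E6tta\"'

print(repr(raw))    # shows the doubled backslashes used by repr
print(repr(plain))  # shows \x00 where \00 was interpreted as an escape

raw_parts = raw.split('\\')
plain_parts = plain.split('\\')

print(raw_parts)    # four pieces, with the quotes preserved
print(plain_parts)  # a single piece, because there is nothing to split on

# Assumption: the leading four characters of '00E6tta' are a hex code point.
code_point = int(raw_parts[2][:4], 16)
print(chr(code_point) + raw_parts[2][4:])
```

If the string comes from a file or from user input rather than from a literal in the source code, the backslashes are already real characters and no r prefix is needed; escape processing only applies to literals.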

solution1
2 ACCEPTED 2018-08-15 05:27:08
