How can I compare Unicode strings?

I have the following code and I want to test the result for equality:

id_0 = 40
id_1 = 48
id_2 = 49
id_3 = 41
id_4 = 0

conc_value = chr(id_0)+chr(id_1)+chr(id_2)+chr(id_3)+chr(id_4)

if conc_value == '(01)':
    print('Match')
else:
    print('Mismatch')

I always get a mismatch. How can I compare the two values?

Python strings aren't NUL-terminated. Leave chr(id_4) out of the string and the comparison works. Alternatively, change the literal you compare against to '(01)\0', though it would be very unusual to want a NUL "terminator" in a string type that doesn't use one.
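To see why the original comparison fails, it can help to inspect the string directly. This is a minimal sketch that rebuilds the question's value from the same ordinals and prints its repr and length:

```python
# Build the string from the question's ordinals, including the trailing 0.
conc_value = ''.join(map(chr, [40, 48, 49, 41, 0]))

print(repr(conc_value))         # '(01)\x00' -- the NUL is a real character
print(len(conc_value))          # 5, not 4
print(conc_value == '(01)')     # False
print(conc_value == '(01)\0')   # True
```

The NUL counts toward the length and takes part in the comparison like any other character.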

If your data is coming from an external system (we'll imagine a function get_ordinal that returns the next Unicode ordinal from the external system by some means, and raises a custom NoMoreData exception when there's nothing left), then just filter the values as they come in:

ordinals = []  # List to store ordinals in
bad_ordinals = frozenset(map(ord, '\x00ÿ'))  # Make set of invalid ordinals for cheap exclusion test
try:
    while True:
        ordinal = get_ordinal()
        if ordinal not in bad_ordinals:
            ordinals.append(ordinal)  # Only append if not on blocklist
except NoMoreData:
    pass
conc_data = ''.join(map(chr, ordinals))  # Bulk convert to characters then efficiently
                                         # glue together final string
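The snippet above depends on an imagined get_ordinal and NoMoreData. To try the idea end to end, here is a self-contained sketch in which both are stand-ins written purely for illustration (make_get_ordinal and read_filtered are hypothetical helper names, and the blocklist here contains only ordinal 0):

```python
class NoMoreData(Exception):
    """Hypothetical signal that the external source is exhausted."""


def make_get_ordinal(values):
    """Return a stand-in get_ordinal() that serves values, then raises NoMoreData."""
    it = iter(values)

    def get_ordinal():
        try:
            return next(it)
        except StopIteration:
            raise NoMoreData from None

    return get_ordinal


def read_filtered(get_ordinal, bad_ordinals=frozenset({0})):
    """Collect characters until NoMoreData, skipping blocklisted ordinals."""
    chars = []
    try:
        while True:
            ordinal = get_ordinal()
            if ordinal not in bad_ordinals:
                chars.append(chr(ordinal))
    except NoMoreData:
        pass
    return ''.join(chars)


# Feed in the ordinals from the question, including the unwanted 0.
get_ordinal = make_get_ordinal([40, 48, 49, 41, 0])
conc_data = read_filtered(get_ordinal)
print(conc_data == '(01)')  # True
```

Filtering at the point of input means the rest of the program never sees the NUL, so plain == comparisons behave as expected.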

id_0 = 40
id_1 = 48
id_2 = 49
id_3 = 41
id_4 = 0  # NUL ordinal from the question; needed for the line below to run

conc_value = chr(id_0)+chr(id_1)+chr(id_2)+chr(id_3)+chr(id_4)

if conc_value.replace("\0", "") == '(01)':
    print('Match')
else:
    print('Mismatch')

#Match

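This works because Python strings are not null-terminated: the chr(0) is an ordinary character inside the string, so removing it before comparing makes the two values equal.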
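Note that replace("\0", "") removes every NUL, not only a trailing one. If only a terminator at the end should be dropped, str.rstrip is an alternative. A small sketch of the difference, using a made-up value:

```python
value = 'a\0b\0\0'  # illustrative value with an embedded and trailing NULs

stripped_all = value.replace('\0', '')   # removes every NUL
stripped_tail = value.rstrip('\0')       # removes only trailing NULs

print(repr(stripped_all))    # 'ab'
print(repr(stripped_tail))   # 'a\x00b'
```

Which one is appropriate depends on whether an embedded NUL in the incoming data is meaningful.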
