
Trying to find characters that do not resemble Hex in the format '\x0a'

I am parsing a string that contains file magic numbers, but the formatting is inconsistent. Some of the patterns are in hex with the format '\\x0a' (the string holds an escaped backslash, so I apparently need to search for both \\'s), others are the direct ASCII characters, and the rest are somewhere in between.

I was hoping to write a regular expression that finds the characters in a string that are not already hex. I tried the following search for hex values, wrapped in a negative lookahead to invert it.

(?!\\\\x[0-9 a-f]{2})

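This did not work as intended: the lookahead rejects the position of the backslash that starts a hex value, but one character later, at the x, nothing looks like the start of a hex value any more, so it matches there.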

>>> import re
>>> test = "\\x50K\\x03\\x04"
>>> re.search("(?!\\\\x[0-9 a-f]{2})", test)
<re.Match object; span=(1, 1), match=''>
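A negative lookahead is zero-width: it consumes nothing and succeeds at every position where the pattern does not start, including positions inside a hex escape. The sketch below lists every position at which the question's pattern matches; the names `lookahead` and `positions` are illustrative only.

```python
import re

test = "\\x50K\\x03\\x04"  # 13 characters of text: \x50K\x03\x04

# Same regex as in the question, written as a raw string so that
# \\ stands for one literal backslash.
lookahead = re.compile(r"(?!\\x[0-9 a-f]{2})")

# Every match is empty, so only the start offsets are interesting.
positions = [m.start() for m in lookahead.finditer(test)]
print(positions)  # [1, 2, 3, 4, 6, 7, 8, 10, 11, 12, 13]
```

Only offsets 0, 5 and 9 (where an escape begins) are rejected. The lookahead alone therefore cannot tell the K at offset 4 apart from the characters that belong to an escape.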

Short of collecting the positive matches and inverting them myself, I am not sure how to proceed.

Thanks!

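You can strip out the hex values with re.sub(r'\\x[0-9a-f]{2}', '', your_line) and then work with what remains, which is the non-hex characters. Note that in a raw string \\ already matches one literal backslash, which is what the test string above contains.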
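A minimal sketch of this approach, assuming each escape is stored as a single literal backslash, an x, and two hex digits. The helper names `strip_hex_escapes` and `non_hex_chars` are made up for illustration, and accepting upper-case digits (A-F) is an assumption that goes beyond the original pattern.

```python
import re

# A literal backslash, "x", and two hex digits (the four characters of one escape).
HEX_ESCAPE = re.compile(r"\\x[0-9a-fA-F]{2}")

def strip_hex_escapes(line):
    """Return line with every hex escape removed."""
    return HEX_ESCAPE.sub("", line)

def non_hex_chars(line):
    """Return (index, char) pairs for characters outside any hex escape."""
    found = []
    pos = 0
    for m in HEX_ESCAPE.finditer(line):
        # Everything between the previous escape and this one is non-hex.
        found.extend((i, line[i]) for i in range(pos, m.start()))
        pos = m.end()
    # Trailing characters after the last escape.
    found.extend((i, line[i]) for i in range(pos, len(line)))
    return found

test = "\\x50K\\x03\\x04"
print(strip_hex_escapes(test))  # K
print(non_hex_chars(test))      # [(4, 'K')]
```

The second helper is the "collect the positive matches and invert them" idea from the question, useful when the offsets of the non-hex characters matter and not just their values.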

