
Reading bytes from a file in Python and converting them to a string

I have a file containing data like: \xe1\x8a\xa0\xe1\x88\x9b\xe1\x88\xad\xe1\x8a\x9b How do I read this and write it in string form (አማርኛ) to another file? And how do I do the reverse?

[\xe1\x8a\xa0\xe1\x88\x9b\xe1\x88\xad\xe1\x8a\x9b == አማርኛ ]

That is a byte string, so you need to decode it from UTF-8 to get a Unicode string (str).

b'\xe1\x8a\xa0\xe1\x88\x9b\xe1\x88\xad\xe1\x8a\x9b'.decode('utf8')  

result: 'አማርኛ'

And to encode it back to a byte string:

'አማርኛ'.encode() 

result: b'\xe1\x8a\xa0\xe1\x88\x9b\xe1\x88\xad\xe1\x8a\x9b'
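One caveat: the answer above assumes the file holds the raw UTF-8 bytes. If the file instead contains the escape sequences as literal text (a backslash, an x, and two hex digits, repeated), then decode('utf8') alone just returns the same escape text. A minimal sketch for that case, assuming the content is pure ASCII escape text; the variable names are illustrative only:

```python
# Assumption: the file content is the literal ASCII text \xe1\x8a\xa0...
# (48 characters), not the 12 raw bytes those escapes stand for.
escaped = rb'\xe1\x8a\xa0\xe1\x88\x9b\xe1\x88\xad\xe1\x8a\x9b'

# unicode_escape turns each \xNN escape into the code point U+00NN,
# latin-1 maps each U+00NN back to the single byte 0xNN,
# and the resulting raw bytes can then be decoded as UTF-8.
raw = escaped.decode('unicode_escape').encode('latin-1')
text = raw.decode('utf8')

print(len(escaped), len(raw), len(text))
```

This trick is only safe when the input is plain ASCII; if the file mixes escape text with real non-ASCII characters, unicode_escape will likely mangle them.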

Basically, you have a byte string. You can do what you describe with the methods decode() and encode(), respectively. In the example below, I first print the byte string, then decode it as UTF-8 (the default encoding for decode() and encode() in Python 3 when you don't specify one yourself; in Python 2 the default was ASCII).

# "rb" opens the file in binary mode, so read() returns bytes
with open("input.txt", "rb") as f:
    x = f.read()
print(x) # b'\xe1\x8a\xa0\xe1\x88\x9b\xe1\x88\xad\xe1\x8a\x9b'
print(x.decode()) # አማርኛ
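Since the question also asks about writing the result to another file, here is a minimal sketch of the full file-to-file step. The paths are hypothetical and live in a temporary directory so the example is self-contained; substitute real paths as needed.

```python
import os
import tempfile

# Hypothetical file names inside a throwaway directory.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "input.txt")
dst_path = os.path.join(workdir, "output.txt")

# Create a sample input file holding the raw UTF-8 bytes.
with open(src_path, "wb") as f:
    f.write(b'\xe1\x8a\xa0\xe1\x88\x9b\xe1\x88\xad\xe1\x8a\x9b')

# Read the raw bytes and decode them into a str.
with open(src_path, "rb") as f:
    text = f.read().decode("utf-8")

# Write the str to another file. Passing encoding explicitly avoids
# depending on the platform's default text encoding.
with open(dst_path, "w", encoding="utf-8") as f:
    f.write(text)
```

With UTF-8 as the output encoding, output.txt ends up holding the same 12 bytes as the input. The difference is only in how Python represents the data in memory (bytes versus str); a text editor that reads the file as UTF-8 will show አማርኛ.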

For the inverse operation, call encode() on the decoded string to get the byte string back. (Note that I pass the mode "rb" to open(), which, following the wiki, "Opens a file for reading only in binary format.")
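A minimal sketch of the inverse direction, assuming the goal is either the raw bytes or the visible \xNN notation written out as text. The output path is hypothetical.

```python
import os
import tempfile

text = 'አማርኛ'

# Inverse of decode(): str -> bytes.
data = text.encode('utf-8')

# If the visible escape notation is wanted rather than the raw bytes,
# format each byte as \xNN. (repr(data) would leave printable ASCII
# bytes unescaped, so it is not used here.)
escaped = ''.join('\\x{:02x}'.format(b) for b in data)

# Hypothetical output path; the escape text is pure ASCII.
out_path = os.path.join(tempfile.mkdtemp(), "escaped.txt")
with open(out_path, "w", encoding="ascii") as f:
    f.write(escaped)
```

To store the raw bytes instead, open the output file with mode "wb" and write data directly.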


 