Can't concat bytes to str (Converting to Python3)

I'm trying to convert my Python 2 code to Python 3, but I get the following error:

Traceback (most recent call last):
  File "markovtest.py", line 73, in <module>
    get_all_tweets("quit_cryan")
  File "markovtest.py", line 41, in get_all_tweets
    outtweets = [(tweet.text.encode("utf-8") + str(b" ")) for tweet in alltweets]
  File "markovtest.py", line 41, in <listcomp>
    outtweets = [(tweet.text.encode("utf-8") + str(b" ")) for tweet in alltweets]
TypeError: can't concat bytes to str

The problem is in this list comprehension:

outtweets = [(tweet.text.encode("utf-8") + " ") for tweet in alltweets]

I have tried changing encode to decode or removing the encode parameter altogether but I cannot figure it out. Any help would be appreciated.

Python 3 has several different 'string' types, chiefly str (Unicode text) and bytes (raw byte sequences). Details on each type and what it is meant for can be found in the Python documentation.

You are trying to concatenate a bytes object (an immutable sequence of bytes) with a str (a Unicode string). Python 3 does not allow this without an explicit conversion.
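As a minimal sketch of what "explicit conversion" means here: the values below are purely illustrative (the names raw and sep do not come from the question), and the point is only that one side has to be converted so both operands have the same type.

```python
# Illustrative values only; these names are not from the question's code.
raw = "café".encode("utf-8")   # bytes: b'caf\xc3\xa9'
sep = " "                      # str

# raw + sep would raise TypeError, so convert one side explicitly.
as_text = raw.decode("utf-8") + sep     # str + str   -> str
as_bytes = raw + sep.encode("utf-8")    # bytes + bytes -> bytes

print(type(as_text).__name__, type(as_bytes).__name__)
```

Which direction to convert depends on what the rest of the program expects: text processing usually wants str, while sockets and binary files want bytes.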

The problem in your code snippet is that the tweet text, most likely a str, is converted to bytes by the encode method. That works fine, but the error occurs when you try to concatenate the space " " (which is a str) to the resulting bytes object. You can either remove the encode call and do the concatenation with strings (encoding later if needed), or make the space a bytes object by adding a b before the quotes, like this: b" ".
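Applied to the list comprehension from the question, the two options might look roughly like this. The Tweet namedtuple is a hypothetical stand-in for whatever objects alltweets actually holds; only the .text attribute is assumed.

```python
from collections import namedtuple

# Hypothetical stand-in for the tweet objects; only .text is assumed.
Tweet = namedtuple("Tweet", ["text"])
alltweets = [Tweet("hello"), Tweet("wörld")]

# Option 1: drop encode() and keep everything as str.
outtweets_str = [tweet.text + " " for tweet in alltweets]

# Option 2: keep encode() and make the space a bytes literal.
outtweets_bytes = [tweet.text.encode("utf-8") + b" " for tweet in alltweets]

print(outtweets_str)
print(outtweets_bytes)
```

Both lists hold the same text; they differ only in whether each element is a str or a bytes object.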

Let's take a look at your options:

In [1]: type("foo")
Out[1]: str

In [2]: type("foo".encode("utf-8"))
Out[2]: bytes

In [3]: "foo" + " "  # str + str
Out[3]: 'foo '

In [4]: "foo".encode("utf-8") + " "  # bytes + str
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-5c7b745d9739> in <module>()
----> 1 "foo".encode("utf-8") + " "

TypeError: can't concat bytes to str

I guess for your problem the simplest solution would be to make the space a byte string (as below). I hope this helps.

In [5]: "foo".encode("utf-8") + b" "  # bytes + bytes
Out[5]: b'foo '
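The other route mentioned earlier, removing the encode and encoding later, could be sketched as follows. The texts list is illustrative, and whether you need bytes at all depends on what consumes the result; this only shows encoding once, at the end, rather than per item.

```python
# Illustrative input; in the question this would come from tweet.text.
texts = ["naïve", "text"]

# Work with str throughout...
joined = "".join(t + " " for t in texts)

# ...and encode once, only where bytes are actually required.
payload = joined.encode("utf-8")

# Character count and byte count differ because "ï" takes two bytes in UTF-8.
print(len(joined), len(payload))
```

Keeping text as str until the last moment avoids this class of TypeError entirely, since no bytes objects exist in the middle of the program.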
