UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe1 in position 6: invalid continuation byte

This started after I upgraded Linux Mint 19 to 20. The full error:

Traceback (most recent call last):
  File "/home/notification/views.py", line 206, in get
    .select_related("history__definition")
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/django/db/models/query.py", line 653, in first
    for obj in (self if self.ordered else self.order_by('pk'))[:1]:
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/django/db/models/query.py", line 274, in __iter__
    self._fetch_all()
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/django/db/models/query.py", line 1242, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/django/db/models/query.py", line 55, in __iter__
    results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/django/db/models/sql/compiler.py", line 1133, in execute_sql
    cursor.execute(sql, params)
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/debug_toolbar/panels/sql/tracking.py", line 192, in execute
    return self._record(self.cursor.execute, sql, params)
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/debug_toolbar/panels/sql/tracking.py", line 126, in _record
    return method(sql, params)
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/django/db/backends/utils.py", line 99, in execute
    return super().execute(sql, params)
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/sentry_sdk/integrations/django/__init__.py", line 469, in execute
    return real_execute(self, sql, params)
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/django/db/backends/utils.py", line 76, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/django/db/backends/mysql/base.py", line 73, in execute
    return self.cursor.execute(query, args)
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/MySQLdb/cursors.py", line 206, in execute
    res = self._query(query)
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/MySQLdb/cursors.py", line 321, in _query
    self._post_get_result()
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/MySQLdb/cursors.py", line 355, in _post_get_result
    self._rows = self._fetch_row(0)
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/MySQLdb/cursors.py", line 328, in _fetch_row
    return self._result.fetch_row(size, self._fetch_type)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe3 in position 11: invalid continuation byte

The database is MySQL and it's configured with utf8mb4:

mysql> SHOW VARIABLES LIKE 'char%';
Variable_name           |Value                     |
------------------------|--------------------------|
character_set_client    |utf8mb4                   |
character_set_connection|utf8mb4                   |
character_set_database  |utf8mb4                   |
character_set_filesystem|binary                    |
character_set_results   |                          |
character_set_server    |latin1                    |
character_set_system    |utf8                      |
character_sets_dir      |c:\mariadb\share\charsets\|

The row that throws this error is the one below, and looking at the hex, it's OK.

description                                 |hex(description)                                                                          |
--------------------------------------------|------------------------------------------------------------------------------------------|
Necessária para as partidas na 'batalha'|4E6563657373C3A17269612070617261206173207061727469646173206E612027626174616C6861206A6F7927|

á = C3 A1
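
As a quick sanity check of the hex dump above, the bytes can be decoded outside of MySQL. This is only a sketch: the variable names are made up for illustration, and the hex string is copied from the query output above.

```python
# Hex string copied from the hex(description) column above.
hex_from_mysql = (
    "4E6563657373C3A17269612070617261206173207061727469646173"
    "206E612027626174616C6861206A6F7927"
)

raw = bytes.fromhex(hex_from_mysql)

# decode() raises UnicodeDecodeError if the stored bytes are not valid UTF-8.
decoded = raw.decode("utf-8")

print(raw[6:8])   # the two bytes that encode the accented character
print(decoded)
```

If this decodes cleanly, the stored data itself is valid UTF-8, which suggests the bad byte is introduced somewhere between the server and the Python client rather than in the table.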

Something is introducing the byte 0xE1, which is how cp1252 encodes á. I dug deep into the debugger, and the conversion appears to be happening inside the MySQLdb library.
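
To illustrate why a cp1252 byte produces exactly this kind of error, here is a minimal sketch in plain Python with no database involved. The sample word is taken from the row above; the variable names are only for illustration.

```python
word = "Necessária"

as_utf8 = word.encode("utf-8")      # á becomes the two bytes C3 A1
as_cp1252 = word.encode("cp1252")   # á becomes the single byte E1

try:
    # Decoding cp1252 bytes as UTF-8 fails: E1 announces a multi-byte
    # sequence, but the next byte ("r") is not a continuation byte.
    as_cp1252.decode("utf-8")
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

print(as_utf8.hex(), as_cp1252.hex())
print(error_message)
```

This is consistent with the idea that the client receives single-byte (cp1252/latin1) data while trying to decode it as UTF-8.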

More about the environment:

pip3 list | grep -i mysql
mysql-connector-python                8.0.20   
mysql-connector-python-rf             2.2.2    
mysqlclient                           2.0.1    
PyMySQL                               0.9.3

The character_set_% settings that you have seem strange:

| character_set_client     | latin7          | names
| character_set_connection | latin7          | names
| character_set_database   | utf8mb4         | ?
| character_set_filesystem | binary          | hands-off
| character_set_results    | latin7          | names
| character_set_server     | utf8mb4         | ?
| character_set_system     | utf8            | hands-off

I have labeled them in 3 groups:

  • "hands-off" -- filesystem and system should not be modified from the default, else internal things are likely to break.
  • "names" -- SET NAMES latin7 is, for example, how you specify that the clients are using latin7 encoding. The general move is away from the old default latin1 toward the future standard of utf8mb4. (I used latin7 just to make it stand out.)
  • "?" -- It is unclear what impact these two have. I recommend leaving them alone at the values from installation, which are probably utf8mb4 (for both) in recent versions of MySQL/MariaDB.
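
For the "names" group, a Django project using mysqlclient normally declares the client encoding through the connection options rather than issuing SET NAMES by hand. The following is a sketch only: the database name, user, password and host are placeholders, and whether this resolves the error in the question is an assumption, not something confirmed here.

```python
# settings.py (fragment). NAME, USER, PASSWORD and HOST are hypothetical.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "mydb",
        "USER": "myuser",
        "PASSWORD": "change-me",
        "HOST": "127.0.0.1",
        "OPTIONS": {
            # Passed through to the MySQLdb connection; asks the driver to
            # use utf8mb4 for the client, connection and results charsets.
            "charset": "utf8mb4",
        },
    }
}
```

After changing this, running SHOW VARIABLES LIKE 'char%' from the same connection should show whether character_set_client, character_set_connection and character_set_results now agree.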

In 5.7.6, the GLOBAL character_set_database and collation_database system variables were deprecated; the SESSION versions became read-only (deprecation).

From the 8.0.1 changelog:

Important Change: The default character set has changed from latin1 to utf8mb4. These system variables are affected:

  • The default value of the character_set_server and character_set_database system variables has changed from latin1 to utf8mb4.

  • The default value of the collation_server and collation_database system variables has changed from latin1_swedish_ci to utf8mb4_0900_ai_ci.

As a result, the default character set and collation for new objects differ from previously unless an explicit character set and collation are specified. This includes databases and objects within them, such as tables, views, and stored programs. One way to preserve the previous defaults is to start the server with these lines in the my.cnf file:

[mysqld]
character_set_server=latin1
collation_server=latin1_swedish_ci
