
decode/encode error django python

I have a list of keywords:
keywords = [u'encendió', u'polémica']

I am trying to load them to a django model:

class myKeywords(models.Model):
    keyword = models.CharField()

    def __unicode__(self):
        return self.keyword.encode('utf-8')

This is what I am trying:

for k in keywords:
    keyObj, created = myKeywords.objects.get_or_create(keyword=k.decode('utf-8'))
    print created, keyObj

However, I keep getting the django.utils.encoding.DjangoUnicodeDecodeError: 'ascii' codec can't decode byte .

I have tried:

  1. adding/removing the u prefix in front of the keyword
  2. removing decode('utf-8') when creating the keyword object -- this successfully creates and saves the object, as long as the keyword has the u prefix
  3. removing encode('utf-8') from the __unicode__(self) method -- this successfully prints the keyword

So, the only configuration that is working is as follows:

  1. keep the u prefix in front of the keyword
  2. don't call decode('utf-8') or encode('utf-8') anywhere else

But I am not sure if this is the right way of doing this. Ideally I should be reading a keyword and decoding it as utf-8 and then be saving it to the db. Any suggestions?

The __unicode__ method should return a unicode string, not a byte string. Therefore you should remove the encode() from your __unicode__ method.

If your keywords have the u'' prefix, then they are unicode strings as well, and don't have to be decoded either.
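To see why both calls fail, here is a minimal sketch, written for Python 3, where the implicit ASCII conversions that Python 2 performs have to be spelled out. It assumes that Python 2 first encodes a unicode string to ASCII before decode() can run, and that a byte string returned from __unicode__ is decoded as ASCII. The helper names implicit_ascii_encode and implicit_ascii_decode are hypothetical and only there for illustration.

```python
# Python 3 sketch of the two implicit ASCII conversions behind the errors.
text = u'encendió'          # text (unicode) string, like the items in keywords
raw = text.encode('utf-8')  # the same keyword as a UTF-8 byte string


def implicit_ascii_encode(value):
    """Mimic the hidden step before calling decode() on a unicode string."""
    try:
        return value.encode('ascii')
    except UnicodeEncodeError as exc:
        return type(exc).__name__


def implicit_ascii_decode(value):
    """Mimic the hidden step when a byte string is used where text is expected."""
    try:
        return value.decode('ascii')
    except UnicodeDecodeError as exc:
        return type(exc).__name__


encode_result = implicit_ascii_encode(text)  # fails: 'ó' is not ASCII
decode_result = implicit_ascii_decode(raw)   # fails: bytes 0xc3 0xb3 are not ASCII
round_trip = raw.decode('utf-8')             # explicit UTF-8 decode of bytes works
```

Both implicit conversions fail on the accented character. Only the explicit decode('utf-8') on a real byte string succeeds.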

You don't need to encode() the strings to UTF-8 in the __unicode__() method, because Django returns all strings from the database as unicode.

From the Django docs:

Because all strings are returned from the database as Unicode strings, model fields that are character based (CharField, TextField, URLField, etc) will contain Unicode values when Django retrieves data from the database. This is always the case, even if the data could fit into an ASCII bytestring.

Since your keywords are already unicode strings (as indicated by the u prefix), you don't need to call decode() when creating the objects either. Remove the decode() call as well.

Your code should look like:

models.py

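from django.db import models

class myKeywords(models.Model):
    # CharField requires max_length; 255 is an assumed value
    keyword = models.CharField(max_length=255)

    def __unicode__(self):
        # keyword is already unicode, so return it without encode()
        return u'%s' % self.keyword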


keywords = [u'encendió', u'polémica']
for k in keywords:
    keyObj, created = myKeywords.objects.get_or_create(keyword=k)
    print created, keyObj
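If some keywords do arrive as byte strings (for example, read from a file), one option is to decode them once at the boundary and pass unicode everywhere else. The sketch below assumes the bytes are UTF-8 encoded. The helper name to_text and the raw_keywords list are hypothetical, not part of Django or the original code.

```python
def to_text(value, encoding='utf-8'):
    """Return value as a text string, decoding only if it is a byte string."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value  # already text: leave it untouched


# Mixed input: one UTF-8 byte string, one text string
raw_keywords = [b'encendi\xc3\xb3', u'polémica']
keywords = [to_text(k) for k in raw_keywords]
```

The resulting keywords list contains only text strings, so each item can be passed to get_or_create(keyword=k) without any further decode() or encode() call.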
