
Choose encoding when converting to Sqlite database

I am converting Mbox files to a SQLite database, but I cannot get the database file encoded as UTF-8.

The Python console displays the following message when converting to db:

Error binding parameter 1 - probably unsupported type.

When I view my data in DB Browser for SQLite, special characters don't appear and the � symbol shows up instead.
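One possible source of the � symbol is the `errors="replace"` option used when reading the text files: any byte sequence that is not valid UTF-8 is replaced by U+FFFD at read time, before anything reaches the database. The sketch below assumes, purely for illustration, that the source text is Latin-1; the actual encoding of the files is not stated in the question.

```python
# Illustration only: the Latin-1 source encoding is an assumption,
# not something stated in the question.
raw = "café".encode("latin-1")  # b'caf\xe9', which is not valid UTF-8

# Decoding with the wrong codec and errors="replace" silently
# substitutes U+FFFD (the replacement character) for the bad byte.
as_utf8 = raw.decode("utf-8", errors="replace")

# Decoding with the codec the bytes were written in keeps the text intact.
as_latin1 = raw.decode("latin-1")

print(repr(as_utf8))
print(repr(as_latin1))
```

If this is what is happening, the character is already lost in the Mbox file, and no setting on the SQLite side can bring it back.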

I first convert .text files to Mbox files with the following function:

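import os

def makeMBox(fIn, fOut):
    # Refuse to run if the input is missing or the output already exists.
    if not os.path.exists(fIn):
        return False
    if os.path.exists(fOut):
        return False

    # Assumed source encoding; with errors="replace", bytes that are not
    # valid in this codec are turned into U+FFFD.
    fInCodec = 'UTF-8'
    lineNum = 0

    # Write the output explicitly as UTF-8 instead of the platform default.
    with open(fIn, 'rt', encoding=fInCodec, errors="replace") as src, \
         open(fOut, "w", encoding="utf-8") as out:
        for line in src:
            if line.startswith("From "):
                # Separate messages with a blank line and restore the
                # obfuscated address in the "From " line.
                if lineNum != 0:
                    out.write("\n")
                lineNum += 1
                line = line.replace(" at ", "@")
            out.write(line)

    return True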

Then I convert the Mbox files to a SQLite database:

import mailbox
import sqlite3 as sql  # assumption: `sql` in this snippet is the sqlite3 module
import sqlite_utils

for k in dates:

    db = sqlite_utils.Database("Courriels_Sqlite/Echanges_Discussion.db")
    mbox = mailbox.mbox("Courriels_MBox/" + k + ".mbox")

    def to_insert():
        # One row per message: headers as columns, body under "payload".
        for message in mbox.values():
            Dionyversite = dict(message.items())
            Dionyversite["payload"] = message.get_payload()
            yield Dionyversite

    try:
        db["Dionyversite"].upsert_all(to_insert(), alter=True, pk="Message-ID")
    except sql.InterfaceError as e:
        print(e)
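A likely cause of the binding error is that `get_payload()` without arguments returns a list of sub-messages for multipart mail, and `sqlite3` can only bind `None`, `int`, `float`, `str` and `bytes`. The following is a minimal sketch reproducing that situation with a message built in memory; the table name `t` and the sample message are made up for the demonstration, and the exact exception class and wording vary between Python versions.

```python
import sqlite3
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical multipart message, standing in for one read from an mbox.
msg = MIMEMultipart()
msg.attach(MIMEText("Bonjour", "plain", "utf-8"))

# For a multipart message this is a list of Message objects, not text.
payload = msg.get_payload()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id TEXT, payload TEXT)")

try:
    conn.execute("INSERT INTO t VALUES (?, ?)", ("1", payload))
    bind_failed = False
except sqlite3.Error as e:
    # sqlite3 refuses to bind a list; the concrete subclass of
    # sqlite3.Error depends on the Python version.
    bind_failed = True
    print(e)
```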

Thank you for your help.

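I found how to fix it: pass decode = True to get_payload(), which returns the payload as bytes (or None for a multipart message), both of which SQLite can store: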

def to_insert():
    for message in mbox.values():
        Dionyversite = dict(message.items())
        Dionyversite["payload"] = message.get_payload(decode = True)
        yield Dionyversite
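Note that `get_payload(decode=True)` yields `bytes`, so the column is stored as a BLOB, and multipart messages yield `None`. If readable text is wanted in the database, one option is to decode each text part with its declared charset. The helper below, `payload_as_text`, is a hypothetical name and not part of `mailbox` or `sqlite_utils`; it is a sketch that assumes only `text/*` parts matter and that UTF-8 is an acceptable fallback when no charset is declared.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

def payload_as_text(message, fallback="utf-8"):
    """Hypothetical helper: join all text/* parts of a message into one str."""
    chunks = []
    for part in message.walk():
        # Skip multipart containers and non-text parts such as attachments.
        if part.is_multipart() or part.get_content_maintype() != "text":
            continue
        raw = part.get_payload(decode=True)  # bytes after transfer decoding
        if raw is None:
            continue
        charset = part.get_content_charset() or fallback
        try:
            chunks.append(raw.decode(charset, errors="replace"))
        except LookupError:
            # The declared charset name is unknown to Python.
            chunks.append(raw.decode(fallback, errors="replace"))
    return "\n".join(chunks)

# Usage with messages built in memory (sample content is made up).
single = MIMEText("café", "plain", "iso-8859-1")
multi = MIMEMultipart()
multi.attach(MIMEText("Bonjour", "plain", "utf-8"))
multi.attach(MIMEText("À bientôt", "plain", "utf-8"))

single_text = payload_as_text(single)
multi_text = payload_as_text(multi)
print(type(single_text).__name__, type(multi_text).__name__)
```

In `to_insert()`, the payload line could then read `Dionyversite["payload"] = payload_as_text(message)`, which stores a `str` that SQLite saves as TEXT.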
