
Processing text from SQLite3 Database in Python

I have an SQLite3 database containing sentences of Japanese text together with additional characters called furigana, which help with the phonetic reading.

I have a function, remove_furigana, which processes a string and returns it without the furigana characters. However, when I pass it the sentences pulled from my database, it doesn't seem to have any effect. Could someone clarify what is going on here and point me toward a solution?

def remove_furigana(content):
    furigana = False
    expression = ""
    for character in content:
        if character == '[':
            furigana = True
        elif character == ']':
            furigana = False
        elif not furigana:
            expression += character
    return expression.replace(" ", "")

def retrieve_article():
    c.execute('SELECT content FROM sentence WHERE article_id = "k10010770581000"')
    for row in c.fetchall():
        print(remove_furigana(row))

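The sqlite3 fetchall method returns a list of rows, and each row is a tuple of the fields in that record, even when only one column is selected. Iterating over that tuple yields the whole sentence as a single item, so it never equals '[' or ']' and is appended with its furigana intact. You need to pass the content column itself to the function: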

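def retrieve_article():
    # Bind the id as a parameter rather than embedding a double-quoted literal in the SQL
    c.execute('SELECT content FROM sentence WHERE article_id = ?', ('k10010770581000',))
    for row in c.fetchall():
        # Each row is a tuple; index 0 is the content column
        print(remove_furigana(row[0]))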
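
To see the difference concretely, here is a minimal sketch that runs against an in-memory database. The table layout and the sample sentence are assumptions made up for illustration, not the asker's real data; remove_furigana is the function from the question.

```python
import sqlite3

def remove_furigana(content):
    # Same logic as the function in the question
    furigana = False
    expression = ""
    for character in content:
        if character == '[':
            furigana = True
        elif character == ']':
            furigana = False
        elif not furigana:
            expression += character
    return expression.replace(" ", "")

# Hypothetical schema and sample row, for illustration only
con = sqlite3.connect(":memory:")
c = con.cursor()
c.execute("CREATE TABLE sentence (article_id TEXT, content TEXT)")
c.execute("INSERT INTO sentence VALUES (?, ?)",
          ("k10010770581000", "日本[にほん] 語[ご]"))
c.execute("SELECT content FROM sentence WHERE article_id = ?",
          ("k10010770581000",))
rows = c.fetchall()

# Passing the tuple: the loop sees one item (the whole string),
# so the bracketed furigana is never skipped
untouched = remove_furigana(rows[0])

# Passing the column value: the loop sees individual characters
cleaned = remove_furigana(rows[0][0])

print(type(rows[0]))
print(untouched)
print(cleaned)
```

Note that in the tuple case the final replace(" ", "") still runs on the whole string, so spaces disappear even though the furigana remains.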

Alternatively, you can use row_factory to get dictionaries rather than tuples:

import sqlite3

def dict_factory(cursor, row):
    d = {}
    for idx, col in enumerate(cursor.description):
        d[col[0]] = row[idx]
    return d

con = sqlite3.connect(":memory:")
con.row_factory = dict_factory

In that case, each row returned by fetchall will be a dictionary, and you can access the content field as:

    print(remove_furigana(row['content']))
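
If you would rather not write a factory by hand, the standard library's sqlite3.Row may be enough: it allows access by column name as well as by index. The sketch below assumes a hypothetical table and sample row purely for illustration, and the row_factory is set on the connection before the cursor is created.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Built-in row type supporting both row["content"] and row[0]
con.row_factory = sqlite3.Row
c = con.cursor()

# Hypothetical schema and sample row, for illustration only
c.execute("CREATE TABLE sentence (article_id TEXT, content TEXT)")
c.execute("INSERT INTO sentence VALUES (?, ?)", ("a1", "漢字[かんじ]"))

c.execute("SELECT content FROM sentence WHERE article_id = ?", ("a1",))
row = c.fetchone()

by_name = row["content"]   # access by column name
by_index = row[0]          # positional access still works
print(by_name)
```

Either value can then be passed straight to remove_furigana.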

