Python 3 - CSV and cx_Oracle
I'm having some serious trouble working with the csv and cx_Oracle modules. I want to read a CSV file that is saved in UTF-8 (I made sure of this by saving it from Notepad as UTF-8). I can read everything fine now (before I saved it as UTF-8, I couldn't). This is my code to read the CSV file:
import csv

data = []
# newline='' lets the csv module handle line endings itself
with open(file, 'rt', encoding='utf-8', newline='') as csvfile:
    csvinput = csv.reader(csvfile, delimiter=',', quotechar='"')
    for row in csvinput:
        data.append(row)
This saves everything to a 2D array. Whenever I want to insert something into the database, I make a prepared statement and load the text into it like this:
data = [lastname, firstname]
cursor = cx_Oracle.Cursor(connection)
cursor.prepare("SELECT * FROM PRIVATE WHERE NAME = :1 AND FIRSTNAME = :2")
cursor.execute(None, data)
res = cursor.fetchall()
cursor.close()
It gives me tons of errors like:
UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 1: ordinal not in range(128)
I tried reading up on the whole thing, but I got rather confused by Unicode, as I don't really know what I should use where, and why. Any help is appreciated.

TL;DR: I get encoding errors whilst trying to execute prepared statements.
You are trying to insert Unicode values into a VARCHAR2 column, which can only handle encoded byte strings.
cx_Oracle is trying to encode your Unicode values for you to fit the column type, and does so with the default codec for your connection (ASCII, judging by the error message).
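The failure can be reproduced without a database. This is a minimal sketch, assuming the codec in play is ASCII as the traceback suggests; `try_encode` is a hypothetical helper and `Hélène` is only an illustrative value:

```python
# Illustrative value with 'é' (U+00E9) at index 1, as in the traceback.
name = "Hélène"

def try_encode(value, codec):
    """Return the encoded bytes, or the error text if the codec cannot represent the value."""
    try:
        return value.encode(codec)
    except UnicodeEncodeError as exc:
        return str(exc)

ascii_result = try_encode(name, "ascii")   # fails: 'é' is outside the ASCII range
utf8_result = try_encode(name, "utf-8")    # succeeds: UTF-8 can represent any character

print(ascii_result)
print(utf8_result)
```

The ASCII attempt produces the same kind of message as in the question, while the UTF-8 attempt returns a byte string.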
Either encode your values to a suitable encoding manually, or make your columns use NVARCHAR2 instead.
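Manual encoding could look roughly like this. It is a sketch only: `encode_params` is a hypothetical helper, UTF-8 is an assumed target encoding (it would have to match what the database expects), and how cx_Oracle binds `bytes` values is not verified here.

```python
def encode_params(values, encoding="utf-8"):
    """Encode every str in a parameter list to bytes; leave other types untouched."""
    return [v.encode(encoding) if isinstance(v, str) else v for v in values]

# Illustrative bind values, standing in for [lastname, firstname] from the question.
data = ["Lefèvre", "André"]
encoded = encode_params(data)
print(encoded)
```

The encoded list would then take the place of `data` in the `cursor.execute(None, data)` call.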
The latter has the added advantage that column lengths are expressed in characters, not bytes; UTF-8 data can use up to 4 bytes per character, so a VARCHAR2(1000) column could, in a worst-case scenario, fit only 250 actual characters.
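The byte arithmetic is easy to check in Python. A small sketch; `utf8_len` and `fits_in_bytes` are hypothetical helpers, and the sample characters are only illustrations:

```python
def utf8_len(text):
    """Number of bytes the text occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))

def fits_in_bytes(text, limit):
    """True if the UTF-8 encoding of text fits in a column limited to `limit` bytes."""
    return utf8_len(text) <= limit

# Characters of increasing encoded width: ASCII, Latin-1 range, BMP, outside the BMP.
for ch in ["a", "é", "€", "😀"]:
    print(repr(ch), utf8_len(ch))

# Worst case for a 1000-byte column: every character needs 4 bytes.
worst_case_chars = 1000 // 4
print(worst_case_chars)
```

A check such as `fits_in_bytes(value, 1000)` before inserting would catch values that are short in characters but too long in bytes.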