Move pandas df to Teradata table: [HY000] [Teradata][ODBC Teradata Driver][Teradata Database] Invalid timestamp

I have a df that I want to move to a Teradata table. I am using a framework that was discussed on this platform, but I am getting a few errors. The logic for loading the df into Teradata is:

1) If the table doesn't exist, create it; otherwise skip creation.

2) Start loading the df into the table. (Note: I will be reading multiple xlsx files into a df and eventually appending it to the Teradata table.)

I have written a BTEQ script to create the table, which goes like this:

FROM DBC.TABLES WHERE DATABASENAME = 'abc' AND TABLENAME = 'sample';

.IF ACTIVITYCOUNT <> 0 THEN .GOTO SKIP_CREATION
.IF ACTIVITYCOUNT = 0 THEN .GOTO TABLE_NOT_EXISTS

.LABEL TABLE_NOT_EXISTS
CREATE TABLE abc.sample (
    col1 VARCHAR(400) CHARACTER SET LATIN NOT CASESPECIFIC,
    col2 VARCHAR(400) CHARACTER SET LATIN NOT CASESPECIFIC,
    .
    .
    col23 TIMESTAMP(0) WITH TIME ZONE FORMAT 'YYYY-MM-DD HH:MI:SSZ',
    col24 TIMESTAMP(0) WITH TIME ZONE FORMAT 'YYYY-MM-DD HH:MI:SSZ'
);

.LABEL SKIP_CREATION
.LOGOFF
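
As a possible alternative to the existence check, here is a minimal sketch under stated assumptions. The Python call further down already passes ignoreErrors=[3803], and 3803 is the Teradata error code for "table already exists", so the CREATE TABLE statement could in principle be sent directly and that one error tolerated. The helper name `ensure_table` and its parameters are hypothetical, and `session` is assumed to be a connection obtained from the `teradata` module's `UdaExec.connect`.

```python
def ensure_table(session, create_sql, already_exists_code=3803):
    """Send a CREATE TABLE statement, tolerating 'table already exists'.

    Assumes `session` exposes execute(query, ignoreErrors=[...]) as the
    teradata module's sessions do; any other error still propagates.
    """
    session.execute(create_sql, ignoreErrors=[already_exists_code])
```

With this approach the table is never dropped; the statement simply has no effect when abc.sample is already present.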

My Python code to move the df to Teradata is:

import numpy as np
import pandas as pd
import teradata

# df is the pandas data frame to load (built elsewhere)
host, username, password = 'host', 'username', "password"
num_of_chunks = 1000
insert_query = "INSERT INTO abc.sample VALUES(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)"
udaExec = teradata.UdaExec(appName="IMC", version="1.0", logConsole=False)
with udaExec.connect(method="odbc", system=host, username=username,
                     password=password, driver="Teradata") as session:
    # Run the BTEQ script; error 3803 (table already exists) is ignored
    file_exist = session.execute(file=r"Path of the bteq file", fileType="bteq", ignoreErrors=[3803])
    # Split the data frame into chunks and insert each chunk as one batch
    schedule_chunks = np.array_split(df, num_of_chunks)

    for chunk in schedule_chunks:
        data = [tuple(x) for x in chunk.to_records(index=False)]
        session.executemany(insert_query, data, batch=True)
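
A side note on the chunking step, offered as a sketch rather than a fix for the error: splitting into a fixed number of chunks (1000 here) produces empty chunks whenever the data frame has fewer rows than that. One option is to slice by a fixed batch size instead. The helper name `iter_row_batches` and the demo frame are hypothetical; the helper does not convert missing values.

```python
import pandas as pd


def iter_row_batches(df, batch_size):
    """Yield lists of plain tuples holding at most batch_size rows each."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(df), batch_size):
        chunk = df.iloc[start:start + batch_size]
        # itertuples(name=None) returns ordinary tuples, one per row
        yield list(chunk.itertuples(index=False, name=None))


# Small stand-in frame, for illustration only
demo = pd.DataFrame({"col1": list("abcde"), "col2": [0, 1, 2, 3, 4]})
batches = list(iter_row_batches(demo, 2))
```

Each yielded list could then be passed to session.executemany(insert_query, rows, batch=True) in place of the per-chunk `data` list.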

When I run this, I get the following error message:

DatabaseError: [HY000] [Teradata][ODBC Teradata Driver][Teradata Database] Invalid timestamp.

Can someone help me figure out where I am going wrong? I would also appreciate suggestions on whether I am writing the BTEQ script correctly. I want to avoid dropping the table and creating a new one each time.

I was able to push my dataframe into Teradata successfully. All I did was convert the timestamp columns in my dataframe from datetime64 to object. Below is the only line of code I added before running the above code:

df=df.astype(object).where(pd.notnull(df),'')
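
To illustrate what that line changes, here is a small sketch on a made-up two-row frame (the data and the names `sample`, `before`, `after` are assumptions for the example). Before the conversion, to_records hands back NumPy datetime64 scalars, including NaT for a missing value; afterwards the same column holds a pandas Timestamp object and an empty string. The thread does not confirm which of these two differences the ODBC driver was objecting to.

```python
import numpy as np
import pandas as pd

# Stand-in data: one valid timestamp and one missing value
sample = pd.DataFrame({
    "col1": ["a", "b"],
    "col23": pd.to_datetime(["2021-03-04 05:06:07", None]),
})

# Before: the datetime64 column comes out as NumPy scalars (NaT for the gap)
before = [tuple(x) for x in sample.to_records(index=False)]

# The conversion from the answer: object dtype everywhere, nulls become ''
converted = sample.astype(object).where(pd.notnull(sample), "")
after = [tuple(x) for x in converted.to_records(index=False)]

print(type(before[0][1]).__name__, type(after[0][1]).__name__)
```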
