
Unloading data from Amazon Redshift to Amazon S3

I am trying to use the following code to unload data into an S3 bucket. The unload itself works, but an error is thrown after it completes.

 // Requires: import java.sql.*; import java.util.Properties;
 Properties props = new Properties();
 props.setProperty("user", MasterUsername);
 props.setProperty("password", MasterUserPassword);
 conn = DriverManager.getConnection(dbURL, props);
 stmt = conn.createStatement();
 // Each fragment ends with a space so the concatenated options stay separated.
 String sql = "unload('select * from part where p_partkey in "
       + "(select p_partkey from part limit 10)') to "
       + "'s3://redshiftdump.****' "
       + "DELIMITER AS ',' "
       + "ADDQUOTES "
       + "NULL AS '' "
       + "credentials 'aws_access_key_id=****;aws_secret_access_key=***' "
       + "parallel off;";
 boolean i = stmt.execute(sql);
 stmt.close();
 conn.close();

The unload works and creates a file in the bucket, but I then get this error:

   java.sql.SQLException: dataengine.impl.DSISimpleRowCountResult cannot be cast to com.amazon.dsi.dataengine.interfaces.IResultSet
   at com.amazon.redshift.core.jdbc42.PGJDBC42Statement.createResultSet(Unknown Source)
   at com.amazon.jdbc.common.SStatement.executeQuery(Unknown Source)

What is this error, and how can I avoid it? Also, is there any way to dump the table in CSV format? Right now the file is being written in FILE format.

You say the UNLOAD works but you still receive this error. That suggests you are connecting successfully, and that the problem lies in how your code interacts with the JDBC driver when the query completes.

We provide an example that may be helpful in our documentation, on the page "Connect to Your Cluster Programmatically".

Regarding the output file format, you will get whatever is specified in your UNLOAD SQL, but the filename will have a suffix (for example "000" or "6411_part_00") to indicate which part of the UNLOAD it is.
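To make the point about the output format concrete, here is a minimal sketch of assembling an UNLOAD statement that asks for CSV output. The class name, method name, bucket prefix and placeholder credentials are all hypothetical and not from the question; FORMAT AS CSV, HEADER and PARALLEL OFF are UNLOAD options as I understand the Redshift documentation, so check them against your cluster version before relying on them.

```java
final class UnloadSqlSketch {
    // Builds the text of an UNLOAD statement that requests CSV output.
    // All parameter names are illustrative assumptions.
    static String buildUnload(String selectSql, String s3Prefix, String authClause) {
        // The SELECT is passed to UNLOAD as a quoted string, so any single
        // quote inside it has to be doubled.
        String escaped = selectSql.replace("'", "''");
        return "UNLOAD ('" + escaped + "') "
             + "TO '" + s3Prefix + "' "
             + authClause + " "
             + "FORMAT AS CSV "   // quoted, comma-separated fields
             + "HEADER "          // column names as the first line
             + "PARALLEL OFF";    // write serially instead of one file per slice
    }

    public static void main(String[] args) {
        // Hypothetical bucket prefix and placeholder credentials.
        String sql = buildUnload(
            "select * from part where p_name = 'blue'",
            "s3://example-bucket/part_",
            "CREDENTIALS 'aws_access_key_id=<key>;aws_secret_access_key=<secret>'");
        System.out.println(sql);
    }
}
```

The object written to S3 still gets a part suffix appended to the prefix, so the name will not end in ".csv" on its own; the contents are CSV either way.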

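Use executeUpdate. UNLOAD does not return a result set, so the driver produces only a row count. The stack trace shows executeQuery trying to turn that row count into a ResultSet, which is where the cast fails. For example, in Scala: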

  import java.sql.{DriverManager, Statement}

  // url, username, password, redshiftTimeoutInSeconds, logger and
  // IngestionException are defined elsewhere in the surrounding code.
  def runQuery(sql: String): Unit = {
    Class.forName("com.amazon.redshift.jdbc.Driver")
    val connection = DriverManager.getConnection(url, username, password)
    var statement: Statement = null
    try {
      statement = connection.createStatement()
      statement.setQueryTimeout(redshiftTimeoutInSeconds)
      // executeUpdate returns an update count instead of a ResultSet
      val result = statement.executeUpdate(sql)
      logger.info(s"statement response code : ${result}")
    } catch {
      case e: Exception =>
        // Print the trace separately; interpolating printStackTrace() only yields "()"
        e.printStackTrace()
        logger.error(s"statement failed: ${e.getMessage}")
        throw new IngestionException(e.getMessage)
    } finally {
      if (statement != null) statement.close()
      connection.close()
    }
  }
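Since the question is written in Java, the same idea can be sketched there as well. This is only an illustration under assumptions: UnloadRunner and runUnload are hypothetical names, and the caller is assumed to supply an open Connection (for example from DriverManager.getConnection(dbURL, props) as in the question).

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

final class UnloadRunner {
    // Runs a statement that produces no result set (such as UNLOAD) and
    // returns whatever update count the driver reports.
    static int runUnload(Connection conn, String unloadSql) throws SQLException {
        // try-with-resources closes the Statement even if execution throws.
        try (Statement stmt = conn.createStatement()) {
            // executeUpdate is the JDBC call for statements without a ResultSet.
            return stmt.executeUpdate(unloadSql);
        }
    }
}
```

The connection is left open on purpose so the caller decides its lifetime; wrapping it in its own try-with-resources block at the call site would close it the same way.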
