
Unable to UNLOAD the information schema from Redshift to S3

Is there a way to UNLOAD the information schema from Redshift to S3? I get the following error when trying to UNLOAD the information schema:

ERROR: Specified types or functions (one per INFO message) not supported on Redshift tables.

Following is the query I am using for unloading:

UNLOAD ('select table_schema,
                table_name
         from information_schema.tables
         where table_schema not in (\'information_schema\', \'pg_catalog\')
           and table_type = \'BASE TABLE\'
         order by table_schema,
                  table_name;')
TO 's3://xxx/'
iam_role 'arn:aws:iam::xxx';

You have two options that I know of.

  1. You can remove from your query the columns whose data types are not supported on the compute nodes (ie are leader-node only), since those are the cause of the error you're getting. (I'm not familiar with the information_schema views, since I never use them - I use the native tables/views instead; see the sketch after this list. Be aware the views might also call functions which are leader-node only, and you won't be able to remove those, because they're part of the view SQL; just by being there, they make the query planner refuse to run the query.)

  2. You can use something external to Redshift, such as Python running on your own machine, to connect to Redshift and issue the query. That way you can retrieve all the data types, and your code can then write the results out to S3 itself.
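For option 1, here is a minimal sketch of what a rework against a native system view might look like. I am assuming SVV_TABLE_INFO here, which carries schema and table names for user tables; note that it omits empty tables, and whether UNLOAD will accept it depends on the view being reachable from the compute nodes, which I have not verified:

-- "schema" and "table" are quoted because they are reserved words
UNLOAD ('select "schema",
                "table"
         from svv_table_info
         order by "schema",
                  "table"')
TO 's3://xxx/'
iam_role 'arn:aws:iam::xxx';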

BTW, a lot of the char columns in the system tables contain UTF-8. If you UNLOAD them, you will not be able to use COPY to load those files into empty duplicates of the original tables, because char columns in Redshift only accept single-byte characters and COPY will notice you are loading non-ASCII values. You will need to change the data types of the problematic char columns (which is most of them) to varchar.
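As an illustration, here is a hypothetical target table for the file unloaded in the question, with the name columns declared as varchar so that COPY accepts multi-byte UTF-8 values (the table name and column widths are assumptions; '|' is the UNLOAD default delimiter):

-- varchar takes multi-byte UTF-8; char in Redshift is single-byte only
CREATE TABLE tables_copy (
    table_schema varchar(128),
    table_name   varchar(128)
);

COPY tables_copy
FROM 's3://xxx/'
iam_role 'arn:aws:iam::xxx'
DELIMITER '|';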

This error message is often generated when leader-node-only data needs to be sent to the compute nodes and there is no path for data to flow in that direction during a query. It typically happens when you try to join system tables with user data. Now, I've never generated this error the way you have, but I expect this is what is going on, because UNLOAD works in parallel through all the compute nodes.

This may just work if you change your UNLOAD to PARALLEL OFF, which will write the data to S3 through the leader node. Since the data doesn't have to go to the compute nodes, this could get you past the error.
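For example, this is the statement from the question with only PARALLEL OFF added (untested against the information_schema views, so no promises):

UNLOAD ('select table_schema,
                table_name
         from information_schema.tables
         where table_schema not in (\'information_schema\', \'pg_catalog\')
           and table_type = \'BASE TABLE\'
         order by table_schema,
                  table_name')
TO 's3://xxx/'
iam_role 'arn:aws:iam::xxx'
PARALLEL OFF;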

If not, there are ways to select the data from these system tables into a cursor and then read the cursor into a temp table. Then you can UNLOAD the temp table. I can point you to a process to do this if needed.
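A minimal sketch of that cursor-to-temp-table approach, using a Redshift stored procedure (the FOR loop in plpgsql reads the query through an implicit cursor on the leader node; the procedure and table names are made up, and I have not run this against information_schema specifically):

CREATE OR REPLACE PROCEDURE stage_info_schema_tables()
AS $$
DECLARE
    rec RECORD;
BEGIN
    -- session-scoped staging table; lives on the compute nodes like any table
    DROP TABLE IF EXISTS tmp_tables;
    CREATE TEMP TABLE tmp_tables (table_schema varchar(128), table_name varchar(128));
    -- the FOR loop fetches rows via a cursor, so the leader-node-only query
    -- never has to ship data to the compute nodes itself
    FOR rec IN SELECT table_schema, table_name
               FROM information_schema.tables
               WHERE table_schema NOT IN ('information_schema', 'pg_catalog')
                 AND table_type = 'BASE TABLE'
    LOOP
        INSERT INTO tmp_tables VALUES (rec.table_schema, rec.table_name);
    END LOOP;
END;
$$ LANGUAGE plpgsql;

CALL stage_info_schema_tables();

-- the temp table is an ordinary distributed table, so UNLOAD can read it
UNLOAD ('select * from tmp_tables')
TO 's3://xxx/'
iam_role 'arn:aws:iam::xxx';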
