
Redshift - how to insert a generated time series into a table

I am trying to generate a time series in Redshift and insert it into a table, but no luck. What I have tried so far:

insert into date(dateid,date)
SELECT
    to_char(datum, 'YYYYMMDD')::int AS dateid,
    datum::date AS date
FROM (
    select '1970-01-01'::date + generate_series(0, 20000) as datum
     ) tbl;

I am getting the following error:

SQL Error [500310] [0A000]: [Amazon](500310) Invalid operation: Specified types or functions (one per INFO message) not supported on Redshift tables.;

Any ideas or workarounds?

The issue is that generate_series() can be run on the leader node, but not on a compute node.

Therefore, it is possible to run a statement like this:

SELECT '1970-01-01'::date + generate_series(0, 20000)

However, it is not possible to use that statement as the source of an INSERT, or in a FROM that also touches a table, because that would involve the compute nodes.

Solution: Create the table of dates externally and load the results into the date table, or run generate_series() on its own to generate the desired values, save the results, and import them into the date table.
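One way to build the data externally is a small script. The following is a minimal sketch, not part of the original answer: the function names and the output file name date_dim.csv are hypothetical, and it assumes the date table has the two columns dateid and date used in the question. It produces the same rows that '1970-01-01'::date + generate_series(0, 20000) would.

```python
import csv
from datetime import date, timedelta


def date_rows(start=date(1970, 1, 1), days=20000):
    """Yield (dateid, date) pairs for start + 0..days, both ends included,
    mirroring generate_series(0, days) in the question."""
    for offset in range(days + 1):
        d = start + timedelta(days=offset)
        # dateid is the YYYYMMDD integer, like to_char(datum, 'YYYYMMDD')::int
        yield int(d.strftime("%Y%m%d")), d.isoformat()


def write_date_csv(out, start=date(1970, 1, 1), days=20000):
    """Write the rows as CSV (no header) to an open text file object."""
    writer = csv.writer(out)
    for row in date_rows(start, days):
        writer.writerow(row)


if __name__ == "__main__":
    # Hypothetical output file name; adjust as needed.
    with open("date_dim.csv", "w", newline="") as f:
        write_date_csv(f)
```

The resulting file can then be imported into the date table, for example by uploading it somewhere Redshift can read from and using the COPY command.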

The generate_series function does not work on Redshift.

It does, however, generate_confusion! This is because it looks like it's working fine until you try to write to a table.

Technically, the reason for this is that it works ONLY on the leader node, and the leader node itself does not have direct access to the tables.

If you try

SELECT CURRENT_DATE::TIMESTAMP - (i * interval '1 day') as date_datetime
FROM generate_series(1,31) i
ORDER BY 1;

That produces a nice table output, because it has run on the leader node.

HOWEVER, if you run

create table test as
SELECT CURRENT_DATE::TIMESTAMP - (i * interval '1 day') as date_datetime
FROM generate_series(1,31) i
ORDER BY 1;

This fails with

[2019-03-11 07:38:00] [0A000] ERROR: Specified types or functions (one per INFO message) not supported on Redshift tables.
[2019-03-11 07:38:00] [00000] Function "generate_series(integer,integer)" not supported.

This is because we are now trying to create a table, so the SQL has to be run on the compute (worker) nodes rather than the leader node, and so it fails.

As a workaround, you can generate the integer sequence with plain SQL, which runs on the compute nodes, using a query like the following. It produces the integers 0 through 99999:

WITH
digits AS (
    SELECT 0 AS v
    UNION SELECT 1
    UNION SELECT 2
    UNION SELECT 3
    UNION SELECT 4
    UNION SELECT 5
    UNION SELECT 6
    UNION SELECT 7
    UNION SELECT 8
    UNION SELECT 9
), 
seq AS (
    SELECT 10000*tens_of_thousands.v + 1000*thousands.v + 100*hundreds.v + 10*tens.v + units.v AS seq
    FROM digits as units
    CROSS JOIN digits AS tens
    CROSS JOIN digits AS hundreds
    CROSS JOIN digits AS thousands
    CROSS JOIN digits AS tens_of_thousands
    ORDER BY seq
)
SELECT * FROM seq;
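The sequence query can be sanity-checked without a cluster. The following is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a local stand-in, which is reasonable only because the CTE uses plain SQL; it says nothing about Redshift-specific behaviour. The function name day_offsets, the column name n, and the filter to 0..20000 (the range from the question) are added here for illustration.

```python
import sqlite3

# The digits/seq CTE from the answer, lightly adapted: the output column is
# named n, and the final SELECT keeps only the offsets 0..20000 from the question.
SEQ_SQL = """
WITH
digits AS (
    SELECT 0 AS v
    UNION SELECT 1 UNION SELECT 2 UNION SELECT 3
    UNION SELECT 4 UNION SELECT 5 UNION SELECT 6
    UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
),
seq AS (
    SELECT 10000*tens_of_thousands.v + 1000*thousands.v
         + 100*hundreds.v + 10*tens.v + units.v AS n
    FROM digits AS units
    CROSS JOIN digits AS tens
    CROSS JOIN digits AS hundreds
    CROSS JOIN digits AS thousands
    CROSS JOIN digits AS tens_of_thousands
)
SELECT n FROM seq WHERE n <= 20000 ORDER BY n
"""


def day_offsets():
    """Run the query in an in-memory SQLite database and return the integers."""
    conn = sqlite3.connect(":memory:")
    try:
        return [row[0] for row in conn.execute(SEQ_SQL)]
    finally:
        conn.close()


if __name__ == "__main__":
    offsets = day_offsets()
    print(len(offsets), offsets[0], offsets[-1])
```

Each integer is a day offset, so adding it to the start date 1970-01-01 gives the same dates the original generate_series(0, 20000) attempt was aiming for.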
