Identifier not being recognized in a 'with' statement using a subquery in Snowflake

So I have a table that looks something like this:

DELIVERY_AREA_ID,DELIVERY_RADIUS_METERS,EVENT_STARTED_TIMESTAMP
234sfd,4000,2020-01-01 12:19:29.719
234sfd,6500,2020-01-01 12:31:40.325
234sfd,3500,2020-01-01 12:53:10.538
234sfd,6500,2020-01-01 13:11:36.094
234sfd,3500,2020-01-01 13:32:26.754
234sfd,6500,2020-01-01 13:59:11.104
234sfd,6500,2020-01-02 07:44:16.792
234sfd,3500,2020-01-02 08:07:36.284
234sfd,6500,2020-01-02 08:54:08.014
234sfd,3500,2020-01-02 09:53:05.853
234sfd,6500,2020-01-02 10:04:39.443
234sfd,10000,2020-07-01 08:29:20.194
234sfd,3500,2020-07-03 07:50:41.782
234sfd,10000,2020-07-03 08:33:14.695
234sfd,3500,2020-07-05 07:47:05.539
234sfd,10000,2020-07-05 07:53:13.930
234sfd,3500,2020-07-05 09:18:57.688
234sfd,10000,2020-07-05 09:51:07.547
234sfd,3500,2020-07-19 18:02:14.099

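The real data is more varied, but it follows this format.

I am trying, in a single query in a Snowflake database, to create a "default delivery radius", which is simply the radius with the longest total duration per month/year, and then to calculate the total duration within that month during which the radius was smaller than this default delivery radius. I am currently doing this with a join. I would rather not create a new table, although I know that would be easier.

Here is my current attempt: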

-- Find the default delivery radius for each delivery area
WITH default_radiuses AS (
SELECT DELIVERY_AREA_ID,
       MAX(DELIVERY_RADIUS_METERS) AS default_delivery_radius,
       MONTH_YEAR,
       DELIVERY_RADIUS_METERS,
       SUM(DURATION_SECONDS) AS total_duration,
       max(EVENT_STARTED_TIMESTAMP) as MAX_TIMESTAMP,
       RANK() OVER (PARTITION BY DELIVERY_AREA_ID, MONTH_YEAR ORDER BY total_duration DESC) AS RADIUS_RANK
FROM (
    -- Add the MONTH_YEAR column to the delivery_radius_log table
    SELECT DELIVERY_AREA_ID,
           DELIVERY_RADIUS_METERS,
           EVENT_STARTED_TIMESTAMP,
           CONCAT(MONTH(EVENT_STARTED_TIMESTAMP), '/', YEAR(EVENT_STARTED_TIMESTAMP)) AS MONTH_YEAR,
           DATEADD(second, DATEDIFF(second, EVENT_STARTED_TIMESTAMP, LEAD(EVENT_STARTED_TIMESTAMP) OVER (PARTITION BY DELIVERY_AREA_ID ORDER BY EVENT_STARTED_TIMESTAMP)), EVENT_STARTED_TIMESTAMP) AS end_timestamp,
           DATEDIFF(second, EVENT_STARTED_TIMESTAMP, LEAD(EVENT_STARTED_TIMESTAMP) OVER (PARTITION BY DELIVERY_AREA_ID ORDER BY EVENT_STARTED_TIMESTAMP)) AS duration_seconds
    FROM delivery_radius_log
)
GROUP BY DELIVERY_AREA_ID, MONTH_YEAR, DELIVERY_RADIUS_METERS
ORDER BY RADIUS_RANK asc, month(MAX_TIMESTAMP)
)

-- Find the duration of each radius reduction
SELECT DELIVERY_AREA_ID,
       CONCAT(MONTH(EVENT_STARTED_TIMESTAMP), '/', YEAR(EVENT_STARTED_TIMESTAMP)) AS MONTH_YEAR,
       DELIVERY_RADIUS_METERS,
       EVENT_STARTED_TIMESTAMP,
       DATEADD(hour, DATEDIFF(hour, EVENT_STARTED_TIMESTAMP, LEAD(EVENT_STARTED_TIMESTAMP) OVER (PARTITION BY DELIVERY_AREA_ID ORDER BY EVENT_STARTED_TIMESTAMP)), EVENT_STARTED_TIMESTAMP) AS end_timestamp,
       DATEDIFF(hour, EVENT_STARTED_TIMESTAMP, LEAD(EVENT_STARTED_TIMESTAMP) OVER (PARTITION BY DELIVERY_AREA_ID ORDER BY EVENT_STARTED_TIMESTAMP)) AS duration_hours
FROM delivery_radius_log
JOIN default_radiuses USING (DELIVERY_AREA_ID, MONTH_YEAR)
WHERE DELIVERY_RADIUS_METERS != default_delivery_radius

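I am currently getting the error "invalid identifier MONTH_YEAR". What can I do to fix this?

The two queries work separately, so I assume I am just missing something about the order of execution? It seems doable to me, but I am not sure.

OK, after receiving some feedback I added an alias and dropped the ORDER BY, but I still think there is something wrong with my join: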

-- Find the default delivery radius for each delivery area
WITH default_radiuses AS (
    SELECT DELIVERY_AREA_ID,
           MAX(DELIVERY_RADIUS_METERS) AS default_delivery_radius,
           MONTH_YEAR,
           DELIVERY_RADIUS_METERS,
           SUM(DURATION_SECONDS) AS total_duration,
           MAX(EVENT_STARTED_TIMESTAMP) AS MAX_TIMESTAMP,
           RANK() OVER (PARTITION BY DELIVERY_AREA_ID, MONTH_YEAR
                        ORDER BY SUM(DURATION_SECONDS) DESC) AS RADIUS_RANK
    FROM (
        -- Add the MONTH_YEAR column to the delivery_radius_log table
        SELECT DELIVERY_AREA_ID,
               DELIVERY_RADIUS_METERS,
               EVENT_STARTED_TIMESTAMP,
               CONCAT(MONTH(EVENT_STARTED_TIMESTAMP), '/',
                      YEAR(EVENT_STARTED_TIMESTAMP)) AS MONTH_YEAR,
               DATEADD(second, DATEDIFF(second, EVENT_STARTED_TIMESTAMP, LEAD(EVENT_STARTED_TIMESTAMP) OVER (PARTITION BY DELIVERY_AREA_ID ORDER BY EVENT_STARTED_TIMESTAMP)), EVENT_STARTED_TIMESTAMP) AS end_timestamp,
               DATEDIFF(second, EVENT_STARTED_TIMESTAMP, LEAD(EVENT_STARTED_TIMESTAMP) OVER (PARTITION BY DELIVERY_AREA_ID ORDER BY EVENT_STARTED_TIMESTAMP)) AS duration_seconds
        FROM delivery_radius_log
    ) t  -- added alias here
    GROUP BY DELIVERY_AREA_ID, MONTH_YEAR, DELIVERY_RADIUS_METERS
)


-- Find the duration of each radius reduction
SELECT a.DELIVERY_AREA_ID,
       CONCAT(MONTH(EVENT_STARTED_TIMESTAMP), '/', YEAR(EVENT_STARTED_TIMESTAMP)) AS MONTH_YEAR,
       a.DELIVERY_RADIUS_METERS,
       EVENT_STARTED_TIMESTAMP,
       DATEADD(hour, DATEDIFF(hour, EVENT_STARTED_TIMESTAMP, LEAD(EVENT_STARTED_TIMESTAMP) OVER (PARTITION BY a.DELIVERY_AREA_ID ORDER BY EVENT_STARTED_TIMESTAMP)), EVENT_STARTED_TIMESTAMP) AS end_timestamp,
       DATEDIFF(hour, EVENT_STARTED_TIMESTAMP, LEAD(EVENT_STARTED_TIMESTAMP) OVER (PARTITION BY a.DELIVERY_AREA_ID ORDER BY EVENT_STARTED_TIMESTAMP)) AS duration_hours
FROM delivery_radius_log a
JOIN default_radiuses b on (a.DELIVERY_AREA_ID = b.DELIVERY_AREA_ID,CONCAT(MONTH(EVENT_STARTED_TIMESTAMP), '/',
                      YEAR(EVENT_STARTED_TIMESTAMP)) = b.MONTH_YEAR)
WHERE a.DELIVERY_RADIUS_METERS != default_delivery_radius

I get the error

Invalid data type [ROW(BOOLEAN, BOOLEAN)] for predicate [ROW(A.DELIVERY_AREA_ID = B.DELIVERY_AREA_ID, (CONCAT(CAST(EXTRACT(month from A.EVENT_STARTED_TIMESTAMP) AS VARCHAR(16777216)), '/', CAST(EXTRACT(year from A.EVENT_STARTED_TIMESTAMP) AS VARCHAR(16777216)))) = B.MONTH_YEAR)]

I don't understand how the join can be invalid when I build that field the same way in the preceding WITH statement.
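The ROW(BOOLEAN, BOOLEAN) in the message suggests that the comma inside the ON clause is being read as a two-element row rather than as two separate conditions. A minimal sketch of the same join with the conditions combined using AND follows. It is a fragment, not a standalone statement: it assumes it is placed directly after the default_radiuses CTE shown above, and the select list is shortened for illustration.

```sql
-- Fragment: goes after the default_radiuses CTE defined above.
SELECT a.DELIVERY_AREA_ID,
       a.DELIVERY_RADIUS_METERS,
       a.EVENT_STARTED_TIMESTAMP
FROM delivery_radius_log a
JOIN default_radiuses b
  ON a.DELIVERY_AREA_ID = b.DELIVERY_AREA_ID
 -- Join conditions are combined with AND, not separated by a comma
 AND CONCAT(MONTH(a.EVENT_STARTED_TIMESTAMP), '/',
            YEAR(a.EVENT_STARTED_TIMESTAMP)) = b.MONTH_YEAR
WHERE a.DELIVERY_RADIUS_METERS != b.default_delivery_radius
```

Qualifying every column with a or b also avoids ambiguity, since DELIVERY_RADIUS_METERS exists on both sides of the join.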

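Every derived table/subquery should have an alias in SQL (many dialects require one). Also, an ORDER BY clause inside your CTE serves no purpose, so it should be removed. Finally, you cannot reference the total_duration alias in the same SELECT in which it is defined; instead, just repeat the SUM() expression. We can try making these changes: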

WITH default_radiuses AS (
    SELECT DELIVERY_AREA_ID,
           MAX(DELIVERY_RADIUS_METERS) AS default_delivery_radius,
           MONTH_YEAR,
           DELIVERY_RADIUS_METERS,
           SUM(DURATION_SECONDS) AS total_duration,
           MAX(EVENT_STARTED_TIMESTAMP) AS MAX_TIMESTAMP,
           RANK() OVER (PARTITION BY DELIVERY_AREA_ID, MONTH_YEAR
                        ORDER BY SUM(DURATION_SECONDS) DESC) AS RADIUS_RANK
    FROM (
        -- Add the MONTH_YEAR column to the delivery_radius_log table
        SELECT DELIVERY_AREA_ID,
               DELIVERY_RADIUS_METERS,
               EVENT_STARTED_TIMESTAMP,
               CONCAT(MONTH(EVENT_STARTED_TIMESTAMP), '/',
                      YEAR(EVENT_STARTED_TIMESTAMP)) AS MONTH_YEAR,
               DATEADD(second, DATEDIFF(second, EVENT_STARTED_TIMESTAMP, LEAD(EVENT_STARTED_TIMESTAMP) OVER (PARTITION BY DELIVERY_AREA_ID ORDER BY EVENT_STARTED_TIMESTAMP)), EVENT_STARTED_TIMESTAMP) AS end_timestamp,
               DATEDIFF(second, EVENT_STARTED_TIMESTAMP, LEAD(EVENT_STARTED_TIMESTAMP) OVER (PARTITION BY DELIVERY_AREA_ID ORDER BY EVENT_STARTED_TIMESTAMP)) AS duration_seconds
        FROM delivery_radius_log
    ) t  -- added alias here
    GROUP BY DELIVERY_AREA_ID, MONTH_YEAR, DELIVERY_RADIUS_METERS
)

SELECT ...
-- your query here
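As for the original "invalid identifier MONTH_YEAR" error, it most likely comes from the outer query rather than the CTE: JOIN ... USING (DELIVERY_AREA_ID, MONTH_YEAR) needs MONTH_YEAR to exist as a column in both joined inputs, and delivery_radius_log has no such column (the MONTH_YEAR in the select list is only an output alias). One way around this, sketched below under some assumptions, is to compute MONTH_YEAR once in a shared CTE and join on that. The CTE names log_with_duration and radius_totals and the output column REDUCED_SECONDS are made up for this illustration. The sketch also assumes that "default" means the single radius ranked first by total duration, and that each interval is attributed to the month in which it starts.

```sql
WITH log_with_duration AS (
    -- One row per radius change, with its month/year key and how long it lasted
    SELECT DELIVERY_AREA_ID,
           DELIVERY_RADIUS_METERS,
           EVENT_STARTED_TIMESTAMP,
           CONCAT(MONTH(EVENT_STARTED_TIMESTAMP), '/',
                  YEAR(EVENT_STARTED_TIMESTAMP)) AS MONTH_YEAR,
           DATEDIFF(second, EVENT_STARTED_TIMESTAMP,
                    LEAD(EVENT_STARTED_TIMESTAMP) OVER (
                        PARTITION BY DELIVERY_AREA_ID
                        ORDER BY EVENT_STARTED_TIMESTAMP)) AS DURATION_SECONDS
    FROM delivery_radius_log
),
radius_totals AS (
    -- Total time spent at each radius per area and month
    SELECT DELIVERY_AREA_ID,
           MONTH_YEAR,
           DELIVERY_RADIUS_METERS,
           SUM(DURATION_SECONDS) AS TOTAL_DURATION
    FROM log_with_duration
    GROUP BY DELIVERY_AREA_ID, MONTH_YEAR, DELIVERY_RADIUS_METERS
),
default_radiuses AS (
    -- Keep only the radius with the longest total duration per area and month.
    -- NULLS LAST keeps a radius whose only event has no successor (NULL
    -- duration) from ranking first. RANK() can return ties.
    SELECT DELIVERY_AREA_ID,
           MONTH_YEAR,
           DELIVERY_RADIUS_METERS AS DEFAULT_DELIVERY_RADIUS
    FROM radius_totals
    QUALIFY RANK() OVER (PARTITION BY DELIVERY_AREA_ID, MONTH_YEAR
                         ORDER BY TOTAL_DURATION DESC NULLS LAST) = 1
)
-- Total time per area and month spent below the default radius
SELECT l.DELIVERY_AREA_ID,
       l.MONTH_YEAR,
       SUM(l.DURATION_SECONDS) AS REDUCED_SECONDS
FROM log_with_duration l
JOIN default_radiuses d
  ON l.DELIVERY_AREA_ID = d.DELIVERY_AREA_ID
 AND l.MONTH_YEAR = d.MONTH_YEAR
WHERE l.DELIVERY_RADIUS_METERS < d.DEFAULT_DELIVERY_RADIUS
GROUP BY l.DELIVERY_AREA_ID, l.MONTH_YEAR
```

Because MONTH_YEAR is now a real column in both inputs, the join could equally be written as USING (DELIVERY_AREA_ID, MONTH_YEAR). If two radii tie for the longest duration in a month, RANK() keeps both and the join will count rows twice; ROW_NUMBER() with a tie-breaker would avoid that.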
