
Calculation of Lead Time of Suppliers in BigQuery

I would like to calculate the lead time of our suppliers in BigQuery. Lead time is the delivery date to the customer (DELIVERY_DATE, table oli, below) minus the order date (ORDER_DATE, table olit, below).

In our e-commerce company, a single customer order may contain items from one or more suppliers. We therefore assign one SHIPMENT_NUMBER (table ol, below) per supplier in a given order.

We then calculate the lead time of a supplier as the average lead time of its SHIPMENT_NUMBERs.

For example, assume Supplier A appears in two orders in total, Order X and Order Y. If the lead time of Supplier A's shipment in Order X (SHIPMENT_NUMBER_1) is 10 hours and the lead time of its shipment in Order Y (SHIPMENT_NUMBER_2) is 30 hours, then Lead Time of Supplier A = (Lead Time of SHIPMENT_NUMBER_1 + Lead Time of SHIPMENT_NUMBER_2) / 2 = (10 + 30) / 2 = 20 hours.

A SHIPMENT_NUMBER is unique to a supplier in a given order, but a SHIPMENT_NUMBER may comprise multiple order lines (table ol, below). For example, SHIPMENT_NUMBER_1 may include two order lines. If the lead time of line 1 is 5 hours and the lead time of line 2 is 15 hours, then the lead time of SHIPMENT_NUMBER_1 is (5 + 15) / 2 = 10 hours.
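The two-level average described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not of the BigQuery query: the 5 h and 15 h lines come from the example, while the single 30 h line for SHIPMENT_NUMBER_2 is an assumption made to reproduce its 30-hour lead time.

```python
from statistics import mean

# Lead times in hours per order line, keyed by shipment number.
# SHIPMENT_NUMBER_1 uses the two lines from the example above; the
# single 30 h line for SHIPMENT_NUMBER_2 is an assumption.
line_lead_times = {
    "SHIPMENT_NUMBER_1": [5, 15],
    "SHIPMENT_NUMBER_2": [30],
}

# Step 1: average the order lines within each shipment.
shipment_lead_times = {
    shipment: mean(hours) for shipment, hours in line_lead_times.items()
}

# Step 2: average the shipment-level values, one value per shipment.
supplier_lead_time = mean(list(shipment_lead_times.values()))

print(shipment_lead_times)
print(supplier_lead_time)
```

Each shipment contributes exactly one value to the second average, regardless of how many order lines it has.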

I can easily calculate the lead time per SHIPMENT_NUMBER in SQL with the query below:

SELECT
    ol.SHIPMENT_NUMBER,
    avg(timestamp_diff(oli.DELIVERY_DATE, olit.ORDER_DATE, hour)) AS ORDERTOCUSTOMER
FROM
    ORDERLINE ol
    JOIN ORDERLINEITEMTRX olit on olit.order_line_sk = ol.ORDER_LINE_SK
    join ORDERLINEITEM oli ON olit.order_line_item_sk = oli.ORDER_LINE_ITEM_SK
    -- SUPPLIER must be joined so that the alias s in the WHERE clause resolves
    JOIN SUPPLIER s ON s.SUPPLIER_SK = olit.SUPPLIER_SK
WHERE
    s.SUPPLIER_ID = 'SupplierX'
group by
    ol.SHIPMENT_NUMBER

The results match my manual check.

However, when I aggregate at the SUPPLIER_ID level with the query below, I get the wrong result. To keep things simple, I worked with only one supplier both above and below. According to my manual check the result should be 45.3619 hours, but BigQuery reports 45.7695 hours.

SELECT
    s.SUPPLIER_ID,
    AVG(OTC.ORDERTOCUSTOMER)
FROM
    ORDERLINE ol
    JOIN ORDERLINEITEMTRX olit on olit.order_line_sk = ol.ORDER_LINE_SK
    join ORDERLINEITEM oli ON olit.order_line_item_sk = oli.ORDER_LINE_ITEM_SK
    RIGHT JOIN SUPPLIER s ON s.SUPPLIER_SK = olit.SUPPLIER_SK
    INNER JOIN (
        SELECT
            ol.SHIPMENT_NUMBER,
            avg(timestamp_diff(oli.DELIVERY_DATE, olit.ORDER_DATE, hour)) AS ORDERTOCUSTOMER
        FROM
            ORDERLINE ol
            JOIN ORDERLINEITEMTRX olit on olit.order_line_sk = ol.ORDER_LINE_SK
            join ORDERLINEITEM oli ON olit.order_line_item_sk = oli.ORDER_LINE_ITEM_SK
            -- SUPPLIER joined here as well so that s resolves inside the subquery
            JOIN SUPPLIER s ON s.SUPPLIER_SK = olit.SUPPLIER_SK
        -- WHERE has to come before GROUP BY
        WHERE s.SUPPLIER_ID = 'SupplierX'
        group by
            ol.SHIPMENT_NUMBER
    ) AS OTC ON ol.SHIPMENT_NUMBER = OTC.SHIPMENT_NUMBER
WHERE s.SUPPLIER_ID = 'SupplierX'
group by s.SUPPLIER_ID

What am I doing wrong? A sample dataset and the expected results are here: https://drive.google.com/file/d/1HdQkdhJxciHeHznTbie4bfzcIkRIHSfu/view?usp=sharing

Each shipment number may occur multiple times in the original order table because, as stated above, one shipment number may have one or many order lines. The challenge is therefore to average the lead times of the shipment numbers without overcounting, i.e., to average over unique shipment numbers.
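A small sketch of the overcounting effect, using the hypothetical numbers from the Supplier A example rather than the real dataset: if the shipment-level average is joined back to the order lines, a shipment with two lines contributes its value twice, so the supplier average becomes weighted by row count.

```python
from statistics import mean

# Shipment-level lead times in hours (values from the Supplier A example).
shipment_avg = {"SHIPMENT_NUMBER_1": 10, "SHIPMENT_NUMBER_2": 30}

# Hypothetical order-line rows: SHIPMENT_NUMBER_1 has two lines,
# SHIPMENT_NUMBER_2 has one. Joining the shipment averages back onto
# these rows repeats each average once per order line.
order_line_rows = ["SHIPMENT_NUMBER_1", "SHIPMENT_NUMBER_1", "SHIPMENT_NUMBER_2"]
joined_values = [shipment_avg[shipment] for shipment in order_line_rows]

# Averaging the joined rows weights each shipment by its number of lines.
overcounted = mean(joined_values)

# Averaging one value per unique shipment gives the intended result.
per_unique_shipment = mean(list(shipment_avg.values()))

print(overcounted)
print(per_unique_shipment)
```

The two results differ whenever shipments have different numbers of rows, which is consistent with the small gap between the manual and the reported figure.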

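I just added SUPPLIER_ID to your first query and then aggregated its output. Because that intermediate result has exactly one row per shipment, the outer average counts every SHIPMENT_NUMBER once, no matter how many order lines it has.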

WITH
shipment_lead_times as 
(
    -- one row per supplier and shipment
    SELECT
        s.SUPPLIER_ID,
        ol.SHIPMENT_NUMBER,
        avg(timestamp_diff(oli.DELIVERY_DATE, olit.ORDER_DATE, hour)) AS ORDERTOCUSTOMER
    FROM
        ORDERLINE ol
        JOIN ORDERLINEITEMTRX olit on olit.order_line_sk = ol.ORDER_LINE_SK
        join ORDERLINEITEM oli ON olit.order_line_item_sk = oli.ORDER_LINE_ITEM_SK
        -- SUPPLIER join (key taken from the question) so that the alias s resolves
        JOIN SUPPLIER s ON s.SUPPLIER_SK = olit.SUPPLIER_SK
    WHERE
        s.SUPPLIER_ID = 'SupplierX'
    group by
        ol.SHIPMENT_NUMBER,
        s.SUPPLIER_ID
)
-- average the shipment-level lead times, counting each shipment once
select 
    supplier_id,
    count(*) as shipments,
    avg(ordertocustomer) as avg_leadtime
from shipment_lead_times
group by supplier_id
