I have a table, supplynetwork, with four columns: CustomerID, SupplierID, Supplier_productID, Purchase_Year.
I want to construct customer pairs in which both customers purchased the same product from the same supplier in a focal year. I use a self-join to do this in BigQuery, but it is too slow. Is there an alternative?
select distinct
a.CustomerID as focal_CustomerID,
b.CustomerID as linked_CustomerID,
a.Purchase_Year,
a.Supplier_productID
from
supplynetwork as a,
supplynetwork as b
where
a.CustomerID<>b.CustomerID and
a.Purchase_Year=b.Purchase_Year and
a.Supplier_productID=b.Supplier_productID and
a.SupplierID=b.SupplierID
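Use explicit JOIN syntax and, where the database supports it, index the CustomerID column. (BigQuery has no conventional secondary indexes, so clustering the table on the join columns is the closest equivalent there.)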
select distinct
a.CustomerID as focal_CustomerID,
b.CustomerID as linked_CustomerID,
a.Purchase_Year,
a.Supplier_productID
from
supplynetwork as a join
supplynetwork as b
on
a.Purchase_Year=b.Purchase_Year and
a.Supplier_productID=b.Supplier_productID and
a.SupplierID=b.SupplierID
where a.CustomerID<>b.CustomerID
You can use aggregation to get all customers that meet the conditions in a single row:
select Purchase_Year, Supplier_productID, SupplierID,
array_agg(distinct CustomerID) as customers
from supplynetwork sn
group by Purchase_Year, Supplier_productID, SupplierID;
You can then get pairs using array operations:
with pss as (
select Purchase_Year, Supplier_productID, SupplierID,
array_agg(distinct CustomerID) as customers
from supplynetwork sn
group by Purchase_Year, Supplier_productID, SupplierID
)
select c1, c2, pss.*
from pss cross join
unnest(pss.customers) c1 cross join
unnest(pss.customers) c2
where c1 < c2;
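To see why grouping first helps, here is a minimal Python sketch of the same logic on a toy dataset. It is only an illustration, not BigQuery code: the rows and the function names `pairs_by_self_join` and `pairs_by_grouping` are made up for this example. The self-join compares every row with every other row, while the grouped version first collapses each (supplier, product, year) group to its distinct customers and only then builds pairs inside each group.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy rows: (CustomerID, SupplierID, Supplier_productID, Purchase_Year)
rows = [
    ("c1", "s1", "p1", 2020),
    ("c2", "s1", "p1", 2020),
    ("c3", "s1", "p1", 2020),
    ("c1", "s1", "p1", 2020),  # duplicate purchase row
    ("c4", "s2", "p9", 2020),  # only one customer in this group, so no pair
    ("c1", "s1", "p1", 2021),  # same product, different year
]


def pairs_by_self_join(rows):
    """Compare every row with every other row, like the self-join.

    Uses ca < cb (instead of <>) so each unordered pair appears once,
    matching the `c1 < c2` filter in the query above.
    """
    out = set()
    for ca, sa, pa, ya in rows:
        for cb, sb, pb, yb in rows:
            if ca < cb and (sa, pa, ya) == (sb, pb, yb):
                out.add((ca, cb, sa, pa, ya))
    return out


def pairs_by_grouping(rows):
    """Collect distinct customers per group first, then pair within each group."""
    groups = defaultdict(set)
    for c, s, p, y in rows:
        groups[(s, p, y)].add(c)  # the set plays the role of array_agg(distinct ...)
    out = set()
    for (s, p, y), customers in groups.items():
        for ca, cb in combinations(sorted(customers), 2):
            out.add((ca, cb, s, p, y))
    return out


self_join_pairs = pairs_by_self_join(rows)
grouped_pairs = pairs_by_grouping(rows)
print(sorted(grouped_pairs))
```

Both functions return the same set of pairs here, but the grouped version never compares rows from different groups and is not affected by duplicate purchase rows. Whether this translates into a real speed-up in BigQuery depends on the data, so it is worth measuring on the actual table.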
You can use CROSS JOIN over the unnested customer array. Even though it produces a Cartesian product within each group, it is simpler and may be cheaper. Try the query below and see whether it is cheaper than your baseline:
select
focal_CustomerID,
linked_CustomerID,
Purchase_Year,
Supplier_ProductID
from (
select
SupplierID,
Supplier_ProductID,
Purchase_Year,
array_agg(distinct CustomerID) as Customers
from `mydataset.mytable`
group by 1,2,3
), unnest(Customers) focal_CustomerID
cross join unnest(Customers) linked_CustomerID
where focal_CustomerID != linked_CustomerID