Normalize data in SQL query
I have a SQL query A (see below for more details) that returns a table like this:
cluster brand amount
0 bos 600
0 phi 300
0 har 100
1 pro 2500
1 wal 1500
1 ash 1000
2 dil 4200
2 sor 500
2 van 300
...
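However, instead of the amount itself, I want to show the proportion of that amount relative to the total amount in its cluster, as in this table: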
cluster brand amount
0 bos 0.60
0 phi 0.30
0 har 0.10
1 pro 0.50
1 wal 0.30
1 ash 0.20
2 dil 0.84
2 sor 0.10
2 van 0.06
...
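How can I change my SQL so that I can access the sum of all amounts in a cluster while still keeping multiple rows per cluster?
**Details**
SQL server: MySQL, accessed through the python-MySQL connector.
The current SQL query that produces the first table: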
SELECT c.cluster, brand, COUNT(o.id) AS brand_amount
FROM nyon_all.clustering AS c
LEFT JOIN nyon_all.persons AS p ON c.pid = p.id
LEFT JOIN nyon_all.orders AS o ON p.id = o.pid
LEFT JOIN nyon_all.articles AS a ON o.aid = a.id
LEFT JOIN nyon_all.brands AS ab ON a.brand_id = ab.id
WHERE c.cluster_round = 'Org_2014-08-27_10:45:35'
GROUP BY cluster, brand
HAVING brand_amount > 100
ORDER BY c.cluster ASC, brand_amount DESC;
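Table orders (primary key id) links persons (foreign key pid) to articles (foreign key aid). Each article has a specific brand (foreign key brand_id), whose name is stored in table brands.
The total number of articles per cluster can be retrieved with the following SQL query: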
SELECT c.cluster, COUNT(o.pid) AS amount
FROM nyon_all.clustering AS c
LEFT JOIN nyon_all.persons AS p ON c.pid = p.id
LEFT JOIN nyon_all.orders AS o ON p.id = o.pid
WHERE c.cluster_round = 'Org_2014-08-27_10:45:35'
GROUP BY cluster
ORDER BY c.cluster ASC, amount DESC;
Result:
cluster amount
0 1000
1 5000
2 5000
However, I can't seem to combine the two SQL queries.
You can join against a subquery that computes the sum per cluster:
select t1.cluster, amount / sumAmount
from Table1 t1
join (select cluster, sum(amount) as sumAmount
from Table1
group by cluster)s
on t1.cluster = s.cluster
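To check the idea on the numbers from the question, here is a minimal sketch that runs the same join-against-a-subquery pattern on the nine sample rows. It is an illustration under assumptions: it uses Python's built-in sqlite3 with an in-memory database instead of MySQL, the table name Table1 follows the snippet above, and the function name cluster_shares is made up for this example.

```python
import sqlite3

# Sample rows copied from the first table in the question.
ROWS = [
    (0, "bos", 600), (0, "phi", 300), (0, "har", 100),
    (1, "pro", 2500), (1, "wal", 1500), (1, "ash", 1000),
    (2, "dil", 4200), (2, "sor", 500), (2, "van", 300),
]


def cluster_shares(rows):
    """Return (cluster, brand, share) rows; hypothetical helper for this sketch."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Table1 (cluster INTEGER, brand TEXT, amount INTEGER)"
    )
    conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?)", rows)
    # The subquery s holds one row per cluster with the cluster total.
    # Joining it back keeps every (cluster, brand) row of Table1.
    # "1.0 *" forces floating-point division in SQLite; MySQL's "/" already
    # returns a non-integer result for integer operands.
    query = """
        SELECT t1.cluster, t1.brand, 1.0 * t1.amount / s.sumAmount AS share
        FROM Table1 AS t1
        JOIN (SELECT cluster, SUM(amount) AS sumAmount
              FROM Table1
              GROUP BY cluster) AS s
          ON t1.cluster = s.cluster
        ORDER BY t1.cluster ASC, share DESC
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for cluster, brand, share in cluster_shares(ROWS):
        print(cluster, brand, round(share, 2))
```

The shares within each cluster add up to 1, which is a quick sanity check that the per-cluster total was joined to the right rows.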
Edit
SELECT
c.cluster,
brand,
-- NULLIF turns a zero total into NULL instead of dividing by 0;
-- MAX() keeps the query valid when ONLY_FULL_GROUP_BY is enabled
COUNT(o.id) / NULLIF(MAX(s.sumBrandAmount), 0) AS brand_amount
FROM nyon_all.clustering AS c
LEFT JOIN nyon_all.persons AS p ON c.pid = p.id
LEFT JOIN nyon_all.orders AS o ON p.id = o.pid
LEFT JOIN nyon_all.articles AS a ON o.aid = a.id
LEFT JOIN nyon_all.brands AS ab ON a.brand_id = ab.id
LEFT JOIN (select c1.cluster, count(o1.id) as sumBrandAmount
from nyon_all.clustering c1
left join nyon_all.persons p1 on p1.id = c1.pid
left join nyon_all.orders as o1 on o1.pid = p1.id
where c1.cluster_round = 'Org_2014-08-27_10:45:35' -- same filter as the main query
group by c1.cluster) s
ON s.cluster = c.cluster
WHERE c.cluster_round = 'Org_2014-08-27_10:45:35'
GROUP BY c.cluster, brand
HAVING COUNT(o.id) > 100 -- filter on the raw count, not on the ratio
ORDER BY c.cluster ASC, brand_amount DESC;
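The full query can be tried end to end on a toy copy of the schema. The sketch below is only an approximation under stated assumptions: it uses Python's sqlite3 in memory rather than MySQL, the column lists are inferred from the joins in the question (the brands.brand column in particular is a guess), the rows and round names 'r0' and 'r1' are invented, and brand_shares is a hypothetical helper. It groups the subquery by cluster, applies the same cluster_round filter inside it, and filters on the raw count so that the threshold is not compared against the ratio.

```python
import sqlite3

SCHEMA = """
CREATE TABLE clustering (pid INTEGER, cluster INTEGER, cluster_round TEXT);
CREATE TABLE persons (id INTEGER PRIMARY KEY);
CREATE TABLE brands (id INTEGER PRIMARY KEY, brand TEXT);
CREATE TABLE articles (id INTEGER PRIMARY KEY, brand_id INTEGER);
CREATE TABLE orders (id INTEGER PRIMARY KEY, pid INTEGER, aid INTEGER);
"""

QUERY = """
SELECT c.cluster, ab.brand,
       1.0 * COUNT(o.id) / NULLIF(MAX(s.total), 0) AS share
FROM clustering AS c
LEFT JOIN persons AS p ON c.pid = p.id
LEFT JOIN orders AS o ON p.id = o.pid
LEFT JOIN articles AS a ON o.aid = a.id
LEFT JOIN brands AS ab ON a.brand_id = ab.id
LEFT JOIN (SELECT c1.cluster, COUNT(o1.id) AS total
           FROM clustering AS c1
           LEFT JOIN orders AS o1 ON o1.pid = c1.pid
           WHERE c1.cluster_round = :round
           GROUP BY c1.cluster) AS s
       ON s.cluster = c.cluster
WHERE c.cluster_round = :round
GROUP BY c.cluster, ab.brand
HAVING COUNT(o.id) >= :min_count
ORDER BY c.cluster ASC, share DESC
"""


def brand_shares(cluster_round, min_count=1):
    """Run the share query on invented toy data (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Persons 1 and 2 are in cluster 0, person 3 in cluster 1 for round 'r1'.
    # The 'r0' row checks that the round filter is applied in both places.
    conn.executemany("INSERT INTO clustering VALUES (?, ?, ?)",
                     [(1, 0, "r1"), (2, 0, "r1"), (3, 1, "r1"), (1, 1, "r0")])
    conn.executemany("INSERT INTO persons VALUES (?)", [(1,), (2,), (3,)])
    conn.executemany("INSERT INTO brands VALUES (?, ?)",
                     [(1, "bos"), (2, "phi")])
    conn.executemany("INSERT INTO articles VALUES (?, ?)", [(10, 1), (11, 2)])
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                     [(1, 1, 10), (2, 1, 10), (3, 2, 10), (4, 2, 11),
                      (5, 3, 11), (6, 3, 11)])
    rows = conn.execute(
        QUERY, {"round": cluster_round, "min_count": min_count}
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in brand_shares("r1"):
        print(row)
```

Raising min_count drops small brands from the output, but the remaining shares are still measured against the full cluster total, because the total is computed in the subquery before the HAVING filter applies.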