Calculate average of values between 2 columns sql
I have a table called validation_errors that looks like this:
+-------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| link | varchar(200) | NO | MUL | NULL | |
| message | varchar(500) | NO | | | |
| explanation | mediumtext | NO | | NULL | |
| type | varchar(50) | NO | | | |
| subtype | varchar(50) | NO | | | |
| message_id | varchar(50) | NO | | | |
+-------------+--------------+------+-----+---------+----------------+
The links table looks like this:
+-----------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+--------------+------+-----+---------+-------+
| link | varchar(200) | NO | PRI | NULL | |
| visited | tinyint(1) | NO | | 0 | |
| validated | tinyint(1) | NO | | 0 | |
+-----------+--------------+------+-----+---------+-------+
I wish to calculate the average number of validation errors per page per top-level domain. I have a query that fetches the number of pages per top-level domain:
SELECT substr(link, - instr(reverse(link), '.')) as domain , count(*) as count
FROM links
GROUP BY domain
ORDER BY count desc
limit 30;
And I have a SQL query that fetches the number of validation errors per top-level domain:
SELECT substr(link, - instr(reverse(link), '.')) as domain ,count(*) as count
FROM validation_errors
GROUP BY domain
ORDER BY count desc
limit 30;
What I now need to do is combine them into one query and divide the results of one column by the other, and I can't figure out how to do it.
Any help would be greatly appreciated.
First, use substring_index() rather than your construct. Here is the query to join them together:
select domain, sum(numviews) as numviews, sum(numerrors) as numerrors,
sum(numerrors) / nullif(sum(numviews), 0) as error_rate
from ((SELECT substring_index(link, '.', -1) as domain , count(*) as numviews, 0 as numerrors
FROM links
GROUP BY domain
) UNION ALL
(SELECT substring_index(link, '.', -1) as domain , 0, count(*)
FROM validation_errors
GROUP BY domain
)
) d
GROUP BY domain;
With both variables, I don't know which 30 you want to choose, so I haven't included an order by.
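If the goal is to keep the 30 domains with the most pages, as in the original queries, one possible ordering is sketched below. The sort key (total page count) is an assumption; ordering by error_rate instead would work the same way.

```sql
-- Sketch: same UNION ALL aggregation, limited to the 30 domains with the most pages.
SELECT domain,
       SUM(numviews)  AS numviews,
       SUM(numerrors) AS numerrors,
       SUM(numerrors) / NULLIF(SUM(numviews), 0) AS error_rate
FROM (
    (SELECT SUBSTRING_INDEX(link, '.', -1) AS domain, COUNT(*) AS numviews, 0 AS numerrors
     FROM links
     GROUP BY domain)
    UNION ALL
    (SELECT SUBSTRING_INDEX(link, '.', -1) AS domain, 0, COUNT(*)
     FROM validation_errors
     GROUP BY domain)
) d
GROUP BY domain
ORDER BY SUM(d.numviews) DESC   -- assumption: rank by page count
LIMIT 30;
```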
Note that this doesn't use a join; it uses union all with aggregation. This ensures that you will get all domains, even those with no views and those with no errors.
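As a quick check of the first point, the two expressions can be compared on a literal value. The hostname here is made up for illustration. With a negative count, SUBSTRING_INDEX returns everything to the right of the last delimiter, so the result has no leading dot, whereas the original substr/instr/reverse construct keeps it.

```sql
-- Hypothetical sample value, not taken from the links table.
SELECT SUBSTRING_INDEX('www.example.com', '.', -1) AS domain;
-- com

SELECT SUBSTR('www.example.com', -INSTR(REVERSE('www.example.com'), '.')) AS domain;
-- .com
```

Both forms group the same rows together; only the label differs.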