
Equating a column in a database with equal rows using SQL
Table structure:
uid : integer
answer_id : integer
I need to run a query that tells me which uids have the same answers as other uids. For example, here is some test data:
answer_id uid
1 555
4 555
7 555
10 555
1 123
5 123
7 123
10 123
So from this data we can see that the two users answered 3/4 of the questions the same way.
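The 3/4 figure can be checked by hand with a small sketch. This is only an illustration in Python, not part of the original question; the function name `overlap_fraction` is hypothetical, and it assumes "similarity" means shared answer_ids divided by the first user's total.

```python
from collections import defaultdict

# Sample rows from the test data above: (answer_id, uid).
ROWS = [(1, 555), (4, 555), (7, 555), (10, 555),
        (1, 123), (5, 123), (7, 123), (10, 123)]

def overlap_fraction(rows, uid_a, uid_b):
    """Fraction of uid_a's answers that uid_b also gave (hypothetical helper)."""
    answers = defaultdict(set)
    for answer_id, uid in rows:
        answers[uid].add(answer_id)
    if not answers[uid_a]:
        return 0.0  # uid_a has no answers, so nothing can match
    common = answers[uid_a] & answers[uid_b]
    return len(common) / len(answers[uid_a])

if __name__ == "__main__":
    print(overlap_fraction(ROWS, 555, 123))
```

Users 555 and 123 share answer_ids 1, 7 and 10, which is 3 of the 4 answers each of them gave.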
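I'm trying to work out how to write a query that tells me which uids match on 3/4 or 4/4 of the same answers. Basically, I'm trying to find users whose answers are 75% (3/4) or more (4/4) similar.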
This is part of a Ruby on Rails application, so I have all the models built [User, UserAnswers, etc...], but I assume this is just a SQL query and not necessarily part of ActiveRecord.
This query shows the number of answers each user has in common with each other user:
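declare @uid int
set @uid = 555   -- example value taken from the test data; use the uid you care about
select
    ans1.uid as user1,
    ans2.uid as user2,
    count(*) as common
from
    ans ans1 inner join ans ans2
        on ans1.answer_id = ans2.answer_id
        and ans1.uid <> ans2.uid
where ans1.uid = @uid          -- qualified: a bare uid is ambiguous in a self-join
group by ans1.uid, ans2.uid    -- group by the columns rather than the select aliases
having count(*) > 0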
This one also shows the number of questions each user answered:
select
    ans1.uid as user1,
    ans2.uid as user2,
    count(distinct ans1.answer_id) as total1,
    count(distinct ans2.answer_id) as total2,
    sum(case when ans1.answer_id = ans2.answer_id then 1 else 0 end) as common
from
    ans ans1 inner join ans ans2 on ans1.uid <> ans2.uid
group by ans1.uid, ans2.uid    -- group by the columns rather than the select aliases
having count(*) > 0
(This second query can be very slow.)
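FThiella's answer works. However, doing the Cartesian product join is unnecessary. The following version produces the same counts without such a complex join: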
select ans1.uid as user1,
       ans2.uid as user2,
       max(ans1.numanswers) as total1,
       max(ans2.numanswers) as total2,
       count(*) as common
from (select a.*, count(*) over (partition by uid) as numanswers
      from UserAnswers a
     ) ans1 inner join
     (select a.*, count(*) over (partition by uid) as numanswers
      from UserAnswers a
     ) ans2
     on ans1.uid <> ans2.uid and
        ans1.answer_id = ans2.answer_id
group by ans1.uid, ans2.uid
As with the other answer, this does not include pairs of users who have no answers in common.
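Neither query applies the 75% cutoff that the question asks for. The following is a minimal sketch of one way to add it, not the method from either answer. It assumes an in-memory SQLite database driven from Python, a table named UserAnswers with unique (uid, answer_id) pairs, and that "similar" means common answers divided by the first user's total. It swaps the window function for a per-user totals subquery, and the name `similar_users` is hypothetical.

```python
import sqlite3

# Sample rows from the question: (answer_id, uid).
ROWS = [(1, 555), (4, 555), (7, 555), (10, 555),
        (1, 123), (5, 123), (7, 123), (10, 123)]

QUERY = """
with totals as (
    select uid, count(*) as numanswers
    from UserAnswers
    group by uid
)
select a1.uid as user1,
       a2.uid as user2,
       count(*) as common,
       t1.numanswers as total1
from UserAnswers a1
join UserAnswers a2
  on a1.answer_id = a2.answer_id and a1.uid <> a2.uid
join totals t1 on t1.uid = a1.uid
group by a1.uid, a2.uid, t1.numanswers
-- common / total1 >= 3/4, written with integers to avoid integer division
having count(*) * 4 >= t1.numanswers * 3
order by user1, user2
"""

def similar_users(rows):
    """Return (user1, user2, common, total1) for pairs at or above 75%."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table UserAnswers (answer_id integer, uid integer)")
    conn.executemany("insert into UserAnswers values (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    for row in similar_users(ROWS):
        print(row)
```

Each qualifying pair appears twice, once from each user's point of view, because the ratio is taken against user1's total. Adding `a1.uid < a2.uid` to the join condition would keep one row per pair, at the cost of only measuring against the lower uid's total.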