How do I calculate percentile based on frequency of occurrence in a postgres table?

I have a table called votes. It includes three columns: voter_id, candidate_id, and is_citizen (bool).

Voters can vote many times. Each time a voter votes for a candidate, a row is added to the table with the voter_id, the candidate_id, and is_citizen, which indicates whether or not the voter is a citizen. Voters may appear many times in the table if they voted for many candidates. Candidates may appear many times in the table if many people voted for them. Each voter_id/candidate_id pairing must be unique.

Given a candidate_id, I want to figure out that candidate's percent rank based on how many times they appear in the table. For example, let's say we have three candidates in total, with candidate_ids 1, 2, and 3. Candidate 1 got 5 votes from citizens, candidate 2 got 7, and candidate 3 got 20. Note: I don't want to factor in any votes from noncitizens.

Given candidate_id 2, it should return .5, as candidate 2 is in the 50th percentile by frequency of occurrence (not by total number of votes).

I've been workshopping it, and this is as far as I've gotten, but it's still giving me errors :(

SELECT
  candidate_id,
  PERCENT_RANK() WITHIN GROUP (ORDER BY COUNT(*) DESC)
FROM votes
GROUP BY candidate_id
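HAVING candidate_id = <candidate_id>;

Use percent_rank() as a window function over the grouped counts, then filter to the candidate in an outer query:

with candidate_rank as (
  select candidate_id,
         -- rank each candidate by vote count; desc gives the top candidate 0
         percent_rank() over (order by count(*) desc) as pct_rank
    from votes
   where is_citizen  -- boolean column: count only citizen votes
   group by candidate_id
)
select candidate_id, pct_rank
  from candidate_rank
 where candidate_id = <candidate_id>;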
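
As a sanity check on the example from the question, here is a small plain-Python sketch of what percent_rank() computes when ordering by count descending, namely (rank - 1) / (rows - 1). The function name percent_rank_desc and the citizen_votes dict are illustrative only and not part of the original post; the numbers are the citizen vote counts given in the question.

```python
def percent_rank_desc(counts):
    """Mimic percent_rank() OVER (ORDER BY count DESC) for a dict of id -> count."""
    n = len(counts)
    result = {}
    for cid, c in counts.items():
        # rank() is 1 plus the number of rows that sort strictly before this one
        rank = 1 + sum(1 for other in counts.values() if other > c)
        # percent_rank is defined as 0 when there is only a single row
        result[cid] = 0.0 if n == 1 else (rank - 1) / (n - 1)
    return result


# Citizen vote counts from the question: candidate_id -> votes
citizen_votes = {1: 5, 2: 7, 3: 20}
ranks = percent_rank_desc(citizen_votes)
print(ranks[2])
```

With a descending order, the candidate with the most votes gets 0 and the one with the fewest gets 1. Ordering ascending would flip those two, and candidate 2 would still land on 0.5.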
