
Calculate frequency for each row

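I'm trying to calculate the frequency of the elements in each row. To explain: I select from a table that contains columns such as "pos, chr, ref, alt, id_disease".

From these I need to extract the frequency of each ref/alt pair, which is: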

num_occurrences_of(ref='A' and alt='C') / total number of rows
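
To make the definition above concrete, here is a minimal Python sketch (not part of the original post) that counts every (ref, alt) pair and divides by the total number of rows. The sample rows and the helper name `pair_frequencies` are hypothetical and only serve as an illustration.

```python
from collections import Counter
from fractions import Fraction


def pair_frequencies(rows):
    """Return {(ref, alt): occurrences / total rows} for an iterable of (ref, alt) tuples."""
    # Normalise case so that 'a'/'c' and 'A'/'C' count as the same pair.
    rows = [(ref.upper(), alt.upper()) for ref, alt in rows]
    total = len(rows)
    counts = Counter(rows)
    return {pair: Fraction(n, total) for pair, n in counts.items()}


# Hypothetical sample: 4 rows, two of them A -> C.
sample = [("A", "C"), ("T", "T"), ("a", "c"), ("G", "A")]
freqs = pair_frequencies(sample)
print(float(freqs[("A", "C")]))  # 2 of the 4 rows are A -> C
```

The frequencies of all pairs sum to 1, which is a quick sanity check for any SQL version of the same calculation.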

With this query I am nowhere near my goal: the frequency is not calculated correctly, and the value it returns is always a constant.

SELECT pos, chr, upper(ref||' '||alt) AS refalt, id_disease AS lvl15, t1.tot_var, t1.freq
FROM varianti 
JOIN ( SELECT count(*) AS tot_var,(count(*)::numeric / sum(count(*)) over ()) as freq
       FROM varianti)t1 ON TRUE
WHERE length(ref)=1 AND length(alt)=1 AND chr similar to 'chr[\d X Y]*'

All I want is to retrieve data like this:

chr pos refalt lvl15 freq tot_var
1   120  AT     15    0.3  1000
1   150  CG     30    0.01 1000

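tot_var = the total count of the rows for which I need it (it can't be 1, I am counting every row!)

Both ref and alt can take the values (A, T, C, G) in every permutation: AA, AT, TA, TC, CT, etc.

What is missing in my code?

If you need more information, just let me know.


Example of varianti: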

chr pos ref alt id_disease
chr1 152 A   C    15
chr3 487 T   T    74

This is the output of my query:

pos         chr    refalt  lvl15  tot_var  freq
124338543   chr11  G A     69     1        0.000000677833751782702767
124338595   chr11  C T     28     1        0.000000677833751782702767
124361862   chr11  C .     53     1        0.000000677833751782702767
124361899   chr11  T A     20     1        0.000000677833751782702767

Based on the information you provided:

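SELECT chr, pos,
       upper(ref || ' ' || alt) AS refalt,
       id_disease AS lvl15,
       -- share of all rows with ref = 'A' and alt = 'C'; the cast avoids integer division
       (SUM(CASE WHEN ref = 'A' AND alt = 'C' THEN 1 ELSE 0 END) OVER ())::numeric
           / COUNT(*) OVER () AS freq,
       COUNT(*) OVER () AS tot_var
FROM varianti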

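I'm still not sure what 'tot_var' is. It would be helpful to have a sample of the actual data, along with the expected output for that sample.

Edit 1: get the frequency of each pair in the dataset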

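SELECT upper(ref || ' ' || alt) AS refalt,
       -- rows in this pair divided by all rows; the cast avoids integer division
       COUNT(*)::numeric / SUM(COUNT(*)) OVER () AS freq
FROM varianti
GROUP BY upper(ref || ' ' || alt)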
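
As a rough, runnable illustration of the per-pair frequency, the sketch below loads a few rows into an in-memory SQLite database through Python's `sqlite3` module. It is an assumption-laden stand-in for the real PostgreSQL table: only the first two rows come from the question, the other two are invented, and the total is taken with a scalar subquery instead of PostgreSQL's `::numeric` cast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE varianti (chr TEXT, pos INTEGER, ref TEXT, alt TEXT, id_disease INTEGER)"
)
conn.executemany(
    "INSERT INTO varianti VALUES (?, ?, ?, ?, ?)",
    [
        ("chr1", 152, "A", "C", 15),
        ("chr3", 487, "T", "T", 74),
        ("chr1", 900, "a", "c", 15),  # hypothetical extra row
        ("chr2", 310, "G", "A", 20),  # hypothetical extra row
    ],
)

# Count the rows of each pair and divide by the overall row count.
# Multiplying by 1.0 forces floating-point instead of integer division.
query = """
SELECT upper(ref || ' ' || alt) AS refalt,
       COUNT(*) * 1.0 / (SELECT COUNT(*) FROM varianti) AS freq
FROM varianti
GROUP BY upper(ref || ' ' || alt)
ORDER BY refalt
"""
pair_freq = dict(conn.execute(query).fetchall())
print(pair_freq)
```

Dividing `COUNT(chr)` by `COUNT(*)` inside one group would compare the group with itself, so the denominator has to come from the whole table, as it does here.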

Edit 2: updated the query as requested

SELECT v.chr, v.pos,
       upper(v.ref || ' ' || v.alt) AS refalt,
       v.id_disease AS lvl15,
       refalt_table.freq,
       COUNT(*) OVER () AS tot_var
FROM varianti v
JOIN
( SELECT upper(ref || ' ' || alt) AS refalt,
         COUNT(*)::numeric / SUM(COUNT(*)) OVER () AS freq
  FROM varianti
  GROUP BY upper(ref || ' ' || alt)
) refalt_table ON refalt_table.refalt = upper(v.ref || ' ' || v.alt)

Edit 3: updated the query based on the error

SELECT v.chr, v.pos,
       upper(v.ref || ' ' || v.alt) AS refalt,
       v.id_disease AS lvl15,
       refalt_table.freq,
       -- number of rows that share this position
       (SELECT COUNT(*) FROM varianti tot WHERE tot.pos = v.pos) AS tot_var
FROM varianti v
LEFT JOIN
( SELECT upper(ref) AS ref, upper(alt) AS alt,
         -- the cast avoids integer division, which would always yield 0
         COUNT(*)::numeric / (SELECT COUNT(*) FROM varianti) AS freq
  FROM varianti
  GROUP BY upper(ref), upper(alt)
) refalt_table ON refalt_table.ref = upper(v.ref) AND refalt_table.alt = upper(v.alt)
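
A possible alternative to joining back against a grouped subquery is to attach the frequency to every row with window functions. The sketch below is only an illustration under stated assumptions: it runs on an in-memory SQLite database (version 3.25 or later is needed for window functions) via Python's `sqlite3`, the last two sample rows are invented, and `tot_var` is assumed to mean the total number of rows in the table, which the question does not state unambiguously.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE varianti (chr TEXT, pos INTEGER, ref TEXT, alt TEXT, id_disease INTEGER)"
)
conn.executemany(
    "INSERT INTO varianti VALUES (?, ?, ?, ?, ?)",
    [
        ("chr1", 152, "A", "C", 15),
        ("chr3", 487, "T", "T", 74),
        ("chr1", 900, "a", "c", 15),  # hypothetical extra row
        ("chr2", 310, "G", "A", 20),  # hypothetical extra row
    ],
)

# Every row keeps its own columns; the window functions add the pair count
# (PARTITION BY) and the overall row count (empty OVER clause).
query = """
SELECT chr, pos,
       upper(ref || ' ' || alt) AS refalt,
       id_disease AS lvl15,
       COUNT(*) OVER (PARTITION BY upper(ref), upper(alt)) * 1.0
           / COUNT(*) OVER () AS freq,
       COUNT(*) OVER () AS tot_var
FROM varianti
ORDER BY pos
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In PostgreSQL the `* 1.0` factor could be replaced by a `::numeric` cast. Any WHERE filter (for example on `length(ref)`) is applied before the window functions, so the total would then count only the filtered rows.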

