Counting occurrences of features in sqlite
I have a dataset of word triples and their frequencies, for example:
w1 w2 w3 freq
a a a 4
a a and 3
a a band 1
a a well 1
a and a 2
I want to compute counts according to the following table:
              (w3)   not(w3)
(w1,w2)       n1     n2
not(w1,w2)    n3     n4
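where n1, ..., n4 are the sums of the frequencies of the observations that satisfy each condition. For example, in the first observation, w1 = a, w2 = a, w3 = a. We first look at all observations with w1 = a, w2 = a, w3 = a. Only one observation satisfies that condition, with frequency 4, so n1 = 4. Next, for w1 = a, w2 = a, w3 != a, the matching observations have frequencies 3, 1, 1, which sum to 5, so n2 = 5. Finally, w1 != a, w2 != a, w3 = a gives n3 = 0, and w1 != a, w2 != a, w3 != a gives n4 = 0.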
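To make the definition concrete, here is a minimal sketch in plain Python (no database yet) that computes the four cells for one trigram over the sample rows. The function name `contingency` is hypothetical, and it assumes that not(w1,w2) means both w1 and w2 differ, which is how the example above is worded.

```python
# Sample rows from the question: (w1, w2, w3, freq)
rows = [
    ("a", "a", "a", 4),
    ("a", "a", "and", 3),
    ("a", "a", "band", 1),
    ("a", "a", "well", 1),
    ("a", "and", "a", 2),
]


def contingency(rows, w1, w2, w3):
    """Return (n1, n2, n3, n4) for the trigram (w1, w2, w3)."""
    n1 = n2 = n3 = n4 = 0
    for r1, r2, r3, freq in rows:
        same_prefix = r1 == w1 and r2 == w2
        # Assumption: not(w1,w2) requires BOTH words to differ,
        # following the wording "w1 != a, w2 != a" in the question.
        other_prefix = r1 != w1 and r2 != w2
        if same_prefix and r3 == w3:
            n1 += freq
        elif same_prefix:
            n2 += freq
        elif other_prefix and r3 == w3:
            n3 += freq
        elif other_prefix:
            n4 += freq
    return n1, n2, n3, n4


print(contingency(rows, "a", "a", "a"))    # (4, 5, 0, 0)
print(contingency(rows, "a", "a", "and"))  # (3, 6, 0, 0)
```

Under this reading, a row such as (a, and, a) falls into none of the four cells for the trigram (a, a, a), because only one of its first two words differs.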
I would like a table whose output looks like this:
w1 w2 w3 freq n1 n2 n3 n4
a a a 4 4 5 0 0
a a and 3 3 6 0 0
a a band 1
a a well 1
a and a 2
etc.
How can I do this using sqlite3?
This can be done with correlated scalar subqueries:
SELECT w1,
       w2,
       w3,
       freq,
       (SELECT SUM(freq)
        FROM MyLittleTable AS T2
        WHERE T2.w1 = T1.w1
          AND T2.w2 = T1.w2
          AND T2.w3 = T1.w3
       ) AS n1,
       (SELECT SUM(freq)
        FROM MyLittleTable AS T2
        WHERE T2.w1 = T1.w1
          AND T2.w2 = T1.w2
          AND T2.w3 != T1.w3
       ) AS n2,
       ...
FROM MyLittleTable AS T1
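As a rough, runnable sketch of the same approach, the script below loads the sample rows into an in-memory database through Python's `sqlite3` module and runs the query with all four subqueries. Two things here are assumptions and not part of the answer: the n3 and n4 conditions, which the answer elides, are filled in from the question's wording (w1 and w2 both differ), and `COALESCE(..., 0)` is added because `SUM` returns NULL when no row matches, whereas the desired output shows 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyLittleTable (w1 TEXT, w2 TEXT, w3 TEXT, freq INTEGER)"
)
conn.executemany(
    "INSERT INTO MyLittleTable VALUES (?, ?, ?, ?)",
    [
        ("a", "a", "a", 4),
        ("a", "a", "and", 3),
        ("a", "a", "band", 1),
        ("a", "a", "well", 1),
        ("a", "and", "a", 2),
    ],
)

# n1 and n2 follow the answer; n3 and n4 are an assumed completion
# based on the question's "w1 != a, w2 != a" wording.
QUERY = """
SELECT w1, w2, w3, freq,
       (SELECT COALESCE(SUM(freq), 0) FROM MyLittleTable AS T2
         WHERE T2.w1 = T1.w1 AND T2.w2 = T1.w2 AND T2.w3 = T1.w3) AS n1,
       (SELECT COALESCE(SUM(freq), 0) FROM MyLittleTable AS T2
         WHERE T2.w1 = T1.w1 AND T2.w2 = T1.w2 AND T2.w3 != T1.w3) AS n2,
       (SELECT COALESCE(SUM(freq), 0) FROM MyLittleTable AS T2
         WHERE T2.w1 != T1.w1 AND T2.w2 != T1.w2 AND T2.w3 = T1.w3) AS n3,
       (SELECT COALESCE(SUM(freq), 0) FROM MyLittleTable AS T2
         WHERE T2.w1 != T1.w1 AND T2.w2 != T1.w2 AND T2.w3 != T1.w3) AS n4
FROM MyLittleTable AS T1
ORDER BY T1.rowid
"""

results = conn.execute(QUERY).fetchall()
for row in results:
    print(row)
```

On the five sample rows, the first two result rows match the n1 to n4 values given in the question. Each subquery rescans the table once per outer row, so on a large table an index on (w1, w2, w3) is likely to help.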