Finding correlations in SQL Server

I want to know whether the following can be done ENTIRELY in SQL Server.

I have a table with 3 columns: SENTENCE ID (PK), SENTENCE (strings of arbitrary length), and PATTERNS (2- or 3-word patterns found in the SENTENCE).

I need to find the correlation of all the distinct PATTERNS with each other.

If I do it externally (using Python and ODBC), I need to go through the following steps:

FOR each distinct PATTERN

  1. Get the count of the PATTERN.
  2. Find all the sentence IDs that have that PATTERN.
  3. Get the counts of all PATTERNS that occur in those sentence IDs.
  4. Append the current PATTERN and its count (as columns) to the table from step 3.
  5. Append that table as rows to the result table.

Next
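
For reference, the loop above can be sketched in plain Python. This is only an illustration under assumptions: the data is taken to be already flattened into (sentence_id, pattern) pairs, a pattern is counted once per sentence, and the sample rows and the function name `pattern_cooccurrence` are hypothetical. The ODBC fetch is left out.

```python
from collections import Counter, defaultdict

# Hypothetical sample data: one (sentence_id, pattern) pair per pattern found in a sentence.
rows = [
    (1, "big dog"),
    (1, "dog barks"),
    (2, "big dog"),
    (2, "red car"),
    (3, "dog barks"),
]


def pattern_cooccurrence(rows):
    """Return (pattern, pattern_count, other_pattern, co_count) tuples."""
    ids_by_pattern = defaultdict(set)   # pattern -> sentence IDs containing it
    patterns_by_id = defaultdict(set)   # sentence ID -> patterns it contains
    for sentence_id, pattern in rows:
        ids_by_pattern[pattern].add(sentence_id)
        patterns_by_id[sentence_id].add(pattern)

    result = []
    for pattern in sorted(ids_by_pattern):
        sentence_ids = ids_by_pattern[pattern]        # step 2
        pattern_count = len(sentence_ids)             # step 1
        co_counts = Counter()                         # step 3
        for sentence_id in sentence_ids:
            co_counts.update(patterns_by_id[sentence_id])
        for other, n in sorted(co_counts.items()):    # steps 4 and 5
            result.append((pattern, pattern_count, other, n))
    return result


for row in pattern_cooccurrence(rows):
    print(row)
```

Each output row pairs the current pattern and its count with another pattern and the number of sentences that contain both, which is the same information the result table in step 5 accumulates.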

Let me assume that PATTERN follows the form of a LIKE expression, and that you want to count a pattern for a sentence only once.

If so, you can do the following. Get the matches between all sentences and patterns, then self-join them to count the pairs:

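-- sp: one row per (sentence, pattern) match, with the number of sentences per pattern
with sp as (
       select s.sentenceID, p.pattern, count(*) over (partition by p.pattern) as NumSentences
       from Sentences s join
            Patterns p
            on s.sentence like p.pattern
     )
select sp1.pattern as Pattern1, sp2.pattern as Pattern2,
       sp1.NumSentences as Pattern1Count, sp2.NumSentences as Pattern2Count,
       count(*) as BothCount
from sp sp1 join
     sp sp2
     on sp1.sentenceID = sp2.sentenceID and   -- pair patterns found in the same sentence
        sp1.pattern < sp2.pattern             -- <= if you want counts for a single pattern
group by sp1.pattern, sp2.pattern, sp1.NumSentences, sp2.NumSentences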

You don't explicitly say what kind of output you want, but this should be sufficient.
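
To try a query of this shape without a SQL Server instance, one option is to run it against an in-memory SQLite database from Python. This is a sketch under assumptions, not the answer's exact setup: SQLite stands in for SQL Server, the table and column names (`Sentences`, `Patterns`, `sentenceID`, `sentence`, `pattern`) follow the query above, the sample rows are made up, and the two sides of the pair are joined on the same sentence ID. Window functions need SQLite 3.25 or later, and SQLite's `LIKE` is case-insensitive for ASCII by default, whereas SQL Server's behaviour depends on the collation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical sample data; each pattern is stored as a LIKE expression.
conn.executescript("""
    create table Sentences (sentenceID integer primary key, sentence text);
    create table Patterns (pattern text primary key);
    insert into Sentences values
        (1, 'the big dog barks'),
        (2, 'a big dog and a red car'),
        (3, 'the dog barks loudly');
    insert into Patterns values ('%big dog%'), ('%dog barks%'), ('%red car%');
""")

query = """
with sp as (
    -- one row per (sentence, pattern) match, plus sentences per pattern
    select s.sentenceID, p.pattern,
           count(*) over (partition by p.pattern) as NumSentences
    from Sentences s join Patterns p on s.sentence like p.pattern
)
select sp1.pattern as Pattern1, sp2.pattern as Pattern2,
       sp1.NumSentences as Pattern1Count, sp2.NumSentences as Pattern2Count,
       count(*) as BothCount
from sp sp1 join sp sp2
     on sp1.sentenceID = sp2.sentenceID and sp1.pattern < sp2.pattern
group by sp1.pattern, sp2.pattern, sp1.NumSentences, sp2.NumSentences
order by sp1.pattern, sp2.pattern
"""

pairs = conn.execute(query).fetchall()
for pair in pairs:
    print(pair)
```

Pairs of patterns that never share a sentence do not appear in the result, because the self-join only produces rows for sentences that match both patterns.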

So, with some reasonable assumptions, you can do this in SQL.
