
SQL Server wildcard select with a twist

I am extracting rows from a string-type column with wildcard searches on certain keywords, but for some keywords in my list I get false positives that I do not want in my output. Some of the keywords in my wildcard select are 'old', 'older' and 'age'.

select * from DESCRIPTIONS..LONG
where (DESCR like '% old %'
or DESCR like '% older %'
or DESCR like '% age %'
or DESCR like '%old%'
or DESCR like '%older%'
or DESCR like '%age%')

I want to extract only rows that contain these exact words, but I end up returning strings that include 'management', 'image', 'cold', 'colder' etc. I could remove these false positives by dropping the conditions below

DESCR like '%old%'
or DESCR like '%older%'
or DESCR like '%age%'

but in that process I would exclude rows where the keyword is next to a special character such as a period, comma or slash, and those rows are true positives. E.g. I would miss strings containing 'age.', 'old.' or 'older,', or 'age' when it is the last word in the string without a trailing space.

How do I exclude the false positives and still get all the true positives?

Here is a complete list of my keywords, separated by commas.

keywords: newborn, newborns, infant, infants, year, years, child, children, adult, adults, pediatric, old, older, young, younger, age

Thanks

Assuming that spaces delimit the words, you can use this trick (padding DESCR with a space on each side lets a keyword at the very start or end of the string match too):

select *
from DESCRIPTIONS..LONG
where ' ' + DESCR + ' ' like '% old %' or
      ' ' + DESCR + ' ' like '% older %' or
      ' ' + DESCR + ' ' like '% age %';
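To see what the padding trick does and does not catch, here is a small sketch against made-up rows. The table variable `@sample` and its contents are assumptions for illustration only; they stand in for DESCRIPTIONS..LONG, and DESCR is assumed to be a varchar/nvarchar column so that `+` concatenation works.

```sql
-- Hypothetical sample data standing in for DESCRIPTIONS..LONG.
DECLARE @sample TABLE (DESCR varchar(200));
INSERT INTO @sample (DESCR) VALUES
    ('old'),                      -- keyword is the whole string
    ('patients of old age'),      -- keyword is the last word, no trailing space
    ('cold storage management'),  -- keywords appear only inside longer words
    ('getting older.');           -- keyword followed by a period

-- Padding both sides with a space means a keyword at the start or
-- end of the string is still surrounded by spaces.
SELECT DESCR
FROM @sample
WHERE ' ' + DESCR + ' ' LIKE '% old %'
   OR ' ' + DESCR + ' ' LIKE '% older %'
   OR ' ' + DESCR + ' ' LIKE '% age %';
```

With these rows the first two should come back and the third should not. The fourth row is also skipped because 'older' is followed by a period rather than a space, so punctuation still needs separate handling.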

I suggest you start by looking at the LIKE syntax documentation from Microsoft: https://msdn.microsoft.com/en-us/library/ms179859.aspx

Are you searching a free-text field? You could use the [] syntax:

SELECT *
FROM DESCRIPTIONS..LONG
WHERE DESCR LIKE '%[ "\/-]age[,.:;'' "\/-]%'

Inside the square brackets you list every character you accept in that position, which resolves your issues with punctuation.
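One thing to watch: each bracket group has to match an actual character, so the pattern above does not fire when 'age' is the very first or very last thing in the string. A possible way around that, sketched here as an assumption rather than as part of the original answer, is to combine it with the space-padding idea. The `@sample` table variable and its rows are made up for illustration.

```sql
-- Hypothetical sample data standing in for DESCRIPTIONS..LONG.
DECLARE @sample TABLE (DESCR varchar(200));
INSERT INTO @sample (DESCR) VALUES
    ('age'),               -- keyword is the whole string
    ('old age.'),          -- keyword followed by a period at the end
    ('image management');  -- 'age' only appears inside longer words

-- The padded spaces give the bracket groups something to match
-- at the start and end of the string. '' is an escaped single quote.
SELECT DESCR
FROM @sample
WHERE ' ' + DESCR + ' ' LIKE '%[ "\/-]age[,.:;'' "\/-]%';
```

With these rows the first two should match and 'image management' should not, since there 'age' is always preceded by a letter.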

You need to make your LIKE pattern more specific:

LIKE '%[^a-z]old[^a-z]%'

What this will do is search for the word "old" with a non-letter character directly before and after it (inside square brackets, a leading ^ means "not", as it does in regular expressions).
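Since the question lists sixteen keywords, writing one LIKE per keyword gets long. Below is a rough sketch of one way to apply the `[^a-z]` pattern to the whole list at once. The table variables `@keywords` and `@sample`, the column name `kw`, and the space padding are my assumptions, not something from the answers above; swap `@sample` for DESCRIPTIONS..LONG to try it on the real table. It also assumes the keywords contain no LIKE wildcard characters (%, _, [).

```sql
-- Keyword list taken from the question.
DECLARE @keywords TABLE (kw varchar(50));
INSERT INTO @keywords (kw) VALUES
    ('newborn'), ('newborns'), ('infant'), ('infants'), ('year'), ('years'),
    ('child'), ('children'), ('adult'), ('adults'), ('pediatric'),
    ('old'), ('older'), ('young'), ('younger'), ('age');

-- Hypothetical sample data standing in for DESCRIPTIONS..LONG.
DECLARE @sample TABLE (DESCR varchar(200));
INSERT INTO @sample (DESCR) VALUES
    ('aged 5 years, or more'),   -- 'years' followed by a comma
    ('cold image management');   -- keywords only inside longer words

-- A row qualifies if at least one keyword appears with a non-letter
-- on both sides; the padding covers the start and end of the string.
SELECT s.DESCR
FROM @sample AS s
WHERE EXISTS (
    SELECT 1
    FROM @keywords AS k
    WHERE ' ' + s.DESCR + ' ' LIKE '%[^a-z]' + k.kw + '[^a-z]%'
);
```

With these rows only the first should be returned. How the range a-z treats uppercase and accented letters depends on the column's collation, so it is worth checking against a few real rows before relying on it.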


 