
REGEXP_SUBSTR SQL Server


With T as
( 
select 'Cytomegalovirus Nucleoside Analog DNA Polymerase Inhibitor [EPC],DNA Polymerase Inhibitors [MoA],Nucleoside Analog [Chemical/Ingredient],Nucleoside Analog Antiviral [EPC]' CLASS 
FROM DUAL )

I need to pull out only the entries tagged [EPC].

Desired output:

Cytomegalovirus Nucleoside Analog DNA Polymerase Inhibitor [EPC], Nucleoside Analog Antiviral [EPC]

You do not need true regexes, since your split string is a literal. On SQL Server 2017, this should work (not tested):

SELECT STRING_AGG([value], ',')
FROM STRING_SPLIT('Cytomegalovirus Nucleoside Analog DNA Polymerase Inhibitor [EPC],DNA Polymerase Inhibitors [MoA],Nucleoside Analog [Chemical/Ingredient],Nucleoside Analog Antiviral [EPC]', ',')
WHERE [value] LIKE '%\[EPC\]%' ESCAPE '\'

On SQL Server 2016, we lack STRING_AGG, but we do have STRING_SPLIT, so this will work (tested):

SELECT STUFF((
    SELECT ',' + [value]
    FROM STRING_SPLIT('Cytomegalovirus Nucleoside Analog DNA Polymerase Inhibitor [EPC],DNA Polymerase Inhibitors [MoA],Nucleoside Analog [Chemical/Ingredient],Nucleoside Analog Antiviral [EPC]', ',')
    WHERE [value] LIKE '%\[EPC\]%' ESCAPE '\'
    FOR XML PATH('')
), 1, 1, '')

On earlier versions, the lack of a decently performing native string splitter makes this a giant pain in the behind. Many questions deal with that, like this one. Using any of those solutions in combination with the FOR XML PATH trick for concatenation will work.
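As an illustration of that combination, here is a minimal sketch that splits by casting the string to XML and then concatenates with FOR XML PATH. It is not taken from the linked questions; the variable names @s and @x and the element name i are made up for this example. It assumes SQL Server 2008 or later (for DECLARE with an initializer) and that the input contains no XML-special characters such as & or <, which would make the cast fail.

```sql
-- Sketch only: @s, @x and the <i> element name are illustrative assumptions.
DECLARE @s NVARCHAR(MAX) = N'Cytomegalovirus Nucleoside Analog DNA Polymerase Inhibitor [EPC],DNA Polymerase Inhibitors [MoA],Nucleoside Analog [Chemical/Ingredient],Nucleoside Analog Antiviral [EPC]';

-- Turn 'a,b,c' into '<i>a</i><i>b</i><i>c</i>' so each entry becomes an XML node.
DECLARE @x XML = CAST(N'<i>' + REPLACE(@s, N',', N'</i><i>') + N'</i>' AS XML);

SELECT STUFF((
    SELECT ',' + t.n.value('.', 'NVARCHAR(MAX)')
    FROM @x.nodes('/i') AS t(n)                     -- one row per entry
    WHERE t.n.value('.', 'NVARCHAR(MAX)') LIKE '%\[EPC\]%' ESCAPE '\'
    FOR XML PATH('')                                -- concatenate the rows
), 1, 1, '');                                       -- strip the leading comma
```

The XML cast is usually slower than a dedicated splitter function on large inputs, so treat this as a stopgap rather than a recommendation.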

Alternatively, a CLR function can do this more cleanly and faster (and with true regexes), but implementing those is more involved. This question is a good start for that.

Last but certainly not least: if you have to perform these sorts of operations in SQL, this is usually a sign that your database design is lacking. In particular, the comma-separated values should have been separate rows with an EPC BIT NOT NULL column (or more generally, a CategoryID INT REFERENCES Categories(ID) column), and you should consider ways to have your client code store data in a way that is amenable to efficient manipulation by your database. Normalization is the keyword here.
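To make the normalization point concrete, here is one possible shape for such a design. Everything except the CategoryID INT REFERENCES Categories(ID) column is an assumption for illustration: the table names DrugClasses and Categories, the DrugID key, and the column sizes are all hypothetical.

```sql
-- Hypothetical schema: table names, DrugID and column sizes are illustrative.
CREATE TABLE Categories (
    ID   INT         NOT NULL PRIMARY KEY,
    Name VARCHAR(50) NOT NULL UNIQUE        -- e.g. 'EPC', 'MoA', 'Chemical/Ingredient'
);

CREATE TABLE DrugClasses (
    DrugID     INT          NOT NULL,
    ClassName  VARCHAR(200) NOT NULL,       -- e.g. 'Nucleoside Analog Antiviral'
    CategoryID INT          NOT NULL REFERENCES Categories(ID),
    PRIMARY KEY (DrugID, ClassName, CategoryID)
);

-- With one row per class, "pull the [EPC] entries" is a plain filter:
SELECT dc.ClassName
FROM DrugClasses AS dc
JOIN Categories AS c ON c.ID = dc.CategoryID
WHERE dc.DrugID = 1
  AND c.Name = 'EPC';
```

With this layout no string splitting is needed at query time, and the comma-separated display string can still be produced on demand with STRING_AGG or FOR XML PATH.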


 