
Select where column contains three uppercase letters in a row

I'm using SQL Server 2019 and trying to debug my ProperCase function, which converts strings to proper case.

I have a table that I created using my function. All of its columns are varchar, and one of them is named Surname.

I want to select the rows where Surname contains three or more consecutive uppercase letters.

I've searched this site, Google, etc., and there are plenty of examples for finding fields that contain any uppercase letters or no uppercase letters, but this is a little more subtle than that.

The column can contain any total number of uppercase or lowercase letters, but I only want to select the rows where it contains three or more uppercase letters next to one another.

Could a regular expression work here?

There isn't really any native regex support in SQL Server 2019 unless you want to install custom CLR objects. If your data is stored as case insensitive and you want to perform case-sensitive searches, one way is to apply the COLLATE clause to the column.

DECLARE @x TABLE(i int, surname nvarchar(500));

INSERT @x(i, surname) VALUES
    (1, 'this is not a match'),
    (2, 'this is a MATCH'),
    (3, 'this is not a match'),
    (4, 'this is DEFINITELY a match');
    
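-- Minimum number of consecutive upper-case letters to look for
DECLARE @min int = 3;

-- A binary collation makes [A-Z] case sensitive, so only upper-case letters match
SELECT i, surname
  FROM @x
  WHERE surname COLLATE Latin1_General_BIN2 
  LIKE N'%' + REPLICATE(N'[A-Z]', @min) + N'%';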

Results:

i      surname
----   -----------------------------
2      this is a MATCH
4      this is DEFINITELY a match

This dbfiddle also demonstrates other values for @min (in case you want to identify 4, or 40, or 300 consecutive upper-case characters).
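To see what the query is actually matching against, it can help to print the pattern on its own. This is only a small sketch; the like_pattern column alias is made up for illustration.

```sql
DECLARE @min int = 3;

-- Show the LIKE pattern that the query builds for the current @min
SELECT N'%' + REPLICATE(N'[A-Z]', @min) + N'%' AS like_pattern;
-- For @min = 3 this yields %[A-Z][A-Z][A-Z]%
```

Each repetition adds one [A-Z] character class, so changing @min is the only thing needed to look for longer runs.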

Note that this won't perform well (the leading wildcard and the COLLATE applied to the column rule out an index seek), so hopefully it's not something you're doing often and at scale. Also, it won't find a surname with other characters between the upper-case characters, like Van DE Moor or MC-Adams. Not that those are normal, but data isn't normal, and I want you to understand the bill of goods.
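If surnames like those do need to be caught, one possible workaround is to strip the separators before matching. This is a rough sketch, not part of the original query: the @y table and its sample rows are invented for illustration, and it assumes spaces and hyphens are the only separators that matter.

```sql
DECLARE @y TABLE(i int, surname nvarchar(500));

INSERT @y(i, surname) VALUES
    (1, N'Van DE Moor'),
    (2, N'MC-Adams'),
    (3, N'Van De Moor');

DECLARE @min int = 3;

-- Strip spaces and hyphens first (an assumed, incomplete list of separators),
-- then apply the same case-sensitive pattern to what is left
SELECT i, surname
  FROM @y
  WHERE REPLACE(REPLACE(surname, N' ', N''), N'-', N'') COLLATE Latin1_General_BIN2
  LIKE N'%' + REPLICATE(N'[A-Z]', @min) + N'%';
```

With this sample data, rows 1 and 2 should come back and row 3 should not. Be aware that stripping separators can also over-match, for example initials followed by a capitalized name, so treat the results as candidates to review.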
