Identify Hidden Characters

My SQL tables contain text with hidden characters that only become visible when I copy and paste the text into Notepad++.

How can I find the rows that contain hidden characters using SQL Server queries?

I tried comparing the lengths using DATALENGTH and LEN, but it did not work.

WHERE DATALENGTH(name) != LEN(name)

I want the rows that have hidden characters.

I am assuming this is caused by control characters. Some of them are invisible, but the group also includes tabs and newlines (the ordinary space, NCHAR(32), falls just outside it). Here is an example that illustrates the problem and shows how to find the affected rows.

-- DROP TABLE #SillyTemp;

-- Two invisible control characters and one ordinary visible character ('!').
DECLARE @InvisibleChar1 NCHAR(1) = NCHAR(28),
        @InvisibleChar2 NCHAR(1) = NCHAR(30),
        @NonControlChar NCHAR(1) = NCHAR(33);
DECLARE @InputString NVARCHAR(500) = N'Some |' + @InvisibleChar1 + N'| random string |' + @InvisibleChar2 + N'|' + N'; Thank god Finally a normal character |' + @NonControlChar + N'|';
SELECT @InputString AS OhNoInvisibleCharacters;

-- LIKE pattern that matches any character from NCHAR(1) through NCHAR(31).
DECLARE @ControlCharRange NVARCHAR(50) = N'%[' + NCHAR(1) + N'-' + NCHAR(31) + N']%';

CREATE TABLE #SillyTemp
(
    input NVARCHAR(500)
);

INSERT INTO #SillyTemp (input)
VALUES (@InputString), (N'A normal string');

SELECT @ControlCharRange;
-- Return only the rows that contain at least one control character.
SELECT input FROM #SillyTemp WHERE input LIKE @ControlCharRange;

This produces three result sets. The first is the input string, with the invisible characters sitting between the first two pairs of pipes. Stack Overflow renders them as placeholder glyphs, but inside SQL Server they are truly invisible and the output is simply:

Some || random string ||; Thank god Finally a normal character |!|

These characters still have a corresponding (N)CHAR(X) value, though. (N)CHAR(0) is the NUL character and is highly unlikely to appear in a string. It also causes problems when building a range in my detection setup, so the range starts at 1. (N)CHAR(32) is the ' ' space character, so the range stops at 31.

The [X-Y] range wildcard in a LIKE pattern works on the ordering of the characters, which here follows the (N)CHAR numbers. Therefore we can build a range of [NCHAR(1)-NCHAR(31)].
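One caveat on the range: SQL Server evaluates a bracket range using the sort rules of the collation in effect, so the characters it covers can differ between collations. As a minimal sketch, assuming a binary collation such as Latin1_General_BIN2 is acceptable for this comparison, the ordering can be pinned to the code point values. The inline sample rows and the alias v are only for illustration.

```sql
-- Same control character range as above: NCHAR(1) through NCHAR(31).
DECLARE @ControlCharRange NVARCHAR(50) = N'%[' + NCHAR(1) + N'-' + NCHAR(31) + N']%';

-- Hypothetical sample rows; replace the VALUES list with your own table.
SELECT v.input
FROM (VALUES (N'Some |' + NCHAR(28) + N'| random string'),
             (N'A normal string')) AS v(input)
-- The explicit binary collation makes the range follow code point order.
WHERE v.input COLLATE Latin1_General_BIN2 LIKE @ControlCharRange;
```

With these two sample rows, only the first one should come back. If the default collation already gives the expected rows, the COLLATE clause is not needed.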

The last SELECT goes through the temporary table, which holds one row with invisible characters. Since we are looking for any NCHAR between 1 and 31, only the rows with invisible characters (often invalid characters, or tabs and newlines) satisfy the WHERE condition, so only they are returned. In this case only the 'faulty' string is returned by my SELECT statement.
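Once the faulty rows are found, it can help to know which character is hiding in them and where. The following is a rough sketch rather than part of the original solution: it reuses the same range pattern with PATINDEX to get the position of the first control character, and UNICODE to get its code. The inline sample rows, the aliases v and p, and the column names FirstPos and FirstCode are made up for the example.

```sql
DECLARE @ControlCharRange NVARCHAR(50) = N'%[' + NCHAR(1) + N'-' + NCHAR(31) + N']%';

SELECT v.input,
       p.FirstPos,                                             -- 1-based position of the first control character
       UNICODE(SUBSTRING(v.input, p.FirstPos, 1)) AS FirstCode -- its (N)CHAR number
FROM (VALUES (N'Some |' + NCHAR(28) + N'| random string |' + NCHAR(30) + N'|'),
             (N'A normal string')) AS v(input)
CROSS APPLY (
    -- PATINDEX returns 0 when the pattern does not match at all.
    SELECT PATINDEX(@ControlCharRange, v.input COLLATE Latin1_General_BIN2) AS FirstPos
) AS p
WHERE p.FirstPos > 0;
```

For the first sample row this should report position 7 and code 28, that is, the NCHAR(28) placed right after 'Some |'. Only the first control character per row is reported. Finding the later ones would require repeating the search on the rest of the string.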
