
Find similar strings in T-SQL table

I'm working on a number plate recognition project. After a number plate is recognized, it is stored in an MS SQL database. We have two cameras, so we get two images of each car, one from the front and one from behind. Sometimes the recognizer reads the number plate incorrectly from one of the images. For example, a car with number plate 'AA1111' is recognized as 'AA111' by the first camera and as 'AA1111' by the second.

In my SQL table I have records like these:

id  NumberPlate  Confidence  CreatedAtTime
1   AA1111       100         13:44:00
2   AA111        75          13:44:10
3   BB2222       100         14:00:00
4   AA11         35          13:44:12

From every record in the example table we create an incident in a client app. But this is incorrect, because the plates 'AA11' and 'AA111' are the same plate as 'AA1111'.

My goal is to create an incident in the client app only for each unique number; in my example that should be 'AA1111' and 'BB2222'.

Do you have any ideas how to do this in MS SQL?

The database is hosted in Azure SQL Server.

UPD: I've written the SQL below, but I'm stuck on the recursion limit. Do you have any advice?


CREATE TABLE Plates(
  id INT not null,
  plate NVARCHAR(10),
  CreatedAtTime NVARCHAR(20),
  Confidence DECIMAL(18,5)
  )

INSERT INTO Plates
VALUES
(1,'LK2873','13:00:00',100),
(2,'LK287','13:00:10',70),
(3,'LK287','13:00:12',65),
(4,'AZ4875','14:00:00',100),
(5,'TR3345','14:15:32',100),
(6,'TR33','14:15:36',45),
(7,'TR334','14:15:40',70),
(8,'AA76','14:12:36',100),
(9,'DF324','14:13:00',100),
(10,'LK28','13:00:09',64)



;WITH tmp (plate, lvl, snd, ln) AS (
    SELECT Plate, 1 lvl, SOUNDEX(Plate), LEN(Plate)
        FROM Plates
        WHERE LEN(Plate) = (SELECT MAX(LEN(Plate)) FROM Plates)
    UNION ALL
    SELECT tb.plate, lvl + 1, SOUNDEX(tb.Plate), LEN(tb.Plate)
        FROM Plates tb
        INNER JOIN tmp t ON tb.Plate = LEFT(tb.plate, LEN(t.plate))
)
SELECT * FROM tmp
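
A note on why the query above hits the recursion limit: the join predicate compares tb.Plate with a prefix of itself, LEFT(tb.plate, LEN(t.plate)), which is true whenever tb.plate is no longer than t.plate. Because the anchor holds the longest plates, every row matches at every level, so the recursion never ends and SQL Server stops at its default limit of 100 levels. Below is a minimal sketch of a terminating variant, assuming the intent was to walk from each longest plate down to its shorter prefixes. The root column is an addition for illustration and is not part of the original query.

```sql
;WITH tmp (root, plate, lvl) AS (
    -- Anchor: the longest plates, as in the original query.
    SELECT plate, plate, 1
    FROM Plates
    WHERE LEN(plate) = (SELECT MAX(LEN(plate)) FROM Plates)
    UNION ALL
    -- Recursive step: compare against the parent plate (t.plate), not tb.plate itself,
    -- and require a strictly shorter plate so that every level makes progress.
    SELECT t.root, tb.plate, t.lvl + 1
    FROM Plates AS tb
    INNER JOIN tmp AS t
        ON tb.plate = LEFT(t.plate, LEN(tb.plate))
       AND LEN(tb.plate) < LEN(t.plate)
)
SELECT root, plate, lvl
FROM tmp;
```

Two caveats with this sketch: plates shorter than the maximum length that are not a prefix of anything (such as 'AA76' or 'DF324' in the sample data) never appear, because the anchor only selects the longest plates, and a short plate can be returned more than once when it is reachable through several longer readings.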

I think more information is needed to understand the full requirement. But assuming you look up the incoming value against what already exists in the table before inserting, maybe something like this, with 'AA11' representing the new value coming in:

WHERE 'AA11' = SUBSTRING(NumberPlate,1,LEN('AA11'))

This checks the new value against any stored plate that has the same characters up to the new value's own length. If nothing matches, the new value is the first reading of that plate. If something matches, there is already a plate with the same or more characters.
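
As a rough sketch, the fragment above could be wrapped in an existence check before the insert. This assumes the Plates table and plate column from the question's CREATE TABLE; @NewPlate is a hypothetical variable standing in for the incoming value.

```sql
-- Hypothetical variable holding the plate that was just recognized.
DECLARE @NewPlate NVARCHAR(10) = N'AA11';

IF EXISTS (
    SELECT 1
    FROM Plates
    -- True when a stored plate starts with the incoming value.
    WHERE SUBSTRING(plate, 1, LEN(@NewPlate)) = @NewPlate
)
    PRINT 'A plate with the same or more characters already exists.';
ELSE
    PRINT 'No match: this is the first reading of this plate.';
```

Note that this only catches the case where the shorter reading arrives after the longer one. If 'AA11' is stored first and 'AA1111' arrives later, the check finds no match, so the reverse comparison would need to be handled separately.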

I'm not sure I understand your question, but if you are looking for any plate that is not the beginning of another plate, you could do it like this:

select plate from plates p1
where not exists (
  select 1 from plates p2
  where p2.plate like p1.plate + '%'
    and p1.plate <> p2.plate 
);  

or

select plate from plates p1
where not exists (
  select 1 from plates p2
  -- CHARINDEX(find, search): p1.plate must be found at position 1 of p2.plate
  where CHARINDEX(p1.plate, p2.plate) = 1
    and p1.plate <> p2.plate
);

See Fiddle
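
To see which readings collapse into which plate, one possible extension is to map every row to the longest stored plate that starts with it. This is only a sketch under the assumption that the Plates table from the question is used; the reading and full_plate aliases are made up for illustration.

```sql
SELECT p1.id,
       p1.plate AS reading,
       (
           -- Longest stored plate that begins with this reading.
           -- Every plate matches itself, so the result is never NULL.
           SELECT TOP (1) p2.plate
           FROM Plates AS p2
           WHERE p2.plate LIKE p1.plate + '%'
           ORDER BY LEN(p2.plate) DESC, p2.Confidence DESC
       ) AS full_plate
FROM Plates AS p1;
```

If a short reading is a prefix of two different full plates (for example 'AA11' against both 'AA1111' and 'AA1122'), the choice is ambiguous and this sketch simply picks the one with the higher confidence. Restricting the match to readings close together in CreatedAtTime would likely be needed in practice.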
