I'm using SQL Server (T-SQL). I have a column with the row values A, AB, ABC, AC. I want to remove any values that are contained within another row. In this case I'd be left with ABC and AC, since A and AB are contained in the other two.
My thought is to take each value of the column, use LIKE to search through the whole column, and count the number of results returned. If the count is equal to 1 (the value only matches itself), then it is not contained in any other row.
Is that a good way to do it? I ask because I'm reluctant to use loops/cursors.
Thanks
Here is a code sample taken from the explanation above:
CREATE TABLE #t (words varchar(10))
INSERT INTO #t
VALUES ('A'),('AB'),('ABC'),('AC')
Using cursors, I think I'd do something like:
DECLARE @branches TABLE (words varchar(10), n int)
DECLARE @word VARCHAR(10)
DECLARE cursor_word CURSOR
FOR SELECT words FROM #t
OPEN cursor_word;
FETCH NEXT FROM cursor_word INTO @word
WHILE @@FETCH_STATUS = 0
BEGIN
INSERT INTO @branches SELECT @word, COUNT(*) FROM #t WHERE words like CONCAT('%', @word ,'%')
FETCH NEXT FROM cursor_word INTO @word
END
CLOSE cursor_word
DEALLOCATE cursor_word
SELECT * FROM @branches WHERE n = 1
You could try something like this:
SELECT *
FROM (
SELECT *
, Row_Number() OVER(ORDER BY Words) N -- Create identifier for the row
FROM #t
) t1
LEFT JOIN (
SELECT *
, Row_Number() OVER(ORDER BY Words) N -- Create identifier for the row
FROM #t
) t2 on t1.N <> t2.n -- Where the identifier is different
AND t2.Words LIKE t1.Words + '%' -- Where t2.Words starts with t1.Words
WHERE t2.Words IS NULL -- And there is no match of t2.
I would just use not exists for this. This requires having a primary key in your table (which is a must-have anyway), so let me assume id:
select t.*
from mytable t
where not exists (
select 1
from mytable t1
where t1.id <> t.id and t1.word like '%' + t.word + '%'
)
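To try this end to end, here is a minimal sketch. The temp table #mytable and its IDENTITY column id are hypothetical stand-ins for the assumed mytable and id above, so adapt the names to your own schema.

```sql
-- Hypothetical stand-in for "mytable"; id plays the role of the assumed primary key.
IF OBJECT_ID('tempdb..#mytable') IS NOT NULL DROP TABLE #mytable;

CREATE TABLE #mytable (
    id   INT IDENTITY(1,1) PRIMARY KEY,
    word VARCHAR(10) NOT NULL
);

INSERT INTO #mytable (word)
VALUES ('A'), ('AB'), ('ABC'), ('AC');

-- Keep a row only if no *other* row (different id) contains its word.
SELECT t.id, t.word
FROM #mytable AS t
WHERE NOT EXISTS (
    SELECT 1
    FROM #mytable AS t1
    WHERE t1.id <> t.id
      AND t1.word LIKE '%' + t.word + '%'
)
ORDER BY t.word;   -- returns the rows for ABC and AC

DROP TABLE #mytable;
```

Because rows are told apart by id rather than by value, two rows holding the same word count as containing each other, so both would be dropped. Whether that is what you want depends on how duplicates should be treated.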
I would use not exists, but no primary key is necessary:
select t.*
from t
where not exists (select 1
from t t2
where t2.words like '%' + t.words + '%' and
t2.words <> t.words
);
Here is a db<>fiddle.
The method that you describe is:
select t.*
from t
where (select count(*)
from t t2
where t2.words like '%' + t.words + '%'
) = 1;
If you have no duplicates, this is functionally equivalent to the not exists version. However, not exists is much better. Why? The aggregation version has to go through every row to calculate the count. The not exists version can stop at the first match, which can significantly reduce the number of like comparisons.
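One caveat that applies to all of the LIKE-based versions: if the values themselves can contain %, _ or [, LIKE treats those characters as wildcards. A possible variation, assuming a literal substring test is what you want, is to use CHARINDEX instead. This is only a sketch; the extra rows '50%' and '500' are made up to illustrate the difference and are not part of the original data.

```sql
-- Recreate the sample table from the question, plus two illustrative rows.
IF OBJECT_ID('tempdb..#t') IS NOT NULL DROP TABLE #t;

CREATE TABLE #t (words VARCHAR(10) NOT NULL);

INSERT INTO #t (words)
VALUES ('A'), ('AB'), ('ABC'), ('AC'), ('50%'), ('500');

-- CHARINDEX(find, search) > 0 means "find" occurs literally inside "search".
SELECT t.words
FROM #t AS t
WHERE NOT EXISTS (
    SELECT 1
    FROM #t AS t2
    WHERE CHARINDEX(t.words, t2.words) > 0
      AND t2.words <> t.words
);
-- Keeps ABC, AC, 50% and 500.

DROP TABLE #t;
```

With the LIKE version, the pattern built from '50%' would be '%50%%', which also matches '500', so '50%' would be wrongly treated as contained in another row. CHARINDEX compares the characters literally, so both values survive.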