What is the fastest way to get words from a T-SQL table?
I have a SQL Server 2008 R2 table dbo.Forum_Posts with the columns Subject (nvarchar(255)) and Body (nvarchar(max)).
I would like to get all words with length >= 3 from the columns Subject and Body and insert them into the table dbo.Search_Word (column Word, nvarchar(100)) and the table dbo.SearchItem (column Title (nvarchar(200))).
I would also like to get the newly generated SearchWordsID (primary key, autoincrement, int) from dbo.Search_Word and SearchItemID (primary key, autoincrement, int) from dbo.SearchItem, and insert them into the table dbo.SearchItemWord (columns SearchWordsID (foreign key, int, not null) and SearchItemID (foreign key, int, not null)).
What is the fastest way to do this in T-SQL? Or do I have to use C#? Thank you in advance for any help.
As requested, this keeps the ID, so you get a DISTINCT list of words per ID.
It is slightly different from the first answer, but easily done via Outer Apply.
**You must edit the initial query: Select KeyID=[YourKeyID], Words=[YourField1]+' '+[YourField2] From [YourTable]**
Declare @Delimeter varchar(25) = ' '
-- Generate base data, replace special characters with spaces, and expand via Outer Apply
-- (the replacement is applied per row before splitting, so the XML markup itself is never touched)
Declare @XML xml
Set @XML = (
Select A.KeyID
,B.Word
From ( Select KeyID=[YourKeyID]
,Words=Replace(Replace(Replace(Replace(Replace(Replace([YourField1]+' '+[YourField2],'.',' '),',',' '),'/',' '),'(',' '),')',' '),':',' ') -- Add/Remove as needed
From [YourTable]) A
Outer Apply (
-- Note: the Cast fails if the text contains &, < or >; escape or strip those first
Select Word=Split.a.value('.', 'varchar(150)')
From (Select Cast ('<x>' + Replace(A.Words, @Delimeter, '</x><x>')+ '</x>' AS XML) AS Data) AS D
Cross Apply Data.nodes ('/x') AS Split(a)
) B
For XML RAW)
-- Read the rows back, keeping the distinct words longer than 3 characters per key
-- (the question asks for length >= 3; use >= 3 in the filter for that)
Select Distinct
KeyID = t.col.value('@KeyID', 'int')
,Word = t.col.value('@Word', 'varchar(150)')
From @XML.nodes('/row') AS t (col)
Where Len(t.col.value('@Word', 'varchar(150)'))>3
Order By 1
Returns
KeyID Word
0 UNDEF
0 Undefined
1 HIER
1 System
2 Control
2 UNDEF
3 JOBCONTROL
3 Market
3 Performance
...
87 Analyitics
87 Market
87 UNDEF
88 Branches
88 FDIC
88 UNDEF
...
You will need T-SQL to insert into the tables. Your biggest challenge will be splitting the posts into words.
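Once the words have been split, the insert step could look roughly like the sketch below. It is not from the original post and rests on two assumptions: that dbo.Forum_Posts has an int key column (called PostID here, a hypothetical name), and that a staging table #PostWords(PostID, Word) already holds one row per post/word pair from whatever splitter you use. MERGE is used for the second step only because its OUTPUT clause can return source columns, which a plain INSERT ... OUTPUT cannot.

```sql
-- ASSUMPTIONS (not from the original post): dbo.Forum_Posts has an int key
-- column named PostID, and #PostWords(PostID int, Word nvarchar(100)) already
-- holds one row per post/word pair produced by the splitter of your choice.

-- 1) Add words of length >= 3 that are not in dbo.Search_Word yet.
INSERT INTO dbo.Search_Word (Word)
SELECT DISTINCT pw.Word
FROM   #PostWords AS pw
WHERE  LEN(pw.Word) >= 3
  AND  NOT EXISTS (SELECT 1 FROM dbo.Search_Word AS sw WHERE sw.Word = pw.Word);

-- 2) Create one SearchItem per post and capture PostID -> SearchItemID.
DECLARE @ItemMap TABLE (PostID int PRIMARY KEY, SearchItemID int NOT NULL);

MERGE dbo.SearchItem AS tgt
USING (SELECT PostID, Subject FROM dbo.Forum_Posts) AS src
   ON 1 = 0                      -- never matches, so every source row is inserted
WHEN NOT MATCHED THEN
    INSERT (Title) VALUES (LEFT(src.Subject, 200))   -- Title is nvarchar(200)
OUTPUT src.PostID, inserted.SearchItemID
  INTO @ItemMap (PostID, SearchItemID);

-- 3) Link items to words through the generated keys.
INSERT INTO dbo.SearchItemWord (SearchWordsID, SearchItemID)
SELECT DISTINCT sw.SearchWordsID, m.SearchItemID
FROM   #PostWords      AS pw
JOIN   dbo.Search_Word AS sw ON sw.Word = pw.Word
JOIN   @ItemMap        AS m  ON m.PostID = pw.PostID
WHERE  LEN(pw.Word) >= 3;
```

Because all three statements are set-based, the generated identity values never have to be fetched row by row. Running the whole batch inside one transaction would keep the three tables consistent if a step fails.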
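My suggestion would be to read the posts into C#, split each post into words (you can use the Split method to split on spaces or punctuation), filter the collection of words, and then execute the inserts from C#.
If you use Entity Framework or a similar ORM, you can avoid writing T-SQL directly.
Unless you really want a pure SQL solution and are willing to spend the time perfecting it, don't try to split your posts into words with T-SQL. And yes, it will be slow: T-SQL is not fast at string manipulation.
You could also look into full-text indexing, which I believe supports searching by keyword.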
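For the full-text route, a minimal sketch might look like the following. The catalog name ForumCatalog and the index name PK_Forum_Posts are hypothetical; the KEY INDEX must be an existing unique, single-column, non-nullable index on dbo.Forum_Posts. The second query assumes an integer full-text key, so that document_id corresponds to that key value.

```sql
-- ASSUMPTIONS: ForumCatalog and PK_Forum_Posts are placeholder names;
-- substitute the real unique key index of dbo.Forum_Posts.
CREATE FULLTEXT CATALOG ForumCatalog;

CREATE FULLTEXT INDEX ON dbo.Forum_Posts (Subject, Body)
    KEY INDEX PK_Forum_Posts
    ON ForumCatalog;

-- After the index has been populated, list each indexed term together with
-- the full-text key of the row it occurs in.
SELECT k.document_id, k.display_term, k.occurrence_count
FROM   sys.dm_fts_index_keywords_by_document(DB_ID(), OBJECT_ID('dbo.Forum_Posts')) AS k
WHERE  LEN(k.display_term) >= 3
  AND  k.display_term <> 'END OF FILE';   -- marker row, not a real word
```

With the default stoplist, common noise words are left out of the index, so the result is not a complete word list. If the goal is only to search posts by keyword, querying the index with CONTAINS may make the separate word tables unnecessary.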
Perhaps this will help
Declare @String varchar(max) = ''
Declare @Delimeter varchar(25) = ' '
-- Concatenate all text into one string (IsNull keeps a NULL column from nulling the whole string)
Select @String = @String + ' '+Words
From (
Select Words=IsNull([YourField1],'')+' '+IsNull([YourField2],'') from [YourTable]
) A
-- Generate and Strip special characters, then collapse double spaces
Declare @StripChar table (Chr varchar(10));Insert Into @StripChar values ('.'),(','),('/'),('('),(')'),(':') -- Add/Remove as needed
Select @String = Replace(Replace(@String,Chr,' '),'  ',' ') From @StripChar
-- Convert String into XML and Split Delimited String
-- (the Cast fails if the text contains &, < or >; escape or strip those first)
Declare @Table Table (RowNr int Identity(1,1), String varchar(max))
Declare @XML xml = Cast('<x>' + Replace(@String,@Delimeter,'</x><x>')+'</x>' as XML)
Insert Into @Table Select String.value('.', 'varchar(max)') From @XML.nodes('x') as T(String)
-- Generate Final Results (the question asks for length >= 3; use >= 3 in the filter for that)
Select Distinct String
From @Table
Where Len(String)>3
Order By 1
Returns (sample)
String
------------------
Access
Active
Adminstrators
Alternate
Analyitics
Applications
Branches
Cappelletti
City
Class
Code
Comments
Contact
Control
Daily
Data
Date
Definition
Deleted
Down
Email
FDIC
Variables
Weekly