
SQL Server full-text search on JSON

I need an efficient representation of JSON in SQL Server so that I can perform very fast search operations.

What I have:

JSON to be stored:

{"person": {
  "name": "1234",
  "age": "99",
  "parameters": {
    "param1": "1",
    "param2": "2"
  }
}}

or

{"person": {
  "name": "12345",
  "age": "996",
  "parameters": {
    "param1": "1",
    "param5": "5",
    "param7": "7"
  }
}}

The parameters section can contain up to 20 of 60 possible parameters. I need to look up a person using only some of the parameters: if a person has 12 parameters, a search query may use anywhere from 0 to 12 of them. Name and age are always supplied in the search query, and every person has both. The table holds around 30 million JSON documents.

Is it possible to do this in SQL Server alone, without NoSQL, Solr, or Elasticsearch?

High search performance in SQL Server can be obtained by adding computed columns with ADD col1 AS JSON_VALUE(data,'$.person.parameters.param1') and then indexing these computed columns.

See, for example, Index JSON data.

(Please provide an example of how your JSON is stored in a SQL table if you need specific code examples.)
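As a minimal sketch of that idea, assuming SQL Server 2016 or later (where JSON_VALUE is available): the table name dbo.People, the column name data, and the index name are hypothetical placeholders, and the varchar(20) width is an assumption chosen to keep the index key small. GO is the batch separator used by SSMS and sqlcmd.

```sql
-- Hypothetical names: dbo.People (table), data (nvarchar column holding the JSON).
-- Add a computed column that extracts one parameter from the JSON document.
ALTER TABLE dbo.People
    ADD param1 AS CAST(JSON_VALUE(data, '$.person.parameters.param1') AS varchar(20));
GO

-- Index the computed column so equality searches on it can use an index seek.
CREATE NONCLUSTERED INDEX IX_People_param1
    ON dbo.People (param1);
GO
```

The CAST is optional; without it the computed column has the nvarchar(4000) type that JSON_VALUE returns, which makes for a much wider index key.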

Let's take your case and create a table that stores the JSON in one column and exposes every parameter you want to search on as a virtual computed column:

CREATE TABLE [dbo].[JsonTest](
    [_id] [bigint] IDENTITY(1,1) NOT NULL,
    [Json] [nvarchar](max) NOT NULL,
    [Parameter1]  AS (CONVERT([varchar](20),json_value([Json],'$.person.parameters.param1'))),
    [Parameter2]  AS (CONVERT([varchar](20),json_value([Json],'$.person.parameters.param2'))),
CONSTRAINT [PK_JsonTest] PRIMARY KEY CLUSTERED 
(
    [_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]


ALTER TABLE [dbo].[JsonTest]  WITH CHECK ADD  CONSTRAINT [CK_JsonTest_Json] CHECK  ((isjson([Json])=(1)))
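The CHECK constraint above uses ISJSON to reject rows whose [Json] value is not well-formed JSON. A quick way to see it in action, as a sketch that assumes the table and constraint were created exactly as shown, is to attempt an insert with a truncated document:

```sql
BEGIN TRY
    -- The closing braces are missing, so ISJSON returns 0 and the constraint fails.
    INSERT INTO [dbo].[JsonTest] ([Json]) VALUES (N'{"person": {"name": "1234"');
END TRY
BEGIN CATCH
    -- Constraint violations are reported as error 547.
    SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage;
END CATCH
```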

Then let's insert the two examples you provided:

INSERT INTO [JsonTest] (Json) VALUES ('{"person": {
  "name": "1234",
  "age": "99",
  "parameters": {
    "param1": "1",
    "param2": "2"
  }
}}')

INSERT INTO [JsonTest] (Json) VALUES ('{"person": {
  "name": "12345",
  "age": "996",
  "parameters": {
    "param1": "1",
    "param5": "5",
    "param7": "7"
  }
}}')

Now, when we query the table:

SELECT TOP 100 * FROM [dbo].[JsonTest]

The result contains both rows along with their computed Parameter1 and Parameter2 values (the original post showed this as a screenshot).

Note that the computed columns also work when a row has no such parameter.

The next step is to create indexes on the computed columns:
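The reason is that JSON_VALUE uses lax path mode by default and returns NULL when the path does not exist. The following sketch illustrates this with the second sample document held in a local variable (the variable name @doc is only for illustration):

```sql
DECLARE @doc nvarchar(max) =
    N'{"person": {"name": "12345", "age": "996",
       "parameters": {"param1": "1", "param5": "5", "param7": "7"}}}';

SELECT
    JSON_VALUE(@doc, '$.person.parameters.param1') AS Parameter1,  -- path exists
    JSON_VALUE(@doc, '$.person.parameters.param2') AS Parameter2;  -- path missing: NULL in lax mode

-- With the strict prefix, the same missing path raises an error instead of returning NULL:
-- SELECT JSON_VALUE(@doc, 'strict $.person.parameters.param2');
```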

CREATE NONCLUSTERED INDEX [IX_JsonTest_Parameter1] ON [dbo].[JsonTest]
(
    [Parameter1] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

CREATE NONCLUSTERED INDEX [IX_JsonTest_Parameter2] ON [dbo].[JsonTest]
(
    [Parameter2] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

And now, finally, you can query your table very quickly:

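-- The computed columns are varchar, so compare against string literals;
-- integer literals would force an implicit conversion of the column values.
SELECT [Json], Parameter1, Parameter2
FROM [dbo].[JsonTest]
WITH (INDEX(IX_JsonTest_Parameter1),INDEX(IX_JsonTest_Parameter2))
WHERE Parameter1 = '1' AND Parameter2 = '2'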

This approach queries a table with 2 million records in less than 1 second. Keep in mind that we store only the JSON values.
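Since the question states that name and age are always part of the search, the same technique could plausibly be extended with a composite index on those two values. This is only a sketch, not something measured on the 30 million row table: the column names PersonName and PersonAge, the index name, and the varchar widths are assumptions.

```sql
-- Hypothetical computed columns for the two values that every search supplies.
ALTER TABLE [dbo].[JsonTest] ADD
    [PersonName] AS (CONVERT([varchar](50), json_value([Json], '$.person.name'))),
    [PersonAge]  AS (CONVERT([varchar](10), json_value([Json], '$.person.age')));
GO

-- Composite index: seek on name and age, with the parameter columns carried
-- along so that parameter filters can be evaluated from the index itself.
CREATE NONCLUSTERED INDEX [IX_JsonTest_Name_Age] ON [dbo].[JsonTest]
(
    [PersonName] ASC,
    [PersonAge] ASC
)
INCLUDE ([Parameter1], [Parameter2]);
GO

-- Name and age narrow the search first; any supplied parameters filter further.
SELECT [Json]
FROM [dbo].[JsonTest]
WHERE PersonName = '12345' AND PersonAge = '996'
  AND Parameter1 = '1';
```

With this layout, the optional parameters only need to filter the few rows that share a name and age, so they may not each need an index of their own.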

If you want to experiment with full-text search, you have to enable it first. Everything is described here: Cannot use a CONTAINS or FREETEXT predicate on table or indexed view because it is not full-text indexed

The query can then look similar to this:
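As a rough sketch of what enabling it involves, assuming the Full-Text Search feature is installed on the instance and the table was created as shown earlier (the catalog name JsonTestCatalog is hypothetical):

```sql
-- Returns 1 when the Full-Text Search component is installed.
SELECT FULLTEXTSERVICEPROPERTY('IsFullTextInstalled') AS IsFullTextInstalled;
GO

-- A full-text catalog is the container for full-text indexes.
CREATE FULLTEXT CATALOG [JsonTestCatalog];
GO

-- The KEY INDEX must be a unique, single-column, non-nullable index;
-- the primary key PK_JsonTest on [_id] meets that requirement.
CREATE FULLTEXT INDEX ON [dbo].[JsonTest] ([Json])
    KEY INDEX [PK_JsonTest]
    ON [JsonTestCatalog]
    WITH CHANGE_TRACKING AUTO;
GO
```

The index is populated in the background, so CONTAINS queries may return incomplete results until the initial population finishes.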

SELECT [Json]
FROM JsonTest
Where Contains(Json,'Near((param1,1), MAX, True)')

Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you repost, please cite this site's URL or the original address.