
Azure Cosmos DB partitioning and indexing of data in SQL API using PATH

I am collecting IoT data in Azure Cosmos DB. I know the Cosmos DB SQL API automatically indexes documents by path. Each document holds readings from around 150 sensors, and most of my SQL queries look like the one below.

DeviceId is already the partition key.

SELECT c.sensorVariable FROM c WHERE c.DeviceId = 'dev1' AND c.time = date1

{ "DeviceId": "dev1", "time": 123333, "sensor1": 20, "sensor2": 40 }

I fetch data for various sensors, but all my queries depend on DeviceId and time (a Unix timestamp).

Is it possible to index only DeviceId and time and exclude the other keys, which are also under the same path / ?

By default, the collection comes with this indexing policy:

"includedPaths": [
    {
        "path": "/*",
        "indexes": [
            {
                "kind": "Range",
                "dataType": "Number",
                "precision": -1
            },
            {
                "kind": "Range",
                "dataType": "String",
                "precision": -1
            },
            {
                "kind": "Spatial",
                "dataType": "Point"
            }
        ]
    }
],

For dataType String, shouldn't the index kind be Hash rather than Range? And what does "precision": -1 mean?

In the Azure Cosmos DB documentation examples I have seen a precision of 3 for strings, and I do not understand why.

If I have 100 devices, each writing data at one-second granularity, which type of indexing is better?

Is it possible to index data on deviceID and time and exclude other keys, which are also in the same path?

Yes. You can customize your indexing policy with IncludedPaths and ExcludedPaths.

For example:

var excluded = new DocumentCollection { Id = "excludedPathCollection" };
excluded.IndexingPolicy.IncludedPaths.Add(new IncludedPath { Path = "/*" });
excluded.IndexingPolicy.ExcludedPaths.Add(new ExcludedPath { Path = "/nonIndexedContent/*" });

await client.CreateDocumentCollectionAsync(UriFactory.CreateDatabaseUri("db"), excluded);

Please refer to more details here.
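For the specific case in the question, a policy that indexes only DeviceId and time might look roughly like the sketch below. It assumes the current JSON indexing policy format, in which the per-path "indexes" arrays can be omitted, and it assumes the property names match the sample document exactly, since paths are case-sensitive. Treat it as a starting point, not a tested configuration.

```json
{
  "indexingMode": "consistent",
  "automatic": true,
  "includedPaths": [
    { "path": "/DeviceId/?" },
    { "path": "/time/?" }
  ],
  "excludedPaths": [
    { "path": "/*" }
  ]
}
```

With a policy like this, the sensor properties can still be projected in the SELECT clause, but a filter on a sensor value would no longer be served by the index.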

What is this precision: -1?

In the Azure Cosmos DB documentation examples I have seen a precision of 3 for strings, and I do not understand why.

Based on "Index data types, kinds, and precisions":

For a Hash index, this varies from 1 to 8 for both strings and numbers. The default is 3. For a Range index, this value can be -1 (maximum precision). It can vary between 1 and 100 (maximum precision) for string or number values.

You can use this statement to guide your choice.

If I have 100 devices, each writing data at one-second granularity, which type of indexing is better?

It's hard to say which index mode is the best choice. It should be considered together with the consistency level and your requirements for read and write performance. You could refer to this paragraph.
