
How to “merge” or “transform” JSON documents in Azure Cosmos DB

I'm setting up a chatbot with the Microsoft Bot Framework and Azure. I want to save my "UserState" in a database so that I can easily analyze the user data. I managed to save my userState as JSON documents in Azure Cosmos DB.

The problem is that each interaction with the bot creates a new "document" in a "collection" in Cosmos DB.

How can I easily merge the data (the data structure is consistent) and, in the best case, get the data into some kind of table? The tool I want to use for analysis requires .txt or .csv files.


This is a snippet of the JSON file that stores the user data.

{
    "id": "emulator*2fusers*2f9321b527-4699-4b4a-8d9d-9cd9fa8f1967*2f",
    "realId": "emulator/users/9321b527-4699-4b4a-8d9d-9cd9fa8f1967/",
    "document": {
        "userData": {
            "name": "value",
            "age": 18,
            "gender": "value",
            "education": "value",
            "major": "value"
        },
        "userDataExtended": {
            "roundCounter": 3,
            "choices": [
                "A",
                "A",
                "B"
            ]
        }
    },
    "_rid": "0k5YAPBrVaknAAAAAAAAAA==",
    "_self": "dbs/0k5YAA==/colls/0k5YAPBrVak=/docs/0k5YAPBrVaknAAAAAAAAAA==/",
    "_etag": "\"ac009377-0000-0000-0000-5c59c5610000\"",
    "_attachments": "attachments/",
    "_ts": 1549387105
}

In the best case I want to have the data in a table structure with columns "name", "age", etc. and each user (document) as a row.

Thank you!

There are a few things in your question, and I'll address each of them separately.

Expanding on Drew's comment:

Multiple documents are being created because you're running the bot through the Emulator. Each time the Emulator restarts, it creates a new User ID, and therefore a new document for the user and another for that user's conversation. You will not have this issue if you use a channel other than the Emulator, provided that the User ID remains consistent.

Regarding merging documents:

I'm not sure exactly what you're looking for, but you might be able to use SQL queries to accomplish what you need. In the Cosmos DB Data Explorer, just click "New SQL Query". For example, running SELECT * FROM c returns all of the documents in the collection as a single output.
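To get from that single output to the table structure asked for in the question, each document can be flattened into one row. The following is only a minimal sketch in Python, assuming the query result has already been parsed from JSON into a list of dictionaries shaped like the snippet in the question. The helper name to_row, the sample values, and the "|" separator used for the "choices" list are illustrative assumptions, not part of the Bot Framework or Cosmos DB.

```python
from typing import Any, Dict, List

# Columns taken from the "userData" object in the question's snippet.
COLUMNS = ["name", "age", "gender", "education", "major"]


def to_row(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten one stored document into a single table row (hypothetical helper)."""
    state = doc.get("document", {})
    user = state.get("userData", {})
    extended = state.get("userDataExtended", {})
    row: Dict[str, Any] = {"realId": doc.get("realId")}
    for col in COLUMNS:
        row[col] = user.get(col)  # fields missing from a document become None
    row["roundCounter"] = extended.get("roundCounter")
    # Join the list so it fits in one cell; the separator is an arbitrary choice.
    row["choices"] = "|".join(extended.get("choices", []))
    return row


# Made-up sample standing in for the parsed result of SELECT * FROM c.
docs: List[Dict[str, Any]] = [
    {
        "realId": "emulator/users/user-1/",
        "document": {
            "userData": {"name": "value", "age": 18, "gender": "value",
                         "education": "value", "major": "value"},
            "userDataExtended": {"roundCounter": 3, "choices": ["A", "A", "B"]},
        },
    },
    {
        "realId": "emulator/users/user-2/",
        "document": {"userData": {"name": "other"}},
    },
]

rows = [to_row(d) for d in docs]
```

Each entry in rows is a flat dictionary with the same keys, so a document that lacks a field still produces a complete row with an empty value in that column.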


Regarding text/csv files:

I'm not sure what your tool is, but if it can handle JSON, then the above might work for you. If not, you can implement custom middleware to get the txt/csv output you're looking for. Here's a sample that shows something relatively similar. There isn't an equivalent example in C#, but you can still implement your own middleware to do the same thing.
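Whichever component ends up producing the file, the CSV step itself is small. Below is a rough sketch of only that step, written in Python with the standard csv module. It does not use the Bot Framework middleware API, and the function name rows_to_csv and the sample rows are made up for illustration.

```python
import csv
import io
from typing import Any, Dict, List


def rows_to_csv(rows: List[Dict[str, Any]], columns: List[str]) -> str:
    """Serialize flat rows to CSV text with a header line (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        extrasaction="ignore",  # silently skip keys that are not listed in columns
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)  # keys missing from a row are written as empty cells
    return buffer.getvalue()


# Made-up rows, one per user, with the user data already flattened.
rows = [
    {"name": "Ann", "age": 18, "major": "Physics"},
    {"name": "Bob", "age": 30},
]

csv_text = rows_to_csv(rows, ["name", "age", "major"])
```

The returned string can be written to a .csv or .txt file as-is. Building the text in memory first keeps the formatting logic independent of where the file is eventually stored.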

Regarding tables:

If you're really looking for Table Storage, it was supported in V3 bots but was replaced by Blob Storage in V4. You could write your bot in V3. Similar to what Jay said, you might still be able to use a trigger function to send the data to Table Storage, but then you're storing the data twice.

Regarding analysis:

If all you're really looking for is analysis, Application Insights/Bot Analytics may be what you need, although I don't believe it will provide the level of detail you're looking for.

In the best case I want to have the data in a table structure with columns "name", "age", etc. and each user (document) as a row.

You will need another service to implement this requirement, because the data collected by the bot service already exists in Cosmos DB.

In my opinion, a Cosmos DB-triggered Azure Function may be a good option for you. The function is triggered whenever updates flow into your Cosmos DB collection.

You can find more explanation at this link. What I'd suggest is configuring Cosmos DB as the input binding and Azure Blob Storage as the output binding (perhaps a specific CSV file). In the function, you can read the columns you want with the Cosmos DB SDK and assemble them into any format you need.
