Extracting array properties from Cosmos DB documents using Azure Data Factory
I have an Azure Data Factory v2 pipeline that's pulling data from a Cosmos DB collection. The documents in this collection have a property that's an array.
At the very least, I want to be able to dump that entire property's value into a column in SQL Azure. I don't need it parsed (although that would be great too), but ADF marks this column as "Unsupported Type" in the dataset definition and lists it in the Excluded Columns section.
Here is an example of the JSON I'm working with. The property I want is "MyArrayProperty":
{
"id": "c4e2012e-af82-4c48-8960-11e0436e6d3f",
"Created": "2019-06-14T16:04:13.9572567Z",
"Updated": "2019-06-14T16:04:14.1920988Z",
"IsActive": true,
"MyArrayProperty": [
{
"SomeId": "a4427015-ca69-4958-90d3-0918fd5dcac1",
"SomeName": "BlahBlah"
}
]
}
I've tried manually specifying a column named "MyArrayProperty" in the ADF data source with a string data type, but the value always comes across as null.
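To make the goal concrete, here is a minimal Python sketch of the result I'm after: the whole array serialized as JSON text that would fit in a single string column. This runs outside ADF and is only an illustration; the helper name `array_as_json_text` is hypothetical and the document is trimmed from the sample above.

```python
import json

# Sample document from the question, trimmed to the relevant fields.
doc = {
    "id": "c4e2012e-af82-4c48-8960-11e0436e6d3f",
    "IsActive": True,
    "MyArrayProperty": [
        {"SomeId": "a4427015-ca69-4958-90d3-0918fd5dcac1", "SomeName": "BlahBlah"}
    ],
}


def array_as_json_text(document, prop="MyArrayProperty"):
    """Return the array property as compact JSON text, or None if it is absent.

    Hypothetical helper for illustration only; it is not part of ADF.
    """
    value = document.get(prop)
    if value is None:
        return None
    return json.dumps(value, separators=(",", ":"))


column_value = array_as_json_text(doc)
print(column_value)
```

The resulting string could be stored as-is in a text column and parsed later if needed.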
There may be a better way to solve this problem, but I ended up creating a second copy activity that uses a query against Cosmos rather than a collection-based capture. The query flattens the array, producing one row per array element:
SELECT m.id, c.SomeId, c.SomeName
FROM myCollection m
JOIN c IN m.MyArrayProperty
I then dumped this data set into a table in SQL Azure and did the rest of my work there, joining the rows back to the parent records on id. You could also use the new Join pipeline task to do this in memory before the data reaches the destination.
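For anyone unsure what shape the flattening query produces, here is a rough Python sketch that mimics it on the sample document. It is an illustration only and does not call Cosmos DB or ADF; the function name `flatten` is hypothetical, and missing fields come back as None here, which may differ from how Cosmos handles undefined values.

```python
# Sample document from the question, trimmed to the relevant fields.
doc = {
    "id": "c4e2012e-af82-4c48-8960-11e0436e6d3f",
    "IsActive": True,
    "MyArrayProperty": [
        {"SomeId": "a4427015-ca69-4958-90d3-0918fd5dcac1", "SomeName": "BlahBlah"}
    ],
}


def flatten(documents):
    """Yield one row per array element, pairing it with the parent id.

    Documents with a missing or empty array yield no rows, as with the
    self-join in the query.
    """
    for m in documents:
        for c in m.get("MyArrayProperty", []):
            yield {
                "id": m["id"],
                "SomeId": c.get("SomeId"),      # None if the field is missing
                "SomeName": c.get("SomeName"),  # None if the field is missing
            }


rows = list(flatten([doc]))
for row in rows:
    print(row)
```

Each row carries the parent id, so the flattened table can be joined back to the main table on that column.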