What is best practice for aggregating multiple congruent databases for use by Power BI

I am consolidating data from multiple, identically structured databases that make frequent use of bigint key fields. What is best practice for ensuring uniqueness in the aggregate tables, and for ensuring they can still be related to the foreign keys in other aggregate tables once they're in Power BI?

I ask because it is my understanding that Power BI won't allow joins using multiple columns.

I have created the following illustrative case:

[Image: source databases]

If Power BI were okay with me joining the aggregate Customers table to the aggregate Orders table using multiple fields, I'd simply add a source field (e.g. src) and do this:

[Image: aggregate database with the new field "src"]

Note that the join between the two tables uses two fields: src and CustId.

But if, as I understand it, Power BI requires that those be joined by a single field, I'd be tempted to create a new value by merging the src and CustId fields into, say, SrcCustId and joining on that:

[Image: aggregate with the new helper field]

Finally, if the answer is merging the two columns into a helper column, can I do that using a computed column in SQL Server (or SQL Database), or do I need to handle that when loading the source tables in the first place?

I would prefer the computed column solution because there may be multiple foreign keys in my actual tables, and loading helper columns for all of them will blow up the amount of work I need to do each time I spin up new Azure Data Factory pipelines for a new source database.

Explanation: Instead of merging the columns or creating a helper column, you can use the DAX COMBINEVALUES function. COMBINEVALUES supports multi-column relationships in DirectQuery models.
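To make the combined-key idea concrete, here is a minimal Python sketch (not DAX, and not taken from the question) of what a delimiter-joined key does, whether it is produced by COMBINEVALUES or by a helper column such as SrcCustId. The sample rows, the `make_key` helper, and the `|` delimiter are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

DELIMITER = "|"  # assumed separator; pick one that cannot occur in src


def make_key(src: str, cust_id: int) -> str:
    """Merge the source tag and the bigint key into a single string key."""
    return f"{src}{DELIMITER}{cust_id}"


# Hypothetical rows from two source databases that reuse the same CustId.
customers = [
    {"src": "A", "CustId": 1, "Name": "Alice"},
    {"src": "B", "CustId": 1, "Name": "Bob"},
]
orders = [
    {"src": "A", "OrderId": 10, "CustId": 1},
    {"src": "B", "OrderId": 10, "CustId": 1},
    {"src": "B", "OrderId": 11, "CustId": 1},
]

# Single-column lookup on the merged key, analogous to a one-column relationship.
customer_by_key = {make_key(c["src"], c["CustId"]): c for c in customers}

# Resolve each order to its customer through the merged key only.
orders_per_customer = defaultdict(list)
for o in orders:
    key = make_key(o["src"], o["CustId"])
    orders_per_customer[customer_by_key[key]["Name"]].append(o["OrderId"])

print(dict(orders_per_customer))
```

The delimiter matters: without it, a src of "1" with CustId 23 and a src of "12" with CustId 3 would both collapse to "123" and no longer be unique.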

Reference: COMBINEVALUES function (DAX), Microsoft Learn

Another option is to use user-defined aggregations. Aggregations in Power BI are used to improve the performance of large DirectQuery datasets.

Power BI Desktop provides a "Manage aggregations" option for creating aggregations based on tables.

Reference: https://learn.microsoft.com/en-us/power-bi/transform-model/aggregations-advanced
