From R script to Power BI - how to use setdiff

I have two data frames, zerowy_nazwa5 and zatwierdzony_nazwa5,

and these two lines work in R:

setdiff(zatwierdzony_nazwa5, zerowy_nazwa5)
setdiff(zerowy_nazwa5, zatwierdzony_nazwa5)

How do I implement this in Power BI?

Thanks for the help.

Your question is rather unclear, so I'm going to have to make some assumptions. I will interpret your question as how to natively perform a set difference in Power BI.


Suppose we have tables A and B as follows:

Table A:   Table B:      
Column     Column
------     ------
 1          2
 2          4
 3
 4
 5

and we want to get the set difference A - B:

 Column
 ------
  1
  3
  5
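
For comparison with the R call in the question, here is a minimal sketch of the same A - B on plain vectors in base R. The vector names A and B simply mirror the example tables above.

```r
# Values from the example tables A and B above.
A <- c(1, 2, 3, 4, 5)
B <- c(2, 4)

# Elements of A that do not appear in B.
result <- setdiff(A, B)
print(result)  # 1 3 5
```

Note that the order of arguments matters: setdiff(B, A) returns an empty vector here, which is why the question calls setdiff in both directions.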

You can do it in DAX or in the Power Query M language:


M language

You can do this using a left anti join. The M code looks like this:

 = Table.NestedJoin(A,{"Column"},B,{"Column"},"B",JoinKind.LeftAnti)

Delete the new "B" column and you're good to go.

Another way is to use the Table.SelectRows function:

= Table.SelectRows(A, each not List.Contains(B[Column], _[Column]))

DAX language

You just need to filter table A to exclude values in table B:

FILTER(A, NOT( A[Column] IN VALUES( B[Column] ) ) )

Or using the older CONTAINS syntax instead of IN:

FILTER(A, NOT( CONTAINS( VALUES( B[Column] ), B[Column], A[Column] ) ) )

Note: It certainly is possible to use R scripts within the Power Query environment, as vestland points out. It is not currently possible to use R scripts within a DAX expression, as Juan points out.

Reading your question, I'm assuming this:

  1. Your main goal is to do this internally in Power BI
  2. You're not specifically asking how to do it using DAX

The power of R in Power BI is not limited to R visuals. You can load one or more tables and use them as input to R scripts, with any R functionality, using Edit Queries > Transform > Run R Script.

Here's an example using two synthetic data frames and setdiff():

Snippet 1 (from the dplyr::setdiff examples in R)

library(dplyr)
a <- data.frame(column = c(1:10, 10))
b <- data.frame(column = c(1:5, 5))
c <- dplyr::setdiff(a, b)

# Output
# column
# 1      6
# 2      7
# 3      8
# 4      9
# 5     10

Since you didn't describe your expected output, I'm assuming this is what you were after. But beware that if you're not using the dplyr library, base::setdiff() will give a different output:

Snippet 2

c <- base::setdiff(a, b)

# output

# column
# 1       1
# 2       2
# 3       3
# 4       4
# 5       5
# 6       6
# 7       7
# 8       8
# 9       9
# 10     10
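
The difference comes from what each function compares: base::setdiff() works on the elements of a vector or list, and a data frame is a list of columns, so the values 1 to 5 are never removed. If dplyr is not available, a row-wise difference can be approximated in base R. This is a minimal sketch for a single-column table, not a general replacement for dplyr::setdiff(); the names by_column and by_row are illustrative only.

```r
# Same synthetic data frames as in Snippet 1.
a <- data.frame(column = c(1:10, 10))
b <- data.frame(column = c(1:5, 5))

# base::setdiff() compares list elements, i.e. whole columns. The column of
# `a` is not identical to the column of `b`, so it is kept in full.
by_column <- base::setdiff(a, b)
print(all(1:5 %in% by_column$column))  # TRUE: nothing was subtracted

# Row-wise difference in base R: keep the rows of `a` whose value does not
# occur in `b`, then drop duplicate rows.
by_row <- unique(a[!(a$column %in% b$column), , drop = FALSE])
rownames(by_row) <- NULL
print(by_row$column)  # 6 7 8 9 10
```

For tables with several columns the %in% test would have to compare whole rows, which is the case dplyr::setdiff() handles directly.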

And if you carefully follow the steps in this post you will be able to end up with this in Power BI. But here's the essence of it: To reproduce the example, go to Edit Queries (Power Query Editor) > Enter Data and click OK. Then insert an R script using Transform > Run R script and paste in the snippet above.
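
As a rough sketch of what such a script could look like when it works on the query's own data: the Run R script step exposes the current table as a data frame named dataset, and data frames created by the script are offered as result tables. The second table `other` and the column name `column` are assumptions made for illustration; in a real report they would come from your own queries. Base R is used so the sketch does not depend on dplyr being installed.

```r
# Inside Power BI's Run R script step, `dataset` holds the input table.
# The fallback below exists only so this sketch also runs outside Power BI.
if (!exists("dataset")) {
  dataset <- data.frame(column = c(1:10, 10))
}

# Hypothetical second table to subtract (an assumption for this example).
other <- data.frame(column = c(1:5, 5))

# Rows of `dataset` with no matching value in `other`, duplicates removed.
output <- unique(dataset[!(dataset$column %in% other$column), , drop = FALSE])
rownames(output) <- NULL
```

After the step runs, picking the `output` table in the Power Query Editor should give the rows 6 to 10 for the synthetic data used here.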


If anything is unclear, or if you're not able to reproduce the result, let me know.
