crossfilter dimension on 2 fields

My data looks like this:

field1,field2,value1,value2
a,b,1,1
b,a,2,2
c,a,3,5
b,c,6,7
d,a,6,7

I don't have a good way of rearranging that data, so let's assume the data has to stay like this.

I want to create a dimension on field1 and field2 combined: a single dimension that would take the union of all values in both field1 and field2 (in my example, the values should be [a,b,c,d]).

As a reduce function you can assume reduceSum on value2, for example (allowing double counting for now).

(I have tagged dc.js and reductio because this could be useful for users of those libraries.)

First I need to point out that your data is denormalized, so the counts you get might be somewhat confusing, no matter what technique you use.

In standard usage of crossfilter, each row will be counted in exactly one bin, and all the bins in a group will add up to 100%. However, in your case, each row will be counted twice (unless the two fields are the same), so for example a pie chart wouldn't make any sense.

That said, the "tag dimension" feature is perfect for what you're trying to do.

The dimension declaration could be as simple as:

var tagDimension = cf.dimension(function(d) { return [d.field1,d.field2]; }, true);

Now each row will get counted twice: this dimension and its associated groups will act exactly as if each of the rows were duplicated, with one copy indexed by field1 and the other by field2.
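To make the double counting concrete, here is a minimal plain-JavaScript sketch of what a group on that dimension would report for the sample data, roughly what `tagDimension.group().reduceSum(function(d) { return d.value2; })` is expected to produce. It does not use crossfilter at all: `sumByEitherField` is a hypothetical helper written only for illustration, and treating a row with identical fields as a single entry is an assumption taken from the description in this answer.

```javascript
// Sample rows from the question, parsed into objects.
const rows = [
  { field1: "a", field2: "b", value1: 1, value2: 1 },
  { field1: "b", field2: "a", value1: 2, value2: 2 },
  { field1: "c", field2: "a", value1: 3, value2: 5 },
  { field1: "b", field2: "c", value1: 6, value2: 7 },
  { field1: "d", field2: "a", value1: 6, value2: 7 },
];

// Hypothetical helper (not part of crossfilter): sums value2 per key,
// indexing each row under both field1 and field2.
// Assumption: a row whose two fields are equal is indexed only once.
function sumByEitherField(data) {
  const sums = new Map();
  for (const d of data) {
    for (const key of new Set([d.field1, d.field2])) {
      sums.set(key, (sums.get(key) || 0) + d.value2);
    }
  }
  // Return {key, value} pairs sorted by key, similar in shape to group.all().
  return [...sums.entries()]
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([key, value]) => ({ key, value }));
}

const tagSums = sumByEitherField(rows);
console.log(tagSums);
```

The keys come out as the union a, b, c, d, and the sums add up to twice the plain total of value2, which is the double counting mentioned above.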

If you made, say, a bar chart with this, the total count across the bars will be 2N (where N is the number of rows) minus the number of rows where field1 === field2. If you click on bar 'b', all rows which have 'b' in either field will get selected. This only affects groups built on this dimension, so any other charts will only see one copy of each row.
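The counting and filtering behaviour described above can be checked by hand on the sample data. The following is a plain-JavaScript sketch, not crossfilter code: `keysOf` is a hypothetical helper, and the single-entry handling of rows with identical fields is assumed from the description in this answer.

```javascript
// Sample rows from the question, parsed into objects.
const rows = [
  { field1: "a", field2: "b", value1: 1, value2: 1 },
  { field1: "b", field2: "a", value1: 2, value2: 2 },
  { field1: "c", field2: "a", value1: 3, value2: 5 },
  { field1: "b", field2: "c", value1: 6, value2: 7 },
  { field1: "d", field2: "a", value1: 6, value2: 7 },
];

// Hypothetical helper: the keys a row is indexed under
// (assumption: only one key when field1 === field2).
const keysOf = (d) => [...new Set([d.field1, d.field2])];

// Total count over all bars of a chart built on the combined dimension.
const totalCount = rows.reduce((n, d) => n + keysOf(d).length, 0);
const sameFieldRows = rows.filter((d) => d.field1 === d.field2).length;
const expectedTotal = 2 * rows.length - sameFieldRows;

// "Clicking bar 'b'": keep the rows with 'b' in either field.
const selected = rows.filter((d) => keysOf(d).includes("b"));

// A chart on some other dimension sees each selected row once,
// e.g. when summing value1.
const value1Sum = selected.reduce((s, d) => s + d.value1, 0);

console.log(totalCount, expectedTotal, selected.length, value1Sum);
```

With the five sample rows, none of which has field1 === field2, the bars total 2N = 10, while selecting 'b' leaves three distinct rows for every other chart.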
