
Percentage Change in dc.js/crossfilter

I just started with dc.js and was looking at the NASDAQ example on the main site: https://dc-js.github.io/dc.js/

I created a Fiddle with some sample dummy data and just the two relevant charts for this question.

Similar to the NASDAQ example, I want a bubble chart whose Y-axis is the % change in value over a timespan controlled by a brush in a different chart. The code for the NASDAQ example does the following:

    var yearlyPerformanceGroup = yearlyDimension.group().reduce(
    /* callback for when data is added to the current filter results */
    function (p, v) {
        ++p.count;
        p.absGain += v.close - v.open;
        p.fluctuation += Math.abs(v.close - v.open);
        p.sumIndex += (v.open + v.close) / 2;
        p.avgIndex = p.sumIndex / p.count;
        p.percentageGain = p.avgIndex ? (p.absGain / p.avgIndex) * 100 : 0;
        p.fluctuationPercentage = p.avgIndex ? (p.fluctuation / p.avgIndex) * 100 : 0;
        return p;
    },
    /* callback for when data is removed from the current filter results */
    function (p, v) {
        --p.count;
        p.absGain -= v.close - v.open;
        p.fluctuation -= Math.abs(v.close - v.open);
        p.sumIndex -= (v.open + v.close) / 2;
        p.avgIndex = p.count ? p.sumIndex / p.count : 0;
        p.percentageGain = p.avgIndex ? (p.absGain / p.avgIndex) * 100 : 0;
        p.fluctuationPercentage = p.avgIndex ? (p.fluctuation / p.avgIndex) * 100 : 0;
        return p;
    },
    /* initialize p */
    function () {
        return {
            count: 0,
            absGain: 0,
            fluctuation: 0,
            fluctuationPercentage: 0,
            sumIndex: 0,
            avgIndex: 0,
            percentageGain: 0
        };
    }
    );

which I currently interpret as summing (close - open) across all data and dividing by the average of the average daily index. But this is not a percent change formula I am familiar with (e.g. (new - old) / old x 100).
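To make the difference between the two formulas concrete, here is a small sketch that runs both on two made-up days of data. The `days` array and both function names are hypothetical and only serve the illustration; `gainOverAverage` mirrors the arithmetic of the NASDAQ reducer above rather than reproducing it.

```javascript
// Hypothetical daily records, for illustration only.
var days = [
  { open: 100, close: 110 },
  { open: 110, close: 121 }
];

// NASDAQ-example style: summed (close - open) relative to the
// average of the daily midpoints.
function gainOverAverage(records) {
  var absGain = 0, sumIndex = 0;
  records.forEach(function (v) {
    absGain += v.close - v.open;
    sumIndex += (v.open + v.close) / 2;
  });
  var avgIndex = records.length ? sumIndex / records.length : 0;
  return avgIndex ? (absGain / avgIndex) * 100 : 0;
}

// Conventional percent change: (new - old) / old * 100.
function percentChange(oldValue, newValue) {
  return oldValue ? ((newValue - oldValue) / oldValue) * 100 : 0;
}

console.log(gainOverAverage(days));  // roughly 19.05
console.log(percentChange(100, 121)); // roughly 21
```

The two numbers differ because the first divides by an average level over the period, while the second divides by the starting value only.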

While it seems to work for the NASDAQ example, my data would be more like the following:

country_id,received_week,product_type,my_quantity,my_revenue,country_other_quantity
3,2017-04-02,1,1,361,93881
1,2017-04-02,4,45,140,93881
2,2017-04-02,4,2,30,93881
3,2017-04-02,3,1,462,93881
2,2017-04-02,3,48,497,93881

etc., over many months and product_types.

Let's say I was interested in computing the percent change for a particular country. How do I get the start and end quantities for a given country so I can compute change as (end - start) / start * 100?

I was thinking of something such as the following (assuming I set up the proper dimensions and everything):

var country_dim = ndx.dimension(function (d) { return d['country_id']; })
var received_day_dim = ndx.dimension(function (d) { return d['received_day']; })
var date_min = received_day_dim.bottom(1)[0]['received_day']
var date_max = received_day_dim.top(1)[0]['received_day']

Then in my custom reduce function, currently in the vein of the example (wrong):

var statsByCountry = country_dim.group().reduce(
          function (p, v) {
              ++p.count;
              p.units += +v["my_units"];
              p.example_rate = +v['my_units']/(v['quantity_unpacked']*90) //place holder for total units per day per country
              p.sumRate +=  p.opp_buy_rate;
              p.avgRate = p.opp_buy_rate/p.count;
              p.percentageGain = p.avgRate ? (p.opp_buy_rate / p.avgRate) * 100 : 0;
              p.dollars += +v["quantity_unpacked"]/2;
              // p.max_date = v['received_week'].max();
              // p.min_date
              //dateDimension.top(Infinity)[dateDimension.top(Infinity).length - 1]['distance'] - dateDimension.top(Infinity)[0]['distance']


              return p;
          },
          function (p, v) {
              --p.count;
              if (v.region_id > 2) {
                p.test -= 100;
              }
              p.units -= +v["quantity_unpacked"];
              p.opp_buy_rate = +v['quantity_unpacked']/(v['quantity_unpacked']*90) //place holder for total units per day per country
              p.sumRate -=  p.opp_buy_rate;
              p.avgRate = p.count ? p.opp_buy_rate/p.count : 0;
              p.percentageGain = p.avgRate ? (p.opp_buy_rate / p.avgRate) * 100 : 0;
              p.dollars -= +v["quantity_unpacked"]/2;
              // p.max_date = v['received_week'].max();
              return p;
          },
          function () {
              return {quantity_unpacked: 0,
                      count: 0,
                      units: 0,
                      opp_buy_rate: 0,
                      sumRate: 0,
                      avgRate: 0,
                      percentageGain: 0,
                      dollars: 0,
                      test: 0
              };//, dollars: 0}
          }
  );

and my chart:

country_bubble
    .width(990)
    .height(250)
    .margins({top:10, right: 50, bottom: 30, left:80})
    .dimension(country_dim)
    .group(statsByCountry)
    .keyAccessor(function (p) {
      return p.value.units;
    })
    .valueAccessor(function (p) { // y value
      return p.value.percentageGain;
    })
    .radiusValueAccessor(function (p) { //radius
        return p.value.dollars/10000000;
    })
    .maxBubbleRelativeSize(0.05)
    .elasticX(true)
    .elasticY(true)        
    .elasticRadius(true)
    .x(d3.scale.linear())
    .y(d3.scale.linear())
    // .x(d3.scale.linear().domain([0, 1.2*bubble_xmax]))
    // .y(d3.scale.linear().domain([0, 10000000]))
    .r(d3.scale.linear().domain([0, 10]))
    .yAxisPadding('25%')
    .xAxisPadding('15%')
    .renderHorizontalGridLines(true)

    .renderVerticalGridLines(true)        

    .on('renderlet', function(chart, filter){
    chart.svg().select(".chart-body").attr("clip-path",null);
 });

I originally thought of having something similar to the following in statsByCountry:

          if (v.received_day == date_min) {
            p.start_value += v.my_quantity;
          }
          if (v.received_day == date_max) {
            p.end_value += v.my_quantity;
          }

This seems a bit clumsy? And if I do this, I don't think it will continually update as other filters change (say time or product)? Ethan suggested I use fake groups, but I'm a bit lost.

With the working fiddle, we can demonstrate one way to do this. I don't really think this is the best way to go about it, but it is the Crossfilter way.

First you need to maintain an ordered array of all data in a group as part of the group using your custom reduce function:

var statsByCountry = country_dim.group().reduce(
  function(p, v) {
    ++p.count;
    p.units += +v["my_quantity"];
    p.country_rate = p.units / (1.0 * v['country_other_quantity']) //hopefully total sum of my_quantity divided by the fixed country_other_quantity for that week
    p.percent_change = 50 //placeholder for now, ideally this would be the change in units over the timespan brush on the bottom chart
    p.dollars += +v["my_revenue"];

    var i = bisect(p.data, v, 0, p.data.length);
    p.data.splice(i, 0, v);
    return p;
  },
  function(p, v) {
    --p.count;
    p.units -= +v["my_quantity"];
    p.country_rate = p.units / (1.0 * v['country_other_quantity']) //hopefully total sum of my_quantity divided by the fixed country_other_quantity for that week
    p.percent_change = 50 //placeholder for now, ideally this would be the change in units over the timespan brush on the bottom chart
    p.dollars -= +v["my_revenue"];

    var i = bisect(p.data, v, 0, p.data.length);
    p.data.splice(i, 1)
    return p;
  },
  function() {
    return {
      data: [],
      count: 0,
      units: 0,
      country_rate: 0,
      dollars: 0,
      percent_change: 0
    }; //, dollars: 0}
  }
);

Above, I've updated your reduce function to maintain this ordered array (ordered by received_week) under the .data property. It uses Crossfilter's bisect function to maintain order efficiently.
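The reducer above does not show how `bisect` is defined, so the following is only a sketch of the idea in plain JavaScript, without Crossfilter. The helper names (`bisectLeftByWeek`, `insertSorted`, `removeSorted`) are hypothetical. It assumes that `received_week` values compare correctly with `<` (ISO date strings or Date objects both do), and that the remove callback receives the same record object that was added.

```javascript
// Leftmost index at which a record with week `key` can be inserted
// while keeping `arr` sorted ascending by received_week.
function bisectLeftByWeek(arr, key) {
  var lo = 0, hi = arr.length;
  while (lo < hi) {
    var mid = (lo + hi) >>> 1;
    if (arr[mid].received_week < key) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Use in the reduce-add callback: insertSorted(p.data, v).
function insertSorted(arr, record) {
  arr.splice(bisectLeftByWeek(arr, record.received_week), 0, record);
}

// Use in the reduce-remove callback: removeSorted(p.data, v).
// Several records can share a week, so scan forward from the leftmost
// candidate and remove the exact record, not just whatever sits at index i.
function removeSorted(arr, record) {
  var i = bisectLeftByWeek(arr, record.received_week);
  while (i < arr.length && !(record.received_week < arr[i].received_week)) {
    if (arr[i] === record) {
      arr.splice(i, 1);
      return;
    }
    i++;
  }
}
```

Looking up the record by identity on removal matters once several rows share the same week, which is the case in the sample data (several product types per country and week).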

Then in your valueAccessor you want to actually calculate your change in value based on this data:

  .valueAccessor(function(p) { // y value
    // Calculate change in units/day from first day to last day.
    var firstDay = p.value.data[p.value.data.length-1].received_week.toString();
    var lastDay = p.value.data[0].received_week.toString();
    var firstDayUnits = d3.sum(p.value.data, function(d) { return d.received_week.toString() === firstDay ? d.my_quantity : 0 })
    var lastDayUnits = d3.sum(p.value.data, function(d) { return d.received_week.toString() === lastDay ? d.my_quantity : 0 })
    return lastDayUnits - firstDayUnits;
  })

You do this in the value accessor because it only runs once per filter change, whereas the reduce functions run once per record added or removed, which can be thousands of times per filter.

If you want to calculate % change, you can do this here as well, but the key question for % calculations is always "% of what?", and the answer to that wasn't clear to me from your question.
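As one possible reading of "% of what?", the sketch below takes the first week's units as the base, i.e. (end - start) / start * 100 as written in the question. That choice is an assumption, as is the function name `percentChangeOverSpan`. It expects the array to be sorted ascending by `received_week`, and returns 0 when the group is empty or the base is zero so the chart does not receive NaN or Infinity.

```javascript
// sortedData: records for one group, sorted ascending by received_week.
function percentChangeOverSpan(sortedData) {
  if (!sortedData.length) return 0; // group is empty under the current filters

  // String() lets the comparison work for both date strings and Date objects.
  var firstWeek = String(sortedData[0].received_week);
  var lastWeek = String(sortedData[sortedData.length - 1].received_week);

  var startUnits = 0, endUnits = 0;
  sortedData.forEach(function (d) {
    var week = String(d.received_week);
    if (week === firstWeek) startUnits += +d.my_quantity;
    if (week === lastWeek) endUnits += +d.my_quantity;
  });

  return startUnits ? ((endUnits - startUnits) / startUnits) * 100 : 0;
}
```

In the chart this could be wired up as `.valueAccessor(function (p) { return percentChangeOverSpan(p.value.data); })`, provided `p.value.data` really is kept in ascending order.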

It's worth noting that with this approach your group structure is going to get really big, because you are storing your entire data set in the groups. If you run into performance problems while filtering, I would still recommend moving away from this approach and towards one based on a fake group.
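For readers unfamiliar with the term, a fake group is just an object with an `all()` method that post-processes a real group's output each time the chart asks for data. The sketch below is one hypothetical way to apply it here, not the approach used in the fiddle. It assumes a source group keyed by a `"country_id|received_week"` string whose value is the summed `my_quantity`, and it treats zero-valued bins as weeks excluded by the current filters. The name `percentChangeByCountry` is made up for this example.

```javascript
// weeklyGroup: anything with all() returning [{key: "country|week", value: units}].
function percentChangeByCountry(weeklyGroup) {
  return {
    all: function () {
      var spans = {}; // country -> { first: {week, units}, last: {week, units} }

      weeklyGroup.all().forEach(function (kv) {
        if (!kv.value) return; // bin is empty under the current filters
        var parts = kv.key.split('|');
        var country = parts[0], week = parts[1];
        var s = spans[country] || (spans[country] = { first: null, last: null });
        if (!s.first || week < s.first.week) s.first = { week: week, units: kv.value };
        if (!s.last || week > s.last.week) s.last = { week: week, units: kv.value };
      });

      // One entry per country: change from its earliest to its latest visible week.
      return Object.keys(spans).map(function (country) {
        var s = spans[country];
        return {
          key: country,
          value: ((s.last.units - s.first.units) / s.first.units) * 100
        };
      });
    }
  };
}
```

Such an object can be passed to a chart's `.group()` in place of the real group. The groups themselves then hold only one number per country and week instead of every record.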

Working updated fiddle: https://jsfiddle.net/vysbxd1h/1/

