Loading in a CSV file as a Map (D3 and JavaScript)

I've looked through the JavaScript and D3 documentation, but couldn't find anything that helps me out...

Is it possible to load in a CSV file that looks like so:

header, header
string1, string
string2, string
...
stringN, string

And store it in a Map? Ideally using D3's CSV loader?

d3.csv("demoCSVOne.csv", function(errorOne, one) {
    d3.csv("demoCSVTwo.csv", function(errorTwo, two) {

        // do something

    });
});

CSV example

String, Integer
one, 2345
two, 34536
three, 24536

For Mark: I'm trying to compute an average for each key across the multiple CSVs that have been selected. Here a, b, c, etc. represent the value for a key in a given CSV:

[(a_csv1 + a_csv2 + a_csv3)/3]
[(b_csv1 + b_csv2 + b_csv3)/3]
[(c_csv1 + c_csv2 + c_csv3)/3]

These averages would then need to be stored in a new array, along with the key that each average represents. I'm aiming for it to look like this:

key, average
     a, 123
     b, 456
     c, 789

Here's how I would do it. Note that I just used a plain JavaScript object as my map, instead of an ES6 Map object.

d3.csv('csv1.csv', function(e1, one) {
  if (e1) throw e1;

  d3.csv('csv2.csv', function(e2, two) {
    if (e2) throw e2;

    // our final map
    var aveMap = {};

    // concat the two csv arrays together
    one.concat(two).forEach((d) => {
      if (!aveMap[d.String]) aveMap[d.String] = {
        values: []
      };
      // build array of values by key (unary + converts the string to a number)
      aveMap[d.String].values.push(+d.Integer);
    });

    // loop over the keys and calculate the mean
    Object.keys(aveMap).forEach((k) => {
      aveMap[k].mean = d3.mean(aveMap[k].values);
    });

  });
});

This produces a final data structure like:

{
  "one": {
    "values": [
      2345,
      2323
    ],
    "mean": 2334
  },
  "two": {
    "values": [
      34536,
      45456
    ],
    "mean": 39996
  },
  "three": {
    "values": [
      24536,
      56567
    ],
    "mean": 40551.5
  }
}

See it running here.
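
If an actual ES6 Map is preferred over a plain object, the same grouping could be sketched roughly as below. This is only an illustration, not part of the code above: `averageByKey` and the inline sample rows are hypothetical, the rows stand in for what the two `d3.csv` callbacks would deliver (the second file's values are taken from the output shown above), and it assumes the parsed column names are exactly `String` and `Integer`. The last step flattens the Map into the key/average rows the question asks for.

```javascript
// Hypothetical helper: group parsed CSV rows by key and average their values.
// Assumes each row looks like { String: 'one', Integer: '2345' }.
function averageByKey(...csvs) {
  const sums = new Map(); // key -> { sum, count }
  for (const rows of csvs) {
    for (const row of rows) {
      const entry = sums.get(row.String) || { sum: 0, count: 0 };
      entry.sum += +row.Integer; // coerce the CSV string to a number
      entry.count += 1;
      sums.set(row.String, entry);
    }
  }
  // Second pass over the distinct keys only (not the rows) to get each mean.
  const averages = new Map();
  for (const [key, { sum, count }] of sums) {
    averages.set(key, sum / count);
  }
  return averages;
}

// Sample rows standing in for the parsed contents of csv1.csv and csv2.csv.
const one = [
  { String: 'one', Integer: '2345' },
  { String: 'two', Integer: '34536' },
  { String: 'three', Integer: '24536' }
];
const two = [
  { String: 'one', Integer: '2323' },
  { String: 'two', Integer: '45456' },
  { String: 'three', Integer: '56567' }
];

const averages = averageByKey(one, two);

// Flatten the Map into [{ key, average }, ...] rows.
const table = Array.from(averages, ([key, average]) => ({ key, average }));
console.log(table);
```

A Map keeps insertion order and avoids clashes with inherited object properties, which is why it may be the safer choice when the keys come from arbitrary file contents.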

Edits for Comments

Holding the extra values property in memory isn't really making this code slower. If it's not performant, there are two likely reasons: you have lots of CSV files, or they are huge CSV files. For performance, I'd switch to something like this:

var q = d3.queue();
['csv1.csv', 'csv2.csv'].forEach((c) => {
  q.defer(d3.csv, c);
});

// the first callback argument is the error, the second the array of results
q.awaitAll(function(error, csvs){
    if (error) throw error;

    var arr = d3.merge(csvs),
        aveMap = {};

    arr.forEach((d) => {
      if (!aveMap[d.String]) {
        aveMap[d.String] = {
          sum: 0,
          count: 0
        };
      }
      var obj = aveMap[d.String];
      obj.sum += +d.Integer;
      obj.count += 1;

      // once a key has been seen in every file, its mean is final
      if ( obj.count === csvs.length ){
        obj.mean = obj.sum / obj.count;
      }
    });

    console.log(aveMap);
});

First, by using d3.queue, you download the CSV files concurrently instead of one after another. Second, you can adjust the input to .defer to download only the files the user actually wants. Third, you'll notice that I'm now calculating the average inside the first loop; if these are large datasets, you want to minimize the looping over them. Fourth, I'm now summing as I go. Of course, this refactor assumes that each key exists in each CSV file exactly once.
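
If a key can be missing from some files, or appear more than once in one, the `obj.count === csvs.length` check would leave `mean` unset or set it too early. One possible way around that, sketched here under the assumption that the rows are already merged into a single array (as `d3.merge(csvs)` would return), is to keep summing in the row loop and divide in a short second loop over the distinct keys only. `meansFromRows` and the sample data are hypothetical, and the mean is taken over the rows actually present for a key rather than over the number of files.

```javascript
// Hypothetical helper: running sum/count per key, mean computed at the end.
function meansFromRows(rows) {
  // Null-prototype object so keys such as 'constructor' cannot collide
  // with inherited properties.
  const aveMap = Object.create(null);
  rows.forEach((d) => {
    if (!aveMap[d.String]) {
      aveMap[d.String] = { sum: 0, count: 0 };
    }
    aveMap[d.String].sum += +d.Integer;
    aveMap[d.String].count += 1;
  });
  // Loops over the distinct keys, not the rows, so it stays cheap.
  Object.keys(aveMap).forEach((k) => {
    aveMap[k].mean = aveMap[k].sum / aveMap[k].count;
  });
  return aveMap;
}

// Made-up sample in which 'three' is missing from the second file.
const merged = [
  { String: 'one', Integer: '2345' },
  { String: 'three', Integer: '24536' },
  { String: 'one', Integer: '2323' }
];
const result = meansFromRows(merged);
console.log(result.one.mean, result.three.mean);
```

The extra pass touches each key once, so for large files with relatively few distinct keys its cost should be small compared with the row loop.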
