
What's the best way to remove a column from table data?

Consider the following data:

[
  { time: '5:00', name: 'bran', fruit: 'pear', numEaten: 3},
  { time: '5:00', name: 'rickon', fruit: 'apple', numEaten: 2},
  { time: '6:00', name: 'bran', fruit: 'apple', numEaten: 5},
  { time: '6:00', name: 'rickon', fruit: 'grape', numEaten: 1},
  { time: '6:00', name: 'bran', fruit: 'pear', numEaten: 2},
  { time: '6:00', name: 'eddard', fruit: 'pear', numEaten: 2},
  { time: '7:00', name: 'rickon', fruit: 'apple', numEaten: 7}
]

What I want to do is remove a column and sum the 'numEaten' values of all rows whose remaining columns match. In other words: you don't care when a fruit was eaten, only who ate how many of what. The output table would look like:

[
  {name: 'bran', fruit: 'pear', numEaten: 5},
  {name: 'bran', fruit: 'apple', numEaten: 5},
  {name: 'rickon', fruit: 'apple', numEaten: 9},
  {name: 'rickon', fruit: 'grape', numEaten: 1},
  {name: 'eddard', fruit: 'pear', numEaten: 2},
]

I have been looking over the various JavaScript array prototype functions and the extensions in Underscore, but I can't see a particularly elegant way to do this. I would like to have a function with the signature:

function aggregate(data, column, aggregateColumn) // aggregate(data, 'time', 'numEaten')

that performs this operation. Conceptually, I was considering running _.groupBy() on every column other than column and aggregateColumn, but making that work seems a bit hacky. Is there a better way?

Edit

It seems there isn't a single-line solution for this one, so I'm posting what I came up with after incorporating feedback from the answers below. Note that, unlike the original question, this takes the column(s) to keep rather than the column to remove, and it works for any schema.

  var aggregate = function(data, aggregateColumn, keepColumns) {
    keepColumns = keepColumns || [];
    if(!Array.isArray(keepColumns)) {
      keepColumns = [ keepColumns ];
    }

    var grouped = _.groupBy(data, function(d) {
      return _.reduce(keepColumns, function(o, col) {
        return o + d[col] + '-';
      }, '');      
    });

    return _.map(grouped, function(mapData) {
      var reduced = _.reduce(keepColumns, function(o, col) {
          o[col] = mapData[0][col];
          return o;
        }, {}
      );

      reduced[aggregateColumn] = _.reduce(mapData, function(o, aggrData) {
          return o + aggrData[aggregateColumn];
        }, 0
      );

      return reduced;
    });
  }
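
One caveat with the concatenated key above: joining values with '-' can make two different rows land in the same group when a value itself contains the separator. A minimal sketch of one way around this, in plain JavaScript without Underscore, is to serialize the kept values as a JSON array. The helper name groupKey and the sample rows are hypothetical and not part of the code above.

```javascript
// Hypothetical helper: build a grouping key from the values of keepColumns.
// Serializing an array with JSON.stringify keeps value boundaries unambiguous.
function groupKey(row, keepColumns) {
  return JSON.stringify(keepColumns.map(function (col) {
    return row[col];
  }));
}

// Two different rows whose values contain the '-' separator.
var rowA = { name: 'a-b', fruit: 'c' };
var rowB = { name: 'a', fruit: 'b-c' };

// Plain concatenation cannot tell them apart...
var joinedA = rowA.name + '-' + rowA.fruit + '-'; // 'a-b-c-'
var joinedB = rowB.name + '-' + rowB.fruit + '-'; // 'a-b-c-'

// ...while the JSON keys stay distinct.
var keyA = groupKey(rowA, ['name', 'fruit']); // '["a-b","c"]'
var keyB = groupKey(rowB, ['name', 'fruit']); // '["a","b-c"]'
```

Returning groupKey(d, keepColumns) from the _.groupBy callback should be a drop-in change, since the key is only used for grouping.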

Here's one way to do it in Underscore.

Let's define the initial data:

var data = [
  { time: '5:00', name: 'bran', fruit: 'pear', numEaten: 3},
  { time: '5:00', name: 'rickon', fruit: 'apple', numEaten: 2},
  { time: '6:00', name: 'bran', fruit: 'apple', numEaten: 5},
  { time: '6:00', name: 'rickon', fruit: 'grape', numEaten: 1},
  { time: '6:00', name: 'bran', fruit: 'pear', numEaten: 2},
  { time: '6:00', name: 'eddard', fruit: 'pear', numEaten: 2},
  { time: '7:00', name: 'rickon', fruit: 'apple', numEaten: 7}
]

Then create groups keyed on name and fruit by joining the two values.

var groups = _.groupBy(data, function(value){
        return value.name+ '#' + value.fruit;
    });

We will use this custom sum function later while aggregating.

function sum(numbers) {
    return _.reduce(numbers, function(result, current) {
        return result + parseFloat(current);
    }, 0);
}
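
The parseFloat call only matters when the counts arrive as strings, for example from a CSV file or a form field. A small sketch in plain JavaScript (no Underscore) of the difference, assuming string input; the function names sumWithParse and sumWithoutParse are illustrative only:

```javascript
// Same shape as the sum function above, written with Array.prototype.reduce.
function sumWithParse(values) {
  return values.reduce(function (result, current) {
    return result + parseFloat(current);
  }, 0);
}

// Without parseFloat, + falls back to string concatenation
// as soon as it meets a string operand.
function sumWithoutParse(values) {
  return values.reduce(function (result, current) {
    return result + current;
  }, 0);
}

var parsed = sumWithParse(['3', '2']);   // 5
var naive = sumWithoutParse(['3', '2']); // '032'
```

With the numeric numEaten values used in this question, both versions give the same totals.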

Now map each group to a single row by plucking numEaten and summing the values:

var out = _.map(groups, function(group){
        return {
            name: group[0].name,
            fruit: group[0].fruit,
            numEaten: sum(_.pluck(group, 'numEaten'))
        }
    });

Finally, out contains:

[
  {name: 'bran', fruit: 'pear', numEaten: 5},
  {name: 'bran', fruit: 'apple', numEaten: 5},
  {name: 'rickon', fruit: 'apple', numEaten: 9},
  {name: 'rickon', fruit: 'grape', numEaten: 1},
  {name: 'eddard', fruit: 'pear', numEaten: 2},
]

A generic solution would be easy in pure JavaScript, but I would like to provide this one using Underscore because it feels exciting sometimes!
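
For comparison, a pure JavaScript version might look like the sketch below. It is not taken from any of the answers: the name aggregatePlain is hypothetical, it follows the aggregate(data, column, aggregateColumn) signature from the question, and it assumes the aggregated values are numbers.

```javascript
// Hypothetical pure-JavaScript sketch: drop `column`, sum `aggregateColumn`
// over all rows that agree on every remaining column.
function aggregatePlain(data, column, aggregateColumn) {
  var groups = new Map(); // preserves the order in which groups first appear
  data.forEach(function (row) {
    // Columns that identify a group: everything except the two named ones.
    var keep = Object.keys(row).filter(function (k) {
      return k !== column && k !== aggregateColumn;
    });
    // Sort a copy of the names so the key does not depend on property order.
    var key = JSON.stringify(keep.slice().sort().map(function (k) {
      return [k, row[k]];
    }));
    var target = groups.get(key);
    if (!target) {
      target = {};
      keep.forEach(function (k) { target[k] = row[k]; });
      target[aggregateColumn] = 0;
      groups.set(key, target);
    }
    target[aggregateColumn] += row[aggregateColumn];
  });
  return Array.from(groups.values());
}

var fruitData = [
  { time: '5:00', name: 'bran', fruit: 'pear', numEaten: 3 },
  { time: '5:00', name: 'rickon', fruit: 'apple', numEaten: 2 },
  { time: '6:00', name: 'bran', fruit: 'apple', numEaten: 5 },
  { time: '6:00', name: 'rickon', fruit: 'grape', numEaten: 1 },
  { time: '6:00', name: 'bran', fruit: 'pear', numEaten: 2 },
  { time: '6:00', name: 'eddard', fruit: 'pear', numEaten: 2 },
  { time: '7:00', name: 'rickon', fruit: 'apple', numEaten: 7 }
];

var totals = aggregatePlain(fruitData, 'time', 'numEaten');
// totals has 5 rows; the first is { name: 'bran', fruit: 'pear', numEaten: 5 }
```

The Map makes this a single pass over the data, and the input rows are left unmodified.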

Since _.uniq compares objects by identity rather than by content, I pass it JSON.stringify as the iteratee so that duplicate objects are removed.

Here is the tested aggregate function:

  var objs = [
    { time: '5:00', name: 'bran', fruit: 'pear', numEaten: 3},
    { time: '6:00', name: 'bran', fruit: 'pear', numEaten: 2},  
    { time: '6:00', name: 'bran', fruit: 'apple', numEaten: 5},  
    { time: '5:00', name: 'rickon', fruit: 'apple', numEaten: 2},
    { time: '7:00', name: 'rickon', fruit: 'apple', numEaten: 7},  
    { time: '6:00', name: 'rickon', fruit: 'grape', numEaten: 1},  
    { time: '6:00', name: 'eddard', fruit: 'pear', numEaten: 2}
    ];

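function aggregate(data, column, aggregateColumn) {
    var res = [];
    _.each(data, function(item) {
        var comparer = {},   // columns that must match: everything but column and aggregateColumn
            compared = {};   // output row: everything but column

        for (var k in item) {
            if (k != column) {
                compared[k] = item[k];
                if (k != aggregateColumn)
                    comparer[k] = item[k];
            }
        }
        // Add the aggregate values of all other rows that match on the kept columns
        _.each(_.where(_.without(data, item), comparer), function(aggregable) {
            compared[aggregateColumn] += aggregable[aggregateColumn];
        });
        res.push(compared);
    });
    // Every row of a group now carries the group total, so drop the duplicates
    return _.uniq(res, function(item) { return JSON.stringify(item); });
}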

    ///usage
    var o=aggregate(objs, 'time', 'numEaten');
    console.log({'o':o});

Have a look at this Fiddle
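
One thing to keep in mind with the JSON.stringify trick: it treats two objects as duplicates only if their properties were created in the same order. A minimal sketch in plain JavaScript (no Underscore), where uniqByJson is a hypothetical stand-in for the _.uniq call above:

```javascript
// Hypothetical helper: keep the first object for each distinct JSON string.
function uniqByJson(list) {
  var seen = new Set();
  return list.filter(function (item) {
    var key = JSON.stringify(item);
    if (seen.has(key)) {
      return false;
    }
    seen.add(key);
    return true;
  });
}

var deduped = uniqByJson([
  { name: 'bran', fruit: 'pear', numEaten: 5 },
  { name: 'bran', fruit: 'pear', numEaten: 5 }, // exact duplicate, removed
  { fruit: 'pear', name: 'bran', numEaten: 5 }  // same data, different key order, kept
]);
// deduped.length is 2
```

In the aggregate function above this is not a problem as long as every input row lists its properties in the same order, which is the case for the sample data.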

The fact that you're talking about "columns" suggests that you have a table in mind, when in fact you're dealing with an array of plain objects (string-keyed maps).
There is no "beautiful" or out-of-the-box solution to your problem, partly because JavaScript is prototype-based.

You can choose between a for loop and Array.forEach. I prefer the former.
Also, I'm returning a new array here instead of modifying the old one in-place.

function aggregate(data, column, aggregateColumn)
{
    var array = [];
    // Just work the array
    for(var i = 0; i < data.length; i++)
    {
        var currentOld = data[i];
        var found = false;
        // Label the loop, so we can control it
        outside:
        // Check if the current type already exists in the new array
        for(var j = 0; j < array.length; j++)
        {
            var currentNew = array[j];
            // Check if all properties match
            for(var property in currentOld)
            {
                // Skip properties that match column or aggregateColumn
                if(property == column || property == aggregateColumn)
                {
                    continue;
                }
                // Now check if their values match
                if(currentOld[property] != currentNew[property])
                {
                    // If they don't match, continue the outer loop
                    continue outside;
                }
            }
            // At this point, all properties matched, so we aggregate
            currentNew[aggregateColumn] += currentOld[aggregateColumn];
            // Set the flag to indicate that we found it
            found = true;
            // And end the loop
            break;
        }
        // If the current type is not yet in the new array, we need to put it there
        if(!found)
        {
            // Create a copy of it (assuming your data are trivial objects)
            var copy = JSON.parse(JSON.stringify(currentOld));
            // Remove your "column"
            delete copy[column];
            // And add it
            array.push(copy);
        }
    }
    return array;
}

Testing the function produces the array you asked for, only in a different order, since it keeps the order of first appearance in the original array rather than sorting it.
I assume you know how to sort an array though. ;)
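
In case the ordering matters, here is a minimal sketch of one possible sort, by name and then by fruit. The comparator name byNameThenFruit and the choice of sort keys are assumptions, since the question doesn't specify an order:

```javascript
// Hypothetical comparator: order rows by name, breaking ties by fruit.
function byNameThenFruit(a, b) {
  return a.name.localeCompare(b.name) || a.fruit.localeCompare(b.fruit);
}

var rows = [
  { name: 'rickon', fruit: 'grape', numEaten: 1 },
  { name: 'bran', fruit: 'pear', numEaten: 5 },
  { name: 'eddard', fruit: 'pear', numEaten: 2 },
  { name: 'bran', fruit: 'apple', numEaten: 5 }
];

// slice() copies the array first, so the original order is left untouched.
var sorted = rows.slice().sort(byNameThenFruit);
// sorted: bran/apple, bran/pear, eddard/pear, rickon/grape
```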

Using the sum function from John Galt's excellent answer, here's a generic version:

function aggregate(data, aggregateColumn, keepColumns){

   var groups = _.groupBy(data, function(item){
      return _.values(_.pick(item, keepColumns)).join('#')
   });

   return _.map(groups, function(group){
       return _.extend( _.pick(group[0], keepColumns), 
          _.object([aggregateColumn], [sum(_.pluck(group, aggregateColumn))]));
   }); 
}


 