
Sort array of objects, then group by id (JavaScript)

I have an array of objects that requires some unconventional sorting. Each object contains an id string and a num int. Some unsorted dummy data:

[{"id":"ABC","num":111},
{"id":"DEF","num":130},
{"id":"XYZ","num":115},
{"id":"QRS","num":98},
{"id":"DEF","num":119},
{"id":"ABC","num":137},
{"id":"LMN","num":122},
{"id":"ABC","num":108}]

I need to sort ascending by num - BUT, if an id appears more than once, additional records for that id should "float up" in position to reside below its sibling with the next smallest num.

The end result would be:

[{"id":"QRS","num":98},
{"id":"ABC","num":108},
{"id":"ABC","num":111},
{"id":"ABC","num":137},
{"id":"XYZ","num":115},
{"id":"DEF","num":119},
{"id":"DEF","num":130},
{"id":"LMN","num":122}]

The actual array could contain 15k+ records, so any efficient solutions would be greatly appreciated. A .sort(function(a,b) {...}) with some nested "ifs" works fine to get a basic sort in place, but I'm stumped on the "float up" logic. Thanks in advance.

EDIT: what I have so far (basic nested sort):

const sortedData = origData.sort(function(a, b) {
  if (a.num === b.num) {
    if (a.id === b.id) {
      return a.id.localeCompare(b.id);
    }
  }
  return a.num - b.num;
});

One approach would be to

  • first group by id
  • then sort each group by num
  • then sort the groups by min(num)
  • then concat the groups

let data = [
  {"id":"ABC","num":111}, {"id":"DEF","num":130}, {"id":"XYZ","num":115}, {"id":"QRS","num":98},
  {"id":"DEF","num":119}, {"id":"ABC","num":137}, {"id":"LMN","num":122}, {"id":"ABC","num":108}
];

const groupById = (acc, item) => {
  const id = item.id;
  if (id in acc) {
    acc[id].push(item);
  } else {
    acc[id] = [item];
  }
  return acc;
};

const sortByNum = (a, b) => a.num - b.num;
const sortByMinNum = (a, b) => a[0].num - b[0].num;

const groups = Object.values(data.reduce(groupById, {}))
  .map(group => group.sort(sortByNum))
  .sort(sortByMinNum);

console.log([].concat(...groups));

Another approach would be to

  • first determine the minimal num by id
  • then sort by minNum first and by num second

let data = [
  {"id":"ABC","num":111}, {"id":"DEF","num":130}, {"id":"XYZ","num":115}, {"id":"QRS","num":98},
  {"id":"DEF","num":119}, {"id":"ABC","num":137}, {"id":"LMN","num":122}, {"id":"ABC","num":108}
];

const minNumById = data.reduce((acc, item) => {
  const id = item.id;
  if (id in acc) {
    acc[id] = Math.min(acc[id], item.num);
  } else {
    acc[id] = item.num;
  }
  return acc;
}, {});

data.sort((a, b) => minNumById[a.id] - minNumById[b.id] || a.num - b.num);

console.log(data);

Edit: Wow. Looking at this code two years later is weird.

Reading these snippets, I realize there's a flaw in the second approach. If multiple IDs have the same minNum then the code may mix the blocks as if they were the same ID.
A fix, if that's an issue with your data:

data.sort((a, b) => minNumById[a.id] - minNumById[b.id] || a.id.localeCompare(b.id) || a.num - b.num);

This sorts them by minNum, then by id, and then by num.
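
To see what the tie-breaker changes, here is a small sketch with made-up data in which the ids "A" and "B" share the same smallest num. The sample records and the helper names buildMinNumById and isGrouped are hypothetical, introduced only for this illustration:

```javascript
// Hypothetical sample: ids "A" and "B" both have 1 as their smallest num.
const sample = [
  { id: 'B', num: 1 },
  { id: 'A', num: 1 },
  { id: 'B', num: 3 },
  { id: 'A', num: 2 },
];

// Smallest num per id (a Map avoids clashes with Object.prototype keys).
const buildMinNumById = (items) => {
  const minNumById = new Map();
  for (const { id, num } of items) {
    const current = minNumById.get(id);
    if (current === undefined || num < current) minNumById.set(id, num);
  }
  return minNumById;
};

// True when every id occupies one contiguous run of the array.
const isGrouped = (items) => {
  const seen = new Set();
  let previousId;
  for (const { id } of items) {
    if (id !== previousId) {
      if (seen.has(id)) return false;
      seen.add(id);
      previousId = id;
    }
  }
  return true;
};

const minNumById = buildMinNumById(sample);

// Without the id tie-breaker, the two blocks interleave.
const withoutTieBreak = [...sample].sort((a, b) =>
  minNumById.get(a.id) - minNumById.get(b.id) || a.num - b.num);

// With the id tie-breaker, each id stays in one block.
const withTieBreak = [...sample].sort((a, b) =>
  minNumById.get(a.id) - minNumById.get(b.id) ||
  a.id.localeCompare(b.id) ||
  a.num - b.num);

console.log(isGrouped(withoutTieBreak)); // false
console.log(isGrouped(withTieBreak));    // true
console.log(withTieBreak.map(({ id, num }) => id + num).join(' ')); // A1 A2 B1 B3
```

The copies made with the spread operator keep the sample array untouched, so both comparators run against the same input.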

But back to the reason for the update:

"What would be the reasons to choose one over the other?"

Technically, the first approach potentially generates a lot of intermediate objects (especially with many IDs and few entries per ID), but the sorting overall may be faster because it sorts smaller lists.
The second approach, in turn, should be less wasteful of memory.

But neither should be significant on regular devices, not until the lists get massive; you'd have to test with your concrete data whether there is any reason to optimize here.
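
If you do want to measure it, a rough sketch along these lines could be a starting point. Everything here is an assumption for illustration: the function names, the generated test data (15,000 records over 500 ids), and the use of Date.now() for coarse timing. Replace the generated data with your real records before drawing any conclusions.

```javascript
// Approach 1: group by id, sort each group, then order the groups by their smallest num.
const groupThenSort = (items) => {
  const groups = new Map();
  for (const item of items) {
    if (!groups.has(item.id)) groups.set(item.id, []);
    groups.get(item.id).push(item);
  }
  return [...groups.values()]
    .map((group) => group.sort((a, b) => a.num - b.num))
    .sort((a, b) => a[0].num - b[0].num)
    .flat();
};

// Approach 2: find the smallest num per id, then sort the whole list once.
const sortByMinNum = (items) => {
  const minNumById = new Map();
  for (const { id, num } of items) {
    const current = minNumById.get(id);
    if (current === undefined || num < current) minNumById.set(id, num);
  }
  return [...items].sort((a, b) =>
    minNumById.get(a.id) - minNumById.get(b.id) ||
    a.id.localeCompare(b.id) ||
    a.num - b.num);
};

// Made-up test data: `count` records spread over `idCount` ids.
// Multiplying by 7919 (a prime that does not divide count) scrambles the
// order while keeping every num unique, so both approaches must agree.
const makeData = (count, idCount) =>
  Array.from({ length: count }, (_, i) => ({
    id: 'ID' + (i % idCount),
    num: (i * 7919) % count,
  }));

// Rough wall-clock timing in milliseconds; results vary between runs and engines.
const timeIt = (label, fn) => {
  const start = Date.now();
  const result = fn();
  console.log(label + ': ' + (Date.now() - start) + ' ms');
  return result;
};

const testData = makeData(15000, 500);
const resultA = timeIt('group, then sort', () => groupThenSort(testData));
const resultB = timeIt('sort by min num', () => sortByMinNum(testData));
```

A single run like this is noisy; repeating each measurement several times and comparing the spread gives a more trustworthy picture.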

More important: You, as the dev working with the code, should be comfortable with it. Unless there is an actual performance bottleneck here, you should choose the approach that you feel more comfortable with, and that is easier for you to understand and scan through.
Shaving off a few microseconds versus creating a bug because you don't understand the code you use, plus the time you need to debug and fix it: which is more significant?

Here's what I came up with. You'll need to first group by id and store the grouped ids to an array. Then, sort by num asc and take into account any grouped ids:

EDIT: Fixed asc ordering for grouped ids

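var data = [
  {"id":"ABC","num":111}, {"id":"DEF","num":130}, {"id":"XYZ","num":115}, {"id":"QRS","num":98},
  {"id":"DEF","num":119}, {"id":"ABC","num":137}, {"id":"LMN","num":122}, {"id":"ABC","num":108}
];

// Caution: neither comparator below defines a consistent ordering (the first
// one returns 1 for any two different ids, whichever way round they are
// passed), so the result depends on the engine's sort algorithm.
const sortArray = arr => {
  let matchingIds = [];

  const sorted = arr.sort((a, b) => {
    // First pass: record ids that occur more than once.
    if (a.id === b.id) {
      matchingIds.push(a.id);
      return 0;
    } else {
      return 1;
    }
  }).sort((a, b) => {
    // Second pass: order by num, leaving recorded ids where they are.
    if (matchingIds.indexOf(a.id) > -1 && matchingIds.indexOf(b.id) > -1 && a.id === b.id) {
      return a.num - b.num;
    }
    if (matchingIds.indexOf(a.id) > -1 || matchingIds.indexOf(b.id) > -1) {
      return 0;
    }
    return a.num - b.num;
  });

  console.log(sorted);
}

sortArray(data);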

  • First, I created a map.
  • The map has each id as a key and all of its num values in an array.
  • Then I sorted the array for each key of the map.
  • Next, I collected these into a new collection of objects and sorted that collection by comparing only the first element of each array.
  • Finally, I looped over the new collection and pushed the items into a result array.

var collection = [
  { "id": "ABC", "num": 111 },
  { "id": "DEF", "num": 130 },
  { "id": "XYZ", "num": 115 },
  { "id": "QRS", "num": 98 },
  { "id": "DEF", "num": 119 },
  { "id": "ABC", "num": 137 },
  { "id": "LMN", "num": 122 },
  { "id": "ABC", "num": 108 }
];

var map = {};
for (var i = 0; i < collection.length; ++i) {
  if (map[collection[i].id] === undefined) {
    map[collection[i].id] = [];
  }
  map[collection[i].id].push(collection[i].num);
}

var new_collection = [];
for (var key in map) {
  map[key].sort(function(a, b) {
    return a - b;
  });
  var new_obj = {};
  new_obj[key] = map[key];
  new_collection.push(new_obj);
}

new_collection.sort(function(a, b) {
  var key1 = Object.keys(a)[0];
  var key2 = Object.keys(b)[0];
  return a[key1][0] - b[key2][0];
});

var result = [];
for (var i = 0; i < new_collection.length; ++i) {
  var curr_obj = new_collection[i];
  var curr_key = Object.keys(curr_obj)[0];
  for (var j = 0; j < curr_obj[curr_key].length; ++j) {
    var new_obj = {};
    new_obj['id'] = curr_key;
    new_obj['num'] = curr_obj[curr_key][j];
    result.push(new_obj);
  }
}

console.log(result);

I am a big fan of the functional programming library Ramda. (Disclaimer: I'm one of its authors.) I tend to think in terms of simple, reusable functions.

When I think of how to solve this problem, I think of it through a Ramda viewpoint. And I would probably solve this problem like this:

const {pipe, groupBy, prop, map, sortBy, values, head, unnest} = R;

const transform = pipe(
  groupBy(prop('id')),
  map(sortBy(prop('num'))),
  values,
  sortBy(pipe(head, prop('num'))),
  unnest
)

const data = [
  {"id": "ABC", "num": 111}, {"id": "DEF", "num": 130}, {"id": "XYZ", "num": 115}, {"id": "QRS", "num": 98},
  {"id": "DEF", "num": 119}, {"id": "ABC", "num": 137}, {"id": "LMN", "num": 122}, {"id": "ABC", "num": 108}
]

console.log(transform(data))
 <script src="//cdnjs.cloudflare.com/ajax/libs/ramda/0.25.0/ramda.js"></script>

I think that is fairly readable, at least once you understand that pipe creates a pipeline of functions, each handing its result to the next one.
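
As a rough illustration of that idea, here is a minimal stand-in for pipe in plain JavaScript. It is a simplified sketch rather than Ramda's implementation (Ramda's pipe, for one, lets the first function take several arguments), and addOne and double are made-up helpers:

```javascript
// Simplified stand-in for a pipe function: feeds the argument through each
// function from left to right, passing every result on to the next function.
const pipe = (...fns) => (arg) => fns.reduce((acc, fn) => fn(acc), arg);

// Made-up helpers for the demonstration.
const addOne = (x) => x + 1;
const double = (x) => x * 2;

const addOneThenDouble = pipe(addOne, double);

console.log(addOneThenDouble(3)); // 8, i.e. double(addOne(3))
```

Reading the arguments of pipe from top to bottom therefore gives the order in which the steps run.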

Now, there is often not a reason to include a large library like Ramda to solve a fairly simple problem. But all the functions used in that version are easily reusable. So it might make sense to try to create your own versions of these functions and keep them available to the rest of your application. In fact, that's how libraries like Ramda actually get built.

So here is a version that has simple implementations of those functions, ones you might place in a utility library:

const groupBy = (fn) => (arr) =>
  arr.reduce((acc, val) => (((acc[fn(val)] || (acc[fn(val)] = [])).push(val)), acc), {})
const head = (arr) => arr[0]
const mapObj = (fn) => (obj) =>
  Object.keys(obj).reduce((acc, val) => (acc[val] = fn(obj[val]), acc), {})
const pipe = (...fns) => (arg) => fns.reduce((a, f) => f(a), arg)
const prop = (name) => (obj) => obj[name]
const values = Object.values
const unnest = (arr) => [].concat(...arr)
const sortBy = (fn) => (arr) => arr.slice(0).sort((a, b) => {
  const aa = fn(a), bb = fn(b)
  return aa < bb ? -1 : aa > bb ? 1 : 0
})

const transform = pipe(
  groupBy(prop('id')),
  mapObj(sortBy(prop('num'))),
  values,
  sortBy(pipe(head, prop('num'))),
  unnest
)

const data = [
  {"id": "ABC", "num": 111}, {"id": "DEF", "num": 130}, {"id": "XYZ", "num": 115}, {"id": "QRS", "num": 98},
  {"id": "DEF", "num": 119}, {"id": "ABC", "num": 137}, {"id": "LMN", "num": 122}, {"id": "ABC", "num": 108}
]

console.log(transform(data))


 