简体   繁体   中英

How do I get the top n elements per group within a sorted list with Ramda JS

I'm fairly new to RamdaJS and functional programming, so I was hoping to understand if there's a more efficient way to achieve the following.

Suppose I have a list of footballers. Each footballer in the list has properties for their name, their team, and their monthly pay.

const footballers = [
  {
    name: "Mesut Ozil",
    team: "Arsenal",
    pay: 1400000
  },
  {
    name: "Lionel Messi",
    team: "Barcelona",
    pay: 7300000
  },
  {
    name: "Kylian Mbappe",
    team: "PSG",
    pay: 1500000
  },
  {
    name: "Alexis Sanchez",
    team: "Manchester United",
    pay: 1990000
  },
  {
    name: "Philippe Coutinho",
    team: "Barcelona",
    pay: 2000000
  },
  {
    name: "Neymar",
    team: "PSG",
    pay: 2700000
  },
  {
    name: "Luis Suarez",
    team: "Barcelona",
    pay: 2500000
  },
  {
    name: "Antoine Griezmann",
    team: "Atletico Madrid",
    pay: 2900000
  },
  {
    name: "Gareth Bale",
    team: "Real Madrid",
    pay: 2200000
  },
  {
    name: "Cristiano Ronaldo",
    team: "Juventus",
    pay: 4100000
  }
]

I wish to order this list by pay, but restrict each team to a maximum of only two players in the final list. Currently my solution has to sort list twice. At the moment it sorts at the beginning and then the final list, but it could also sort after grouping by team. I believe there's a smarter (but still readable) solution that only requires the one sort, but I'm not sure what it is.

const byPayAsc = ascend(prop('pay')) // spare, for checking
const byPayDes = descend(prop('pay'))
const sortByPay = sort(byPayDes)

const groupedByTeam = compose(values, groupBy(prop('team')))
const topTwoPlayers = map(take(2))

const topTwoHighestPaidPlayersPerTeam = pipe( 
  sortByPay,
  groupedByTeam, 
  topTwoPlayers, 
  flatten, 
  sortByPay
)

topTwoHighestPaidPlayersPerTeam(footballers)

My current investigation has unearthed a few options:

  • I've heard about transducers, which I think might provide the benefit of only looping through the list once.
  • I also wondered if it was possible I could use my sorting function to merge the grouped lists?
  • Or I could write a reducer to loop through the list of lists and take the top 2 per team.

What is the idiomatic way of doing this with Ramda JS.

A single-sort solution

An approach which actually does avoid a second sort is certainly possible. Here's one version which does so:

 const topTwoHighestPaidPlayersPerTeam = pipe ( sort (descend (prop ('pay'))), reduce ( ({result, counts}, player, {team} = player) => ({ result: counts [team] >= 2 ? result : [...result, player], counts: {...counts, [team]: (counts[team] || 0) + 1} }), {result: [], counts: {}} ), prop ('result') ) const footballers = [{"name":"Mesut Ozil","team":"Arsenal","pay":1400000},{"name":"Lionel Messi","team":"Barcelona","pay":7300000},{"name":"Kylian Mbappe","team":"PSG","pay":1500000},{"name":"Alexis Sanchez","team":"Manchester United","pay":1990000},{"name":"Philippe Coutinho","team":"Barcelona","pay":2000000},{"name":"Neymar","team":"PSG","pay":2700000},{"name":"Luis Suarez","team":"Barcelona","pay":2500000},{"name":"Antoine Griezmann","team":"Atletico Madrid","pay":2900000},{"name":"Gareth Bale","team":"Real Madrid","pay":2200000},{"name":"Cristiano Ronaldo","team":"Juventus","pay":4100000}] const result = topTwoHighestPaidPlayersPerTeam(footballers) console .log (result)
 <script src="//cdnjs.cloudflare.com/ajax/libs/ramda/0.26.1/ramda.min.js"></script> <script> const {pipe, sort, descend, prop, reduce} = R </script>

We start with a standard sort by pay, and then iterate through them, keeping track of per-team counts alongside the results, adding the team whenever the count is less than our maximum of 2. We track this by accumulating an object with both the results and the counts. At the end, we just extract the results property. While this could be done with a filter instead of reduce , the function passed to filter would need to maintain the counts state, which seems like a bad fit for filter .

The slightly more obvious version of

      result: counts [team] < 2 ? [...result, player] : result, 

won't work because initially the team counts are undefined, and undefined < 2 yields false . We could choose this version instead if it seems more clear:

      result: (counts [team] || 0) < 2 ? [...result, player] : result, 

Why we probably shouldn't use this solution

This code does avoid a second sort. But it does so at a cost in readability, and, probably, for reasonable-sized collections, in actual performance.

A two-sort solution is still O (n log (n)) ; doing something twice does not change the gross performance metric. So this version is not asymptotically faster than a two-sort one. But the code is not nearly as readable as that from either Scott Christopher or Ori Drori. Unless we've measured and can point to specific bottlenecks, adding complexity to our code for performance reasons seems a total waste.

So I would recommend the solution from Ori Drori over this. And Scott Christopher also has an interesting approach as well.

But this sort of technique is still often useful for maintaining some additional state while folding a list of values.

Your idea of sorting again is valid, since grouping the players by team changes the initial sort.

If you want to skip a second sort, you'll need to keep the original index of each item, and then sort by the index anyway. so I'm not sure it's worth it.

However, for the sake of checking the idea, this snippet converts the array items to pairs of [index, player] after sorting, but before grouping by the player's team. When you flatten the groups back to an array using R.values, and R.chain (with R.take), you convert the pairs back at an object using R.fromPairs. Since ES6 traversal of integer object keys is ascending, the original order is restored, and now you get the array via calling R.values again.

 const { pipe, sort, descend, prop, toPairs, groupBy, path, values, chain, take, fromPairs } = R const topTwoHighestPaidPlayersPerTeam = pipe( sort(descend(prop('pay'))), // sort descending by pay toPairs, // convert to pairs if [index, player object] groupBy(path([1, 'team'])), // group by the team values, // get an array of an arrays chain(take(2)), // flatten and take the 1st two items of each group fromPairs, // convert to an object of objects with original index as key values // convert to an array in the correct order ) const footballers = [{"name":"Mesut Ozil","team":"Arsenal","pay":1400000},{"name":"Lionel Messi","team":"Barcelona","pay":7300000},{"name":"Kylian Mbappe","team":"PSG","pay":1500000},{"name":"Alexis Sanchez","team":"Manchester United","pay":1990000},{"name":"Philippe Coutinho","team":"Barcelona","pay":2000000},{"name":"Neymar","team":"PSG","pay":2700000},{"name":"Luis Suarez","team":"Barcelona","pay":2500000},{"name":"Antoine Griezmann","team":"Atletico Madrid","pay":2900000},{"name":"Gareth Bale","team":"Real Madrid","pay":2200000},{"name":"Cristiano Ronaldo","team":"Juventus","pay":4100000}] const result = topTwoHighestPaidPlayersPerTeam(footballers) console.log(result)
 <script src="https://cdnjs.cloudflare.com/ajax/libs/ramda/0.26.1/ramda.js"></script>

What you have here is close to a merge sort , with the initial sorting performed per team and then merging the sorted lists of players.

 const { descend, prop, take, sort, reduceBy, pipe, values, reduce } = R const sorter = descend(prop('pay')) // this is used by `R.reduceBy` below to build up the sorted list with a max size const addToGroup = (size, compare) => (group, a) => take(size, sort(sorter, [a, ...group])) const sortByTeam = reduceBy(addToGroup(2, sorter), [], prop('team')) // recursively merges two sorted(!) lists by the given comparator const mergeListsBy = compare => { const go = (xs, ys) => xs.length == 0 ? ys : ys.length == 0 ? xs : compare(xs[0], ys[0]) < 0 ? [xs[0], ...go(xs.slice(1), ys)] : [ys[0], ...go(xs, ys.slice(1))] return go } const run = pipe(sortByTeam, values, reduce(mergeListsBy(sorter), [])) //// const footballers = [{"name": "Mesut Ozil", "pay": 1400000, "team": "Arsenal"}, {"name": "Lionel Messi", "pay": 7300000, "team": "Barcelona"}, {"name": "Kylian Mbappe", "pay": 1500000, "team": "PSG"}, {"name": "Alexis Sanchez", "pay": 1990000, "team": "Manchester United"}, {"name": "Philippe Coutinho", "pay": 2000000, "team": "Barcelona"}, {"name": "Neymar", "pay": 2700000, "team": "PSG"}, {"name": "Luis Suarez", "pay": 2500000, "team": "Barcelona"}, {"name": "Antoine Griezmann", "pay": 2900000, "team": "Atletico Madrid"}, {"name": "Gareth Bale", "pay": 2200000, "team": "Real Madrid"}, {"name": "Cristiano Ronaldo", "pay": 4100000, "team": "Juventus"}] console.log(run(footballers))
 <script src="//cdnjs.cloudflare.com/ajax/libs/ramda/0.26.1/ramda.min.js"></script>

For sake of simplicity, I'd just write a custom filter function...

 const playersPerTeam = (n) => { const teams = {}; const predicate = (({ team }) => { teams[team] = (teams[team] || 0) + 1; return teams[team] <= n; }); return R.filter(predicate); }; const fn = R.pipe( R.sort(R.descend(R.prop('pay'))), playersPerTeam(2), ); // ------ const data = [ { name: "Mesut Ozil", team: "Arsenal", pay: 1400000 }, { name: "Lionel Messi", team: "Barcelona", pay: 7300000 }, { name: "Kylian Mbappe", team: "PSG", pay: 1500000 }, { name: "Alexis Sanchez", team: "Manchester United", pay: 1990000 }, { name: "Philippe Coutinho", team: "Barcelona", pay: 2000000 }, { name: "Neymar", team: "PSG", pay: 2700000 }, { name: "Luis Suarez", team: "Barcelona", pay: 2500000 }, { name: "Antoine Griezmann", team: "Atletico Madrid", pay: 2900000 }, { name: "Gareth Bale", team: "Real Madrid", pay: 2200000 }, { name: "Cristiano Ronaldo", team: "Juventus", pay: 4100000 }, ]; console.log( fn(data), );
 <script src="https://cdnjs.cloudflare.com/ajax/libs/ramda/0.26.1/ramda.js" integrity="sha256-xB25ljGZ7K2VXnq087unEnoVhvTosWWtqXB4tAtZmHU=" crossorigin="anonymous"></script>

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM