Algorithm for finding optimal multiple partitionings based on some costs

Question

I have a situation in which I need to find optimal split positions in an array based on some costs. The problem goes like this:

As input I have an array of events ordered by an integer timestamp and as output I want an array of indexes which split the input array into many parts. The output array needs to be optimal (more on this below).

struct e {
    int Time;
    // other values
}

Example Input:  [e0, e1, e2, e3, e4, e5, ..., e10]
Example output: [0, 2, 6, 8] (the 0 at the start is always there)

Using the above examples I can use the split indices to partition the original array into 5 subarrays like so:

[ [], [e0, e1], [e2, e3, e4, e5], [e6, e7], [e8, e9, e10] ]

The cost of this example solution is the total cost of "distances" between the subarrays:

double distance(e[] arr1, e[] arr2) {
    // return distance from arr1 to arr2, order matters so non-euclidean
}

total cost = distance([], [e0, e1]) + distance([e0, e1], [e2, e3, e4, e5]) + ...

At this point it is helpful to understand the actual problem.

The input array represents musical notes at some time (ie a MIDI file) and I want to split the MIDI file into optimal guitar fingerings. Hence each subarray of notes represents a chord (or a melody grouped together in a single fingering). The distance between two subarrays represents the difficulty of moving from one fingering pattern to another. The goal is to find the easiest (optimal) way to play a song on a guitar.

I have not yet proved it but to me this looks like an NP-Complete or NP-Hard problem. Therefore it could be helpful if I could reduce this to another known problem and use a known divide and conquer algorithm. Also, one could solve this with a more traditional search algorithm (A* ?). It could be efficient because we can filter out bad solutions much faster than in a regular graph (because the input is technically a complete graph since each fingering can be reached from any other fingering).

I'm not able to decide what the best approach would be so I am currently stuck. Any tips or ideas would be appreciated.

Answer 1

It's probably not NP-hard.

Form a graph whose nodes correspond one-to-one to (contiguous) subarrays. For each pair of nodes u, v where u's right boundary is v's left, add an arc from u to v whose length is determined by distance() . Create an artificial source with an outgoing arc to each node whose left boundary is the beginning. Create an artificial sink with an incoming arc from each node whose right boundary is the end.

Now we can find a shortest path from the source to the sink via the linear-time (in the size of the graph, so cubic in the parameter of interest) algorithm for directed acyclic graphs.

Answer 2

This is a bit late but I did solve this problem. I ended up using a slightly modified version of Dijkstra for this but any pathfinding algo could work. I tried A* as well but finding a good heuristic proved to be extremely difficult because of the non-euclidean nature of the problem.

The main changes to Dijkstra are that at some point I can already tell that some unvisited nodes cannot provide an optimal result. This speeds up the algorithm by a lot which is also one of the reasons I didn't opt for A*.

The algorithm essentially works like this:

search()
  visited = set()
  costs = map<node, double>()
  
  // add initial node to costs

  while costs is not empty:
    node = node with minimum cost in costs

    if current.Index == songEnd:
      // backtrack from current to get fingering for the song
      return solution

    visited.add(node)

    foreach neighbour of node:
      if visited node:
        continue
      
      newCost = costs[node] + distance(node, neighbour)
      add neighbour with newCost to costs

    
  // we can remove nodes that have a higher cost but
  // which have a lower index than our current node
  // this is because every fingering position is reachable
  // from any fingering positions
  // therefore a higher cost node which is not as far as our current node
  // cannot provide an optimal solution
  remove unoptimal nodes from costs

  remove node from costs

// if costs ends up empty then it is impossible to play this song
// on guitar (e.g. more than 6 notes played at the same time)

The magic of this algorithm happens in fetching the neighbours and calculating the distance between two nodes but those are irrelevant for this question.

Algorithm for finding optimal multiple partitionings based on some costs

Question

2 answers

solution1
2 2021-10-01 13:29:05

solution2
0 ACCPTED 2021-11-15 16:36:14

Algorithm for finding optimal multiple partitionings based on some costs

Question

2 answers

solution1 2 2021-10-01 13:29:05

solution2 0 ACCPTED 2021-11-15 16:36:14

solution1
2 2021-10-01 13:29:05

solution2
0 ACCPTED 2021-11-15 16:36:14