简体   繁体   中英

Find optimal set of substrings in given string

I'm trying to find optimal set of strings for given string.

Given string: "FEEJEEDAI"

Substrings values:

FE - 1
JE - 2
JEE - 3
AI - 4
DAI - 6

Possible combinations:

1) [FE-JE-DAI] - 1+2+6 = 9
2) [FE-JEE-DAI] - 1+3+6 = 10
3) [FE-JE-AI] - 1+3+4 = 8

OPTIMAL COMBINATION - 2) [FE-JEE-DAI] scores 10

I think it should go something like this:

1) check if string contain particular substring:

var string = "FEEJEEDAI", substring = "JE"; string.indexOf(substring) !== -1;

2) If true find it's index

var subStringIndex = string.indexOf(substring)

3) Create new tempString to build combination and 'cut off' substring from string

var tempString = string.slice(subStringIndex, substring.length)

4) Iterate through string and find optimal tempString

I don't know how to build it into loop and and handle situation JE vs JEE, AI vs DAI

Basically, you could use an iterative and recursive approach for getting all possible substrings of the string.

This solution is splitted into 3 parts

  1. Preparation
  2. Collecting parts
  3. Calculating score and create result set

Preparation

At start, all substrings of the string are collected in the indices object. The key is the index and the value is an object with a limit, which is the smallest length of the strings in the pattern array. The pattern array contains the index and the found substrings beginning at that index.

indices object from the first example

 { 0: { limit: 2, pattern: [ { index: 0, string: "FE" } ] }, 3: { limit: 2, pattern: [ { index: 3, string: "JE" }, { index: 3, string: "JEE" } ] }, /* ... */ } 

Collecting parts

The main idea is to start at index zero with an empty array for collecting substrings.

To check, which parts are together in a group, you need to get the first substring at a given index or the next close one, then take the limit property, which is the length of the shortest substring, add the index and take it as the maximum index for searching group members.

From the second example the first group consists of 'FE' , 'EE' and 'EEJ'

 string comment ---------- ------------------------------------- 01 2345678 indices FE|EJEEDAI FE| matching pattern FE at position 0 E|E matching pattern EE at position 1 E|EJ matching pattern EEJ at position 1 ^^ all starting substrings are in the same group 

With that group, a new recursion is invoked, with an adjusted index and with the substring concatinated to the parts array.

Calculating score and create result set

If no more substring are found, the parts are joined and the score is calculated and pushed to the result set.

Interpreting the result

  [ { parts: "0|FE|3|JE|6|DAI", score: 9 }, /* ... */ ] 

parts are a combination of indices and matching strings at the position

  0|FE|3|JE|6|DAI ^ ^^ at index 0 found FE ^ ^^ at index 3 found JE ^ ^^^ at index 6 found DAI 

score is calculated with the given weights of the substrings

 substring weight --------- ------ FE 1 JE 2 DAI 6 --------- ------ score 9 

The example three returns 11 unique combinations.

 function getParts(string, weights) { function collectParts(index, parts) { var group, limit; while (index < string.length && !indices[index]) { index++; } if (indices[index]) { group = indices[index].pattern; limit = index + indices[index].limit; while (++index < limit) { if (indices[index]) { group = group.concat(indices[index].pattern); } } group.forEach(function (o) { collectParts(o.index + o.string.length, parts.concat(o.index, o.string)); }); return; } result.push({ parts: parts.join('|'), score: parts.reduce(function (score, part) { return score + (weights[part] || 0); }, 0) }); } var indices = {}, pattern, result = []; Object.keys(weights).forEach(function (k) { var p = string.indexOf(k); while (p !== -1) { pattern = { index: p, string: k }; if (indices[p]) { indices[p].pattern.push(pattern); if (indices[p].limit > k.length) { indices[p].limit = k.length; } } else { indices[p] = { limit: k.length, pattern: [pattern] }; } p = string.indexOf(k, p + 1); } }); collectParts(0, []); return result; } console.log(getParts("FEEJEEDAI", { FE: 1, JE: 2, JEE: 3, AI: 4, DAI: 6 })); console.log(getParts("FEEJEEDAI", { FE: 1, JE: 2, JEE: 3, AI: 4, DAI: 6, EEJ: 5, EJE: 3, EE: 1 })); console.log(getParts("EEEEEE", { EE: 2, EEE: 3 })); 
 .as-console-wrapper { max-height: 100% !important; top: 0; } 

If you're slicing out the substring when you find them, since certain substrings are substrings of other substrings, search for the biggest ones first. For example, if you didn't find DAI, and you find AI, it can't be a part of DAI. You want to test for each substring, so you can put each substring into an array and loop through the array.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM