简体   繁体   中英

How to split the String based on array elements into array retaining the array the split word in javascript

I have a string

sReasons =  "O9C2700021Not eligible for SDWCEOC3900015Service upgradeHJC3900015Service upgradeJ8C5000016Delivery Attempt";

and I need to split the above string based on the separator array

const separator = ["O9", "EO", "HJ", "J8"];

Where first 2 characters(O9) represnet code, next 4 another code(C270) & next 4 the character(0021) length of the String which is Not eligible for SDWC

Where the separator codes are unique, with 2 capital letters and will not be repeated in textMessage except inEligType

I need to create a json of the format

{
    {inEligType: "O9", msgCode: "C270", msgLen: "0021", textMsg: "Not eligible for SDWC"},
    {inEligType: "EO", msgCode: "C390", msgLen: "0015", textMsg: "Service upgrade"},
    {inEligType: "HJ", msgCode: "C390", msgLen: "0015", textMsg: "Service upgrade"},
    {inEligType: "J8", msgCode: "C500", msgLen: "0016", textMsg: "Delivery Attempt"}
}

I'm basically failing at the splitting the string itself based on the array given, I tried the following

sReasons =  "O9C2700021Not eligible for SDWCEOC0900015Service upgradeHJC3900015Service upgradeJ8C5HJ0016Delivery Attempt";    
const separator = ["O9", "EO", "HJ", "J8"];

function formatReasons(Reasons: string) {
var words: any[] = Reasons.split(this.spearator); 
for(let word in words)
    {
       console.log(word) ;
    }
}
var result = formatReasons(sHdnReasonsCreate);
console.log("Returned Result: "+result);

But it gives me result as

["O9C2700021Not eligible for SDWCEOC0900015Service upgradeHJC3900015Service upgradeJ8C5HJ0016Delivery Attempt"]length: 1__proto__: Array(0)

Returned Address is: undefined

My Regex-based approach:

sReasons =  "O9C2700021Not eligible for SDWCEOC0900015Service upgradeHJC3900015Service upgradeJ8C5HJ0016Delivery Attempt";    
const separator = ["O9", "EO", "HJ", "J8"];

// build the regex based on separators
let regexPattern = '^';
separator.forEach(text => {
    regexPattern += `${text}(.*)`;
});
regexPattern += '$';

// match the reasons
let r = new RegExp(regexPattern);
let matches = sReasons.match(r);

// prepare to match each message
let msgMatcher = new RegExp('^(?<msgCode>.{4})(?<msgLen>.{4})(?<textMsg>.*)$');
let output = [];

for (let i=1; i<matches.length; i++) {
    // match the message
    const msg = matches[i].match(msgMatcher);

    // store
    let item = msg.groups;
    item.inEligType = separator[i-1];
    output.push(item);
}

console.log(JSON.stringify(output, null, 2));

Produces

[
  {
    "msgCode": "C270",
    "msgLen": "0021",
    "textMsg": "Not eligible for SDWC",
    "inEligType": "O9"
  },
  {
    "msgCode": "C090",
    "msgLen": "0015",
    "textMsg": "Service upgrade",
    "inEligType": "EO"
  },
  {
    "msgCode": "C390",
    "msgLen": "0015",
    "textMsg": "Service upgrade",
    "inEligType": "HJ"
  },
  {
    "msgCode": "C5HJ",
    "msgLen": "0016",
    "textMsg": "Delivery Attempt",
    "inEligType": "J8"
  }
]

It may well be that textMsg field, nor any other field, will never contain the two-letter strings you are using for the inEligType field. But are you absolutely sure of that? The data format looks to me like it really wants someone to parse it by substrings of certain lengths; why even have a msgLen field if you could just split based on delimiters? What if the list of inEligType codes changes in the future?

For these reasons I strongly recommend that you parse by substring lengths and not by delimiter matching. Here's one possible way to do that:

function formatReasons(reasons: string) {
  const ret = []
  while (reasons) {
    const inEligType = reasons.substring(0, 2);
    reasons = reasons.substring(2);
    const msgCode = reasons.substring(0, 4);
    reasons = reasons.substring(4);
    const msgLen = reasons.substring(0, 4);
    reasons = reasons.substring(4);
    const textMsg = reasons.substring(0, +msgLen);
    reasons = reasons.substring(+msgLen);
    ret.push({ inEligType, msgCode, msgLen, textMsg });
  }
  return ret;
}

You can verify that it produces the expected output for your example sReasons string:

const formattedReasons = formatReasons(sReasons);
console.log(JSON.stringify(formattedReasons, undefined, 2));
/* [
  {
    "inEligType": "O9",
    "msgCode": "C270",
    "msgLen": "0021",
    "textMsg": "Not eligible for SDWC"
  },
  {
    "inEligType": "EO",
    "msgCode": "C090",
    "msgLen": "0015",
    "textMsg": "Service upgrade"
  },
  {
    "inEligType": "HJ",
    "msgCode": "C390",
    "msgLen": "0015",
    "textMsg": "Service upgrade"
  },
  {
    "inEligType": "J8",
    "msgCode": "C5HJ",
    "msgLen": "0016",
    "textMsg": "Delivery Attempt"
  }
] */

Note that the implementation above does not check that the string is properly formatted; right now if you pass garbage in, you get garbage out. If you want more safety you could do runtime checks and throw errors if you, say, run off the end of the reasons string unexpectedly, or find a msgLen field that doesn't represent a number. And one could refactor so that there's no repetition of code like const s = reasons.substring(0, n); reasons = reasons.substring(n) const s = reasons.substring(0, n); reasons = reasons.substring(n) . But the basic algorithm is there.

Playground link to code

Another option with RegExp with less code

 // Your data const data = "O9C2700021Not eligible for SDWCEOC3900015Service upgradeHJC3900015Service upgradeJ8C5000016Delivery Attempt"; // Set your data splitters from array const spl = ["O9", "EO", "HJ", "J8"].join('|'); // Use regexp to parse data const results = []; data.replace(new RegExp(`(${spl})(\\w{4})(\\w{4})(.*?)(?=${spl}|$)`, 'g'), (m,a,b,c,d) => { // Form objects and push to res results.push({ inEligType: a, msgCode: b, msgLen: c, textMsg: d }); }); // Result console.log(results);

A first approach, based on a groups capturing regex consumed by split , processed by a helper function and finally reduce d to the expected result...

 function chunkRight(arr, chunkLength) { const list = []; arr = [...arr]; while (arr.length >= chunkLength) { list.unshift( arr.splice(-chunkLength) ); } return list; } // see also... [https://regex101.com/r/tatBAB/1] // with eg // (?<inEligType>O9|EO|HJ|J8)(?<msgCode>\w{4})(?<msgLen>\d{4}) //... or... // (O9|EO|HJ|J8)(\w{4})(\d{4}) // function extractStatusItems(str, separators) { const regXSplit = RegExp(`(${ separators.join('|') })(\\w{4})(\\d{4})`); const statusValues = String(str).split(regXSplit).slice(1); const groupedValues = chunkRight(statusValues, 4); return groupedValues.reduce((list, [inEligType, msgCode, msgLen, textMsg]) => list.concat({ inEligType, msgCode, msgLen, textMsg }), [] ); } const statusCode = 'O9C2700021Not eligible for SDWCEOC3900015Service upgradeHJC3900015Service upgradeJ8C5000016Delivery Attempt'; console.log( `statusCode... ${ statusCode }...`, extractStatusItems(statusCode, ['O9', 'EO', 'HJ', 'J8']) );
 .as-console-wrapper { min-height: 100%;important: top; 0; }

... followed by a second approach, based almost entirely on a regex which captures named groups , consumed by matchAll and finally map ped into the expected result...

 // see also... [https://regex101.com/r/tatBAB/2] // with eg // (?<inEligType>O9|EO|HJ|J8)(?<msgCode>\w{4})(?<msgLen>\d{4})(.*?)(?<textMsg>.*?)(?=O9|EO|HJ|J8|$) // function extractStatusItems(str, separators) { separators = separators.join('|'); const regXCaptureValues = RegExp( `(?<inEligType>${ separators })(?<msgCode>\\w{4})(?<msgLen>\\d{4})(.*?)(?<textMsg>.*?)(?=${ separators }|$)`, 'g' ); return [...String(str).matchAll(regXCaptureValues) ].map( ({ groups }) => ({...groups }) ); } const statusCode = 'O9C2700021Not eligible for SDWCEOC3900015Service upgradeHJC3900015Service upgradeJ8C5000016Delivery Attempt'; console.log( `statusCode... ${ statusCode }...`, extractStatusItems(statusCode, ['O9', 'EO', 'HJ', 'J8']) );
 .as-console-wrapper { min-height: 100%;important: top; 0; }

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM