How to split a string into an array based on array elements, retaining the split word, in JavaScript
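I have a string
sReasons = "O9C2700021Not eligible for SDWCEOC3900015Service upgradeHJC3900015Service upgradeJ8C5000016Delivery Attempt";
I need to split the string above based on an array of separators
const separator = ["O9", "EO", "HJ", "J8"];
Here the first 2 characters (O9) represent a network code, the next 4 are another code (C270), and the following 4 characters (0021) give the length of the string "Not eligible for SDWC".
The separator codes are unique, each consisting of 2 uppercase characters, and apart from inEligType they will not be repeated in textMessage.
I need to create JSON in the following format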
[
  {inEligType: "O9", msgCode: "C270", msgLen: "0021", textMsg: "Not eligible for SDWC"},
  {inEligType: "EO", msgCode: "C390", msgLen: "0015", textMsg: "Service upgrade"},
  {inEligType: "HJ", msgCode: "C390", msgLen: "0015", textMsg: "Service upgrade"},
  {inEligType: "J8", msgCode: "C500", msgLen: "0016", textMsg: "Delivery Attempt"}
]
I am basically unable to split the string itself based on the given array. I tried the following
sReasons = "O9C2700021Not eligible for SDWCEOC0900015Service upgradeHJC3900015Service upgradeJ8C5HJ0016Delivery Attempt";
const separator = ["O9", "EO", "HJ", "J8"];
function formatReasons(Reasons: string) {
  var words: any[] = Reasons.split(this.spearator);
  for (let word in words) {
    console.log(word);
  }
}
var result = formatReasons(sHdnReasonsCreate);
console.log("Returned Result: "+result);
But it gives me this result
["O9C2700021Not eligible for SDWCEOC0900015Service upgradeHJC3900015Service upgradeJ8C5HJ0016Delivery Attempt"] (length: 1)
Returned Address is: undefined
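For what it is worth, a one-element array is what split returns whenever the separator never occurs in the string. That holds for an undefined separator and also for an array, which split coerces to a comma-joined string such as "O9,EO". The undefined in the second log line is consistent with formatReasons having no return statement. The sketch below uses a shortened, hypothetical sample string (not the full input) and compares both cases with a regex alternation, which does split on the codes but drops them from the result.

```javascript
// Shortened, hypothetical sample containing two of the records.
const sample = "O9C2700021Not eligible for SDWCEOC3900015Service upgrade";

// An undefined separator yields the whole string as the only element.
const viaUndefined = sample.split(undefined);

// An array separator is coerced to the string "O9,EO", which never occurs.
const viaArray = sample.split(["O9", "EO"]);

// A regex alternation splits on either code, but the codes themselves are dropped.
const viaRegex = sample.split(/O9|EO/);

console.log(viaUndefined.length, viaArray.length, viaRegex);
```

So splitting alone is not enough here: the codes have to be captured as well, or the string has to be read by field length.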
My regex-based approach:
sReasons = "O9C2700021Not eligible for SDWCEOC0900015Service upgradeHJC3900015Service upgradeJ8C5HJ0016Delivery Attempt";
const separator = ["O9", "EO", "HJ", "J8"];
// build the regex based on separators
let regexPattern = '^';
separator.forEach(text => {
  regexPattern += `${text}(.*)`;
});
regexPattern += '$';
// match the reasons
let r = new RegExp(regexPattern);
let matches = sReasons.match(r);
// prepare to match each message
let msgMatcher = new RegExp('^(?<msgCode>.{4})(?<msgLen>.{4})(?<textMsg>.*)$');
let output = [];
for (let i = 1; i < matches.length; i++) {
  // match the message
  const msg = matches[i].match(msgMatcher);
  // store
  let item = msg.groups;
  item.inEligType = separator[i - 1];
  output.push(item);
}
console.log(JSON.stringify(output, null, 2));
This produces:
[
  {
    "msgCode": "C270",
    "msgLen": "0021",
    "textMsg": "Not eligible for SDWC",
    "inEligType": "O9"
  },
  {
    "msgCode": "C090",
    "msgLen": "0015",
    "textMsg": "Service upgrade",
    "inEligType": "EO"
  },
  {
    "msgCode": "C390",
    "msgLen": "0015",
    "textMsg": "Service upgrade",
    "inEligType": "HJ"
  },
  {
    "msgCode": "C5HJ",
    "msgLen": "0016",
    "textMsg": "Delivery Attempt",
    "inEligType": "J8"
  }
]
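It is quite possible that the textMsg field, or any other field, will never contain the two-letter strings you use for the inEligType field. But are you absolutely sure? It seems to me that the data format really wants to be parsed by substrings of specific lengths; why else would there be a msgLen field if you could just split on separators? And what happens if the list of inEligType codes changes in the future?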
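To make the concern concrete, here is a small sketch with a made-up record (the tricky string is hypothetical, not taken from your data) whose 13-character text happens to contain EO. Matching on separators cuts the message short, while slicing by msgLen recovers it in full.

```javascript
// Hypothetical record: msgLen 0013 and a 13-character text that contains "EO".
const tricky = "O9C2700013VIDEO upgrade";
const codes = ["O9", "EO", "HJ", "J8"].join("|");

// Separator matching: the text ends at the first thing that looks like a code.
const pattern = new RegExp(`(${codes})(\\w{4})(\\d{4})(.*?)(?=${codes}|$)`, "g");
const bySeparator = [...tricky.matchAll(pattern)].map((m) => m[4]);

// Length-based slicing: skip the 10 header characters, then read msgLen characters.
const msgLen = Number(tricky.substring(6, 10));
const byLength = tricky.substring(10, 10 + msgLen);

console.log({ bySeparator, byLength });
```

In this sketch the separator-based result stops at "VID", whereas the length-based result is the complete "VIDEO upgrade".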
For these reasons I strongly recommend parsing by substring length rather than by separator matching. Here is one possible approach:
function formatReasons(reasons: string) {
  const ret = []
  while (reasons) {
    const inEligType = reasons.substring(0, 2);
    reasons = reasons.substring(2);
    const msgCode = reasons.substring(0, 4);
    reasons = reasons.substring(4);
    const msgLen = reasons.substring(0, 4);
    reasons = reasons.substring(4);
    const textMsg = reasons.substring(0, +msgLen);
    reasons = reasons.substring(+msgLen);
    ret.push({ inEligType, msgCode, msgLen, textMsg });
  }
  return ret;
}
You can verify that it produces the expected output for your example sReasons string:
const formattedReasons = formatReasons(sReasons);
console.log(JSON.stringify(formattedReasons, undefined, 2));
/* [
  {
    "inEligType": "O9",
    "msgCode": "C270",
    "msgLen": "0021",
    "textMsg": "Not eligible for SDWC"
  },
  {
    "inEligType": "EO",
    "msgCode": "C090",
    "msgLen": "0015",
    "textMsg": "Service upgrade"
  },
  {
    "inEligType": "HJ",
    "msgCode": "C390",
    "msgLen": "0015",
    "textMsg": "Service upgrade"
  },
  {
    "inEligType": "J8",
    "msgCode": "C5HJ",
    "msgLen": "0016",
    "textMsg": "Delivery Attempt"
  }
] */
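Note that the implementation above does not check that the string is properly formatted; right now, if you put garbage in, you get garbage out. If you want more safety, you could add runtime checks and throw an error if, for example, you unexpectedly run off the end of the reasons string, or find a msgLen field that does not represent a number. It could also be refactored so that code like const s = reasons.substring(0, n); reasons = reasons.substring(n) is not repeated. But the basic algorithm is there.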
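As a rough sketch of those two suggestions (runtime checks plus a helper that avoids the repeated substring calls), something like the following could work. The names parseReasonsStrict and take are made up for illustration, the sketch is written in plain JavaScript, and the error messages are only placeholders.

```javascript
// Hypothetical stricter variant of the length-based parser.
function parseReasonsStrict(reasons) {
  const records = [];
  let pos = 0;

  // Read the next n characters, or throw if the input ends early.
  const take = (n, label) => {
    if (pos + n > reasons.length) {
      throw new Error(`Unexpected end of input while reading ${label} at offset ${pos}`);
    }
    const s = reasons.substring(pos, pos + n);
    pos += n;
    return s;
  };

  while (pos < reasons.length) {
    const inEligType = take(2, "inEligType");
    const msgCode = take(4, "msgCode");
    const msgLen = take(4, "msgLen");
    // msgLen must be exactly four digits before it is used as a length.
    if (!/^\d{4}$/.test(msgLen)) {
      throw new Error(`msgLen "${msgLen}" is not a 4-digit number`);
    }
    const textMsg = take(Number(msgLen), "textMsg");
    records.push({ inEligType, msgCode, msgLen, textMsg });
  }
  return records;
}

console.log(parseReasonsStrict("O9C2700021Not eligible for SDWCEOC3900015Service upgrade"));
```

Tracking a position index instead of repeatedly shortening the string is just one way to do the refactoring; the parsing logic is otherwise the same as above.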
Another option using RegExp, with less code
// Your data
const data = "O9C2700021Not eligible for SDWCEOC3900015Service upgradeHJC3900015Service upgradeJ8C5000016Delivery Attempt";
// Set your data splitters from array
const spl = ["O9", "EO", "HJ", "J8"].join('|');
// Use regexp to parse data
const results = [];
data.replace(new RegExp(`(${spl})(\\w{4})(\\w{4})(.*?)(?=${spl}|$)`, 'g'), (m, a, b, c, d) => {
  // Form objects and push to res
  results.push({ inEligType: a, msgCode: b, msgLen: c, textMsg: d });
});
// Result
console.log(results);
A first approach is based on the capture groups of a regex consumed by split, processed by a helper function, and finally reduced into the expected result...
function chunkRight(arr, chunkLength) {
  const list = [];
  arr = [...arr];
  while (arr.length >= chunkLength) {
    list.unshift(arr.splice(-chunkLength));
  }
  return list;
}

// see also... [https://regex101.com/r/tatBAB/1]
// with eg
// (?<inEligType>O9|EO|HJ|J8)(?<msgCode>\w{4})(?<msgLen>\d{4})
// ... or ...
// (O9|EO|HJ|J8)(\w{4})(\d{4})
//
function extractStatusItems(str, separators) {
  const regXSplit = RegExp(`(${ separators.join('|') })(\\w{4})(\\d{4})`);
  const statusValues = String(str).split(regXSplit).slice(1);
  const groupedValues = chunkRight(statusValues, 4);

  return groupedValues.reduce((list, [inEligType, msgCode, msgLen, textMsg]) =>
    list.concat({ inEligType, msgCode, msgLen, textMsg }), []
  );
}

const statusCode = 'O9C2700021Not eligible for SDWCEOC3900015Service upgradeHJC3900015Service upgradeJ8C5000016Delivery Attempt';

console.log(
  `statusCode... ${ statusCode }...`,
  extractStatusItems(statusCode, ['O9', 'EO', 'HJ', 'J8'])
);
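The split-based version relies on the fact that split, when given a regex with capturing groups, splices the captured values into the result array. A minimal sketch with a shortened, made-up input (two records instead of four, and only two of the codes) shows the shape of that array.

```javascript
// Shortened, hypothetical input with two records.
const shortCode = 'O9C2700005HelloEOC3900003Bye';

// Captured groups are interleaved with the text found between matches.
const pieces = shortCode.split(/(O9|EO)(\w{4})(\d{4})/);

console.log(pieces);
```

Because the first match starts at index 0, the array begins with an empty string; after that element the values come in runs of four (code, msgCode, msgLen, text), which is why the code above calls slice(1) before chunking by 4.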
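...followed by a second approach, based almost entirely on a regex with named capture groups, consumed by matchAll and finally mapped into the expected result...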
// see also... [https://regex101.com/r/tatBAB/2]
// with eg
// (?<inEligType>O9|EO|HJ|J8)(?<msgCode>\w{4})(?<msgLen>\d{4})(.*?)(?<textMsg>.*?)(?=O9|EO|HJ|J8|$)
//
function extractStatusItems(str, separators) {
  separators = separators.join('|');
  const regXCaptureValues = RegExp(
    `(?<inEligType>${ separators })(?<msgCode>\\w{4})(?<msgLen>\\d{4})(.*?)(?<textMsg>.*?)(?=${ separators }|$)`,
    'g'
  );
  return [...String(str).matchAll(regXCaptureValues)].map(
    ({ groups }) => ({ ...groups })
  );
}

const statusCode = 'O9C2700021Not eligible for SDWCEOC3900015Service upgradeHJC3900015Service upgradeJ8C5000016Delivery Attempt';

console.log(
  `statusCode... ${ statusCode }...`,
  extractStatusItems(statusCode, ['O9', 'EO', 'HJ', 'J8'])
);
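Because the regex-based variants never look at msgLen, it may be worth cross-checking their output against it. The helper below is a hypothetical addition (findLengthMismatches is not part of the snippets above, and the example items are made up) that returns every item whose text length disagrees with its declared msgLen.

```javascript
// Hypothetical helper: report items whose textMsg length differs from msgLen.
function findLengthMismatches(items) {
  return items.filter(
    ({ msgLen, textMsg }) => Number(msgLen) !== textMsg.length
  );
}

// Example items in the same shape the extractors produce (made-up values).
const checkedItems = [
  { inEligType: 'O9', msgCode: 'C270', msgLen: '0021', textMsg: 'Not eligible for SDWC' },
  { inEligType: 'EO', msgCode: 'C390', msgLen: '0015', textMsg: 'Service upg' },
];

console.log(findLengthMismatches(checkedItems));
```

An empty result suggests that no message was cut short by a separator code appearing inside its text.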