
Make RegEx groups to split line into columns

I've been having trouble building a RegEx for a comma-separated line. In the following structure, the first 19 columns should be split on commas only. The last 3 columns are wrapped in { }, and inside these braces there can be more braces (each one is a "script block"). So for each of the last 3 columns I want to capture everything inside the outer {...} as a single column.

This is the structure

ID,AegisName,Name,Type,Buy,Sell,Weight,ATK[:MATK],DEF,Range,Slots,Job,Class,Gender,Loc,wLV,eLV[:maxLevel],Refineable,View,{ Script },{ OnEquip_Script },{ OnUnequip_Script }

With this for example

1624,Lich_Bone_Wand,Lich's Bone Wand,5,20,,800,60:170,,1,2,0x00018314,18,2,2,3,70,1,10,{ bonus bInt,1; bonus bDex,1; bonus bAtkEle,Ele_Undead; .@r = getrefine(); bonus3 bAutoSpellWhenHit,"NPC_WIDECURSE",5,10+.@r; if(.@r>=9){ bonus bMatkRate,3; bonus bMaxSP,300; } },{},{}

I found this: ([^\,]*),"x(19)."(\{.*\}),"x(2)."(\{.*\}) but it is written in Perl and I couldn't translate it to JavaScript. I can see that combining (\{.*\}) three times, as in (\{.*\}),(\{.*\}),(\{.*\}), captures the last 3 columns, and that ([^\,]*), splits the first columns correctly but also interferes with the last ones. So I tried limiting it to the first 19 occurrences, but ([^\,]*),{19} doesn't work.

How would I accomplish this?

There is more than one way to accomplish this using a combination of replace & split:

  1. temporarily replace commas within {...} , split on commas, restore the commas within each array item
  2. split on commas, then combine array items from first occurrence of { to last occurrence of } , keeping track of nesting
  3. do a split with negative lookahead to avoid a split on commas within {...}
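As an aside to these options, the Perl expression from the question can be carried over to JavaScript fairly directly. In Perl, "x(19)" is string repetition, so the pattern is assembled from strings before it is compiled; ([^\,]*),{19} fails because the {19} quantifier applies only to the comma in front of it. The following is a minimal sketch, assuming an environment that supports String.prototype.repeat (ES2015). The names columnPattern and matchColumns are hypothetical, and the ^ and $ anchors are an addition that is not in the Perl original.

```javascript
// Sketch only: a direct port of the Perl pattern from the question.
// String.prototype.repeat stands in for Perl's "x" repetition operator.
var columnPattern = new RegExp(
    '^' +                          // anchor added here, not in the Perl original
    '([^,]*),'.repeat(19) +        // 19 plain columns, each followed by a comma
    '(\\{.*\\}),'.repeat(2) +      // Script and OnEquip_Script blocks
    '(\\{.*\\})' +                 // OnUnequip_Script block
    '$'                            // anchor added here, not in the Perl original
);

// Hypothetical helper: returns the 22 captured columns, or null if the line
// does not have the expected shape.
function matchColumns(line) {
    var m = columnPattern.exec(line);
    return m ? m.slice(1) : null;
}
```

Because .* is greedy, this relies on backtracking to find the two commas that separate the three blocks, so it suits lines that follow the structure exactly rather than arbitrary input.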

Here is an example for the first option, where we temporarily replace commas within {...} :

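function properSplit(line) {
    return line
        // Hide the commas inside each {...} block behind a placeholder character.
        .replace(/(\{[^,]*,.*?\})(?=,)/g, function(m, p1) {
            return p1.replace(/,/g, '\x01');
        })
        // Only the column-separating commas remain, so a plain split works.
        .split(/,/)
        // Put the original commas back in every column.
        .map(function(item) {
            return item.replace(/\x01/g, ',');
        });
}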

var str = "1624,Lich_Bone_Wand,Lich's Bone Wand,5,20,,800,60:170,,1,2,0x00018314,18,2,2,3,70,1,10,{ bonus bInt,1; bonus bDex,1; bonus bAtkEle,Ele_Undead; .@r = getrefine(); bonus3 bAutoSpellWhenHit,\"NPC_WIDECURSE\",5,10+.@r; if(.@r>=9){ bonus bMatkRate,3; bonus bMaxSP,300; } },{},{}";
console.log(JSON.stringify(properSplit(str), null, ' '));

Output:

[
 "1624",
 "Lich_Bone_Wand",
 "Lich's Bone Wand",
 "5",
 "20",
 "",
 "800",
 "60:170",
 "",
 "1",
 "2",
 "0x00018314",
 "18",
 "2",
 "2",
 "3",
 "70",
 "1",
 "10",
 "{ bonus bInt,1; bonus bDex,1; bonus bAtkEle,Ele_Undead; .@r = getrefine(); bonus3 bAutoSpellWhenHit,\"NPC_WIDECURSE\",5,10+.@r; if(.@r>=9){ bonus bMatkRate,3; bonus bMaxSP,300; } }",
 "{}",
 "{}"
]

Explanation:

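  • The first replace() swaps the commas within {...} for a non-printable character, '\x01'. The pattern starts at a { that has a comma after it and scans non-greedily to the first } that is directly followed by a comma. That trailing comma is only a positive lookahead, so it is not part of the match.
  • The split() now sees only the commas that separate columns.
  • The map() restores the non-printable characters to commas.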
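The regex in properSplit() depends on each script block ending at the first } that is directly followed by a comma. A variant of the second option avoids that dependency by counting brace depth in a single pass and cutting only on commas at depth 0. This is a minimal sketch, not part of the original answer; splitTopLevel is a hypothetical name, and it assumes braces never appear inside quoted strings in a script block.

```javascript
// Hypothetical helper: split a line on commas that are outside any {...} block.
// Assumption: '{' and '}' are always balanced block delimiters, never literal
// characters inside a quoted string.
function splitTopLevel(line) {
    var fields = [];
    var depth = 0;   // current brace nesting level
    var start = 0;   // index where the current field begins
    for (var i = 0; i < line.length; i++) {
        var ch = line[i];
        if (ch === '{') {
            depth++;
        } else if (ch === '}') {
            if (depth > 0) { depth--; }   // ignore a stray closing brace
        } else if (ch === ',' && depth === 0) {
            fields.push(line.slice(start, i));
            start = i + 1;
        }
    }
    fields.push(line.slice(start));       // last field has no trailing comma
    return fields;
}
```

Called on the example line from the question, this should give the same 22 columns as properSplit(), and it also keeps a final {...} column intact when that column contains commas.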
