Split String into Array with RegEx Pattern

I have a string that I want to split into an array. The string looks like this:

'O:BED,N:KET,OT,N:JAB,FA,O:RPT,'

The string can contain any number of objects, e.g.:

'O:BED,N:KET,OT,N:JAB,FA,O:RPT,X:BLA,GTO'

I want to split this string at each occurrence of \w: (e.g. O:).

So I'll end up with an array like this:

['O:BED','N:KET,OT','N:JAB,FA','O:RPT']

I am using the following code:

var array = st.split(/^(\w:.+)(?=\w:)/g);

However, I end up with an array like this:

['','O:BED,N:KET,OT,N:JAB,FA,','O:RPT,']

It seems the regex is being greedy. What should I do to fix it?

Note: I am using AngularJS, and eventually I want to end up with this:

   var objs = [
     {type: 'O', code: 'BED', suf: ''},
     {type: 'N', code: 'KET', suf: 'OT'},
     {type: 'N', code: 'JAB', suf: 'FA'},
     {type: 'O', code: 'RPT', suf: ''}
     ]

It would be much easier if your string were formatted consistently, but the task can still be done with a little extra effort. The code below should work for you.

var str = 'O:BED,N:KET,OT,N:JAB,FA,O:RPT,X:BLA,GTO';

// filter(Boolean) drops the empty piece that a trailing comma would leave behind
var a = str.split(',').filter(Boolean);
var objs = [], obj, item, suf;

for(var i=0; i<a.length;){
  item = a[i].split(':');

  if(a[i+1] && a[i+1].indexOf(':') == -1){
    suf = a[i+1];
    i++;
  }else{
    suf = "";
  }

  obj = {
    type: item[0],
    code: item[1],
    suf: suf
  };

  objs.push(obj);
  i++;
}

console.log(objs);

You can use the RegExp.prototype.exec method to obtain successive matches instead of splitting the string with a delimiter:

var myStr = 'O:BED,N:KET,OT,N:JAB,FA,O:RPT,';
var myRe = /([^,:]+):([^,:]+)(?:,([^,:]+))??(?=,[^,:]+:|,?$)/g;
var m;
var result = [];

while ((m = myRe.exec(myStr)) !== null) {
  result.push({type:m[1], code:m[2], suf:((m[3])?m[3]:'')});
}

console.log(result);
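
On engines that support String.prototype.matchAll (ES2020), the same pattern can be iterated without the manual exec() loop. This is only a minimal sketch under that assumption; the helper name parseObjects is made up for illustration and is not part of the answer above.

```javascript
// Hypothetical helper; reuses the pattern from the exec() loop above.
function parseObjects(str) {
  const re = /([^,:]+):([^,:]+)(?:,([^,:]+))??(?=,[^,:]+:|,?$)/g;
  // matchAll requires the g flag and yields one match array per object.
  return Array.from(str.matchAll(re), function (m) {
    // m[3] is undefined when the object has no suffix.
    return { type: m[1], code: m[2], suf: m[3] || '' };
  });
}

console.log(parseObjects('O:BED,N:KET,OT,N:JAB,FA,O:RPT,'));
```

Building a fresh regex inside the function avoids carrying lastIndex state from one call to the next.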

You want to match the objects in the string and then iterate over the matches.

Full example inside AngularJS: http://jsfiddle.net/184cyspg/1/

var myString = 'O:BED,N:KET,OT,N:JAB,FA,O:RPT,';
$scope.myArray = [];
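// Match "type:code" plus an optional ",suffix" that is not the start of the next object.
var objs = myString.match(/[A-Z]:[A-Z]+(?:,(?![A-Z]:)[A-Z]+)?/g) || [];
objs.forEach(function (entry) {
    // "N:KET,OT" -> ["N", "KET", "OT"]; "O:BED" -> ["O", "BED"]
    var obj = entry.split(/[:,]/);
    $scope.myArray.push({type: obj[0], code: obj[1], suf: obj[2] || ''});
});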

I love regular expressions :)

This matches each object in your string; add the global flag and call exec() repeatedly to walk through all the matches:

(\w):(\w+)(?:,((?!\w:)\w+))?

The only real trick is to treat the bit after the comma as this object's suffix only if it doesn't look like the type of the next object.

Each match captures the groups:

  1. type
  2. code
  3. suf
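
As a rough sketch of how those three groups could be turned into the objects from the question, the loop below walks the matches with exec(). The helper name toObjects is hypothetical, and an empty string for a missing suffix is an assumption taken from the desired output in the question.

```javascript
// Pattern from the answer above, with the g flag added so exec() advances.
const objectRe = /(\w):(\w+)(?:,((?!\w:)\w+))?/g;

// Hypothetical helper name; collects one object per match.
function toObjects(str) {
  const result = [];
  let m;
  objectRe.lastIndex = 0; // restart, since a g-flagged regex keeps state
  while ((m = objectRe.exec(str)) !== null) {
    // m[3] is undefined when no suffix follows the code.
    result.push({ type: m[1], code: m[2], suf: m[3] || '' });
  }
  return result;
}

console.log(toObjects('O:BED,N:KET,OT,N:JAB,FA,O:RPT,X:BLA,GTO'));
```

Note that \w also accepts digits and underscores, so a stricter character class may be preferable if the types and codes are always uppercase letters.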

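If you just want to split as you said, then the fix for the greedy match is to split on the commas that are followed by one of those objects. Use non-capturing groups here, because split() inserts any captured groups into the resulting array, and strip a trailing comma first or it stays on the last element, e.g.:

,(?=\w:\w+(?:,(?!\w:)\w+)?)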
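
A minimal sketch of the split approach, assuming it is enough to look ahead only for the next "type:" prefix. The helper name splitObjects is hypothetical, and the shortened lookahead is a simplification of the pattern above rather than the exact expression.

```javascript
// Hypothetical helper: split on commas that are immediately followed by "<char>:".
function splitObjects(str) {
  return str
    .replace(/,$/, '')   // drop a trailing comma, if any
    .split(/,(?=\w:)/);  // lookahead only; capture groups would leak into the result
}

console.log(splitObjects('O:BED,N:KET,OT,N:JAB,FA,O:RPT,'));
```

Because the lookahead consumes nothing, each piece keeps its own type prefix and only the separating comma is removed.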

The following does not solve your regex issue, but it is an alternative approach that introduces underscorejs, which can handle anything from simple to more complex operations, although it is overkill in this case:

// e.g. input string = 'O:BED,N:KET,OT,N:JAB,FA,O:RPT,';
.controller('AppCtrl', ['$scope', function($scope) {
    /**
     * Split by comma, then walk the pieces: a piece that starts
     * with a "<letter>:" prefix (e.g. 'O:') begins a new array
     * element, any other piece is appended to the current element.
     *
     * @param String string - a string to split
     * @return Array array of elements, one per prefixed object
     */
    $scope.splitter = function(string) {
      var a = [];

      // _.compact drops the empty piece left by a trailing comma
      _.each(_.compact(string.split(',')), function(element) {
        if (/^\w:/.test(element)) {
          // a new object starts here, e.g. 'N:KET'
          a.push(element);
        } else if (a.length) {
          // a suffix such as 'OT' belongs to the previous object
          a[a.length - 1] += ',' + element;
        }
      });

      return a;
    };
}]);

Output:

array: Array[4]
  0: "O:BED"
  1: "N:KET,OT"
  2: "N:JAB,FA"
  3: "O:RPT"
length: 4

Updated: I just read your requirement about objects. underscorejs allows chaining operations. For example, the code above could be tweaked to handle objects by chaining .compact().object().value() to produce the output as key:value pairs.

Hope this helps.

