
Split string data into array based on new line and then double digit number

I'm looking to split the data in a string into an array.

Here's the general idea of the text format...

xxxxx denotes any mix of alpha-numeric-whitespace data.

xxxxx
 1 xxxxxxxxxx
 2 xxxxxxxxxx
xxxxxxxxx
xxxxxxxxx
xxxxxxxx
 3 xxxxxxxxxx
 4 xxxxxxxxxx
xxxxxxxxxx
 5 xxxxxxxxxx

(When the numbers reach double digits, the tens digit takes the blank position in front of the number.)

What I want is an array of 5 elements (in this case), each holding a number and all the data that follows it, including the newlines. In the past this was not a big deal and I could use string.split("\n"), but now I need to delimit on some sort of regex like /\n [0-9]{1,2}/, so I'm looking for a quick and easy way to do this (as far as I know, split() doesn't support regex).

I want the array to be like

array[1] = " 1 xxxxxxxxxx"
array[2] = " 2 xxxxxxxxxxx\nxxxxxxxxxx\nxxxxxxxxxx"
array[3] = " 3 xxxxxxxxxx"
...etc

split() does support regexes. Try this:

text.split(/\n(?=[1-9 ][0-9] )/)
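As a rough illustration of what that one-liner does, here is a sketch on made-up sample text (the variable names and the data are assumptions for the example, not taken from the question). The lookahead leaves the number at the start of each piece, while the newline itself is consumed as the separator.

```javascript
// Hypothetical sample in the format described in the question.
var sample =
  "header\n" +
  " 1 first\n" +
  " 2 second\n" +
  "continued\n" +
  "10 tenth\n";

// Split at each newline that is followed by a two-character
// number field and a space.
var parts = sample.split(/\n(?=[1-9 ][0-9] )/);

// parts[0] is whatever precedes the first numbered entry.
console.log(parts);
```

With this sample the text before the first number ends up at index 0, so the entry numbered 1 sits at index 1. Drop the leading element with parts.slice(1) if it is not wanted.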

You can use a lookahead and split on (?= [1-9] |[1-9][0-9] ), perhaps anchored at the beginning of a line. There may still be ambiguities in the xxxx part: a data line that happens to begin with something that looks like a number field will also start a new element. This also doesn't ensure that the numbering is sequential.

Example

var text =
  "preface\n" +
  " 1 intro\n" +
  " 2 body\n" +
  "more body\n" +
  " 3 stuff\n" +
  "more stuff\n" +
  "even 4 stuff\n" +
  "10 conclusion\n" +
  "13 appendix\n";

print(text.split(/^(?= [1-9] |[1-9][0-9] )/m));

The output is (as seen on ideone.com; the array is printed as a comma-joined string, so each comma marks the start of a new element):

preface
, 1 intro
, 2 body
more body
, 3 stuff
more stuff
even 4 stuff
,10 conclusion
,13 appendix
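Since the split alone does not check that the numbers are sequential, one possible follow-up is to walk the lines and start a new entry only when a line begins with the next expected number. The sketch below is only an illustration under assumptions: the function name splitSequential is hypothetical, and the rule "a two-character, right-aligned number field followed by a space" is inferred from the format described in the question.

```javascript
// Hypothetical helper: start a new entry only when a line begins with the
// next expected number (two-character field, then a space).
function splitSequential(text) {
  var lines = text.split("\n");
  var entries = [];   // entries[0] collects anything before entry 1
  var current = [];
  var expected = 1;

  for (var i = 0; i < lines.length; i++) {
    var m = /^( [1-9]|[1-9][0-9]) /.exec(lines[i]);
    if (m && parseInt(m[1], 10) === expected) {
      // Close the previous entry and begin a new one.
      entries.push(current.join("\n"));
      current = [];
      expected++;
    }
    current.push(lines[i]);
  }
  entries.push(current.join("\n"));
  return entries;
}

var result = splitSequential(
  "preface\n 1 intro\n 2 body\n 7 not an entry\n 3 stuff\n"
);
console.log(result);
```

Here the line starting with " 7" stays inside entry 2 because 3 was the next expected number. Whether out-of-sequence lines should be folded in like this or reported as errors depends on the real data.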

As @polygenelubricants said, you could use replace with a regex to insert an interim delimiter, then split on that delimiter; the split itself consumes the delimiter.

Here is a working example from the string you gave above and another I made to test the function. It works with both. Since you didn't provide any real data for an example, I can't test that, but hopefully this will at least get you going on the right track.

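function SplitCrazyString(str) {
    // Match a newline followed by an optional space, one or two digits and a space.
    // (A character class such as [^(\n\s?\d+)] negates individual characters,
    // not a sub-pattern, so it is not used here.)
    var regex = /\n( ?\d{1,2} )/g;

    // Swap that newline for the interim delimiter and keep the number field.
    // Assumes "~" never occurs in the data; choose another marker if it can.
    var tempStr = str.replace(regex, "~$1");

    // split() consumes the delimiter, so no clean-up pass is needed.
    // (Strings are immutable: calling ary[i].replace(...) without assigning
    // the result would have no effect anyway.)
    return tempStr.split('~');
}
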
var x = "xxxxx\n" +
    " 1 xxxxxxxxxx\n" +
    " 2 xxxxxxxxxx\n" +
    "xxxxxxxxx\n" +
    "xxxxxxxxx\n" +
    "xxxxxxxx\n" +
    " 3 xxxxxxxxxx\n" +
    " 4 xxxxxxxxxx\n" +
    "xxxxxxxxxx\n" +
    " 5 xxxxxxxxxx\n";
var testStr = "6daf sdf84 as96\n" +
    " 1 sfs 4a8dfa sf4asf\n" +
    " 2 s85 d418 df4 89 f8f\n" +
    "65a1 sdfa48 asdf61\n" +
    "w1c 987a w1ec\n" +
    "a6s85 d1a6f 81sf\n" +
    " 3 woi567 34ewn23 5cwe6\n" +
    " 4 s6k 8hf6 9gd\n" +
    "axxm4x1 dsf615g9 8asdf1jt gsdf8as\n" +
    " 5 n389h c8j923hdha 8h3x982qh\n";

var xAry = SplitCrazyString(x);
var testAry = SplitCrazyString(testStr);
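If a "~" could appear in the real data, the interim delimiter needs to be a character that cannot. The following is a minimal sketch of that variation, assuming a NUL character ("\u0000") never occurs in the text; the name splitWithMarker and the sample string are hypothetical.

```javascript
// Hypothetical variant of the interim-delimiter idea using "\u0000" as the
// marker, on the assumption that NUL never appears in the text.
function splitWithMarker(str) {
  var MARKER = "\u0000";
  // Replace each newline that precedes a number field with the marker...
  var marked = str.replace(/\n(?=( [1-9]|[1-9][0-9]) )/g, MARKER);
  // ...then split on it; the marker is consumed by the split.
  return marked.split(MARKER);
}

var pieces = splitWithMarker("xxxxx\n 1 aaa\n 2 bbb\nccc\n 3 ddd\n");
for (var i = 0; i < pieces.length; i++) {
  // JSON.stringify makes embedded newlines visible as \n.
  console.log(i + ": " + JSON.stringify(pieces[i]));
}
```

Printing each element through JSON.stringify is a convenient way to see exactly where the newlines ended up.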
