
regex: extract location name and last integer from string

I have the following array of strings. The names are locations, and each location has 4 integers "attached" to it.

Using a regex (in Node.js, with JavaScript), I am trying to extract the name of each location and the last (4th) of its integers.

[ '          UNICENTRO CALI                                               1131908       296780       133622       968750',
  '          PASTO 2                                                      1044057       212780       133004       964281',
  '          CALIMA                                                       1397254       311214       173761     1259801',
  '          PALMIRA2                                                       922857       272954       103978       753881',
  '          PEREIRA CRA 6                                                1188885       157589       165004     1196300',
  '          DE LA CUESTA-BUCARAMANGA                                       219916        49526        27261       197651' ]

For example, for the first location I would need to extract "UNICENTRO CALI" and "968750".

To do this, I've tried:

myArray[i].split("                              ")

This separates the name of the location from the four integers, but this will turn into an inefficient mess.

Any chance somebody can do it elegantly with a regular expression?

This will capture all your columns:

/'\s+(.*\S)?\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)'/

capture group 1 = location
capture group 2 = num 1
capture group 3 = num 2
capture group 4 = num 3
capture group 5 = num 4

var str = "'          UNICENTRO CALI                                          1131908       296780       133622       968750'";
var arr = /'\s+(.*\S)?\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)'/.exec(str);


> console.log(arr)
[Log] Array (6)
0: "'          UNICENTRO CALI                                          1131908       296780       133622       968750'"
1: "UNICENTRO CALI"
2: "1131908"
3: "296780"
4: "133622"
5: "968750"
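Note that the pattern above expects the literal single quotes to be part of the string, as in the `str` example. The elements of the array in the question do not contain them, so a variant without the quotes may be easier to apply directly. Here is a minimal sketch, assuming every row is a name followed by exactly four integers (`ROW` and `extractRow` are illustrative names, and the spacing of the sample rows is shortened):

```javascript
// Sample rows copied from the question (column spacing shortened for readability).
const rows = [
  "          UNICENTRO CALI                1131908       296780       133622       968750",
  "          PASTO 2                       1044057       212780       133004       964281",
  "          DE LA CUESTA-BUCARAMANGA       219916        49526        27261       197651",
];

// Same shape as the pattern above, minus the literal quotes,
// and anchored so the whole line has to match.
const ROW = /^\s*(.*\S)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$/;

// extractRow is a hypothetical helper name, not part of the original answer.
function extractRow(line) {
  const m = ROW.exec(line);
  if (m === null) return null; // line is not "name + 4 integers"
  return { name: m[1], last: m[5] };
}

const results = rows.map(extractRow);
console.log(results);
```

Returning `null` for lines that do not match keeps stray header or blank lines from throwing when the match is indexed.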

Your data changed, use this:

/'(.*\S)\s+([\d,]+)\s+([\d,]+)\s+([\d,]+)\s+([\d,]+)'/

https://regex101.com/r/jJ6xM7/2
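Since the updated numbers contain thousands separators, the captured groups are strings like "968,750" rather than plain digits. If numeric values are needed, the commas can be stripped after matching. This is a rough sketch, assuming the rows are plain strings without surrounding quotes and that the "Total C.O." prefix seen in the updated data is optional (`ROW` and `parseRow` are made-up names for illustration):

```javascript
// Anchored variant of the pattern above; the optional "Total C.O." prefix
// is an assumption based on the updated sample data.
const ROW = /^\s*(?:Total C\.O\.\s+)?(.*\S)\s+([\d,]+)\s+([\d,]+)\s+([\d,]+)\s+([\d,]+)\s*$/;

// parseRow is a hypothetical helper: it returns the name plus the four
// numbers converted to integers, or null when the line does not match.
function parseRow(text) {
  const m = ROW.exec(text);
  if (m === null) return null;
  const toInt = (s) => Number(s.replace(/,/g, "")); // "968,750" -> 968750
  return { name: m[1], values: m.slice(2, 6).map(toInt) };
}

const withPrefix = parseRow(
  "Total C.O.          UNICENTRO CALI              1,131,908       296,780       133,622       968,750"
);
const withoutPrefix = parseRow(
  "          PASTO 2              1044057       212780       133004       964281"
);
console.log(withPrefix, withoutPrefix);
```

The last integer is then `values[3]` in both cases.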

Give this a try: /^'\s+(\w+ +\w*)( +\d+){3} +(\d+)'/

Where $1 (group 1) is your location and $3 (group 3) is the last set of integers on each line.
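One caveat: the name part `\w+ +\w*` allows at most two words made of word characters, so names from the question such as "PEREIRA CRA 6" or "DE LA CUESTA-BUCARAMANGA" would not match. A possible variant, sketched under the assumption that the last four whitespace-separated tokens are always integers, keeps the "skip three numbers, capture the fourth" idea but uses a non-capturing group (`NAME_AND_LAST` is an illustrative name; the rows are unquoted and their spacing is shortened):

```javascript
// Rows copied from the question (column spacing shortened for readability).
const rows = [
  "          PALMIRA2                 922857       272954       103978       753881",
  "          PEREIRA CRA 6           1188885       157589       165004      1196300",
  "          DE LA CUESTA-BUCARAMANGA 219916        49526        27261       197651",
];

// (?: ... ) groups without capturing, so the name is group 1 and the
// last integer is group 2. The $ anchor forces the match to cover the
// whole line, which keeps digits inside a name (e.g. the "6") in group 1.
const NAME_AND_LAST = /^\s*(.*\S)(?:\s+\d+){3}\s+(\d+)\s*$/;

const pairs = rows.map((line) => {
  const m = NAME_AND_LAST.exec(line);
  return m ? [m[1], m[2]] : null;
});
console.log(pairs);
```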

As I mentioned, your data from the original post changed. Use ergonaut's recommended expression: /'(.*\S)\s+([\d,]+)\s+([\d,]+)\s+([\d,]+)\s+([\d,]+)'/

If you aren't specifically looking for a regex that parses your entire data, here's an easy way to do it:

var a = [ 'Total C.O.          UNICENTRO CALI                                               1,131,908       296,780       133,622       968,750',
  'Total C.O.          PLAZA CAICEDO                                                  988,721       272,182       114,641       831,180',
  'Total C.O.          COSMOCENTRO                                                    692,679       159,488        85,309       618,500',
  'Total C.O.          PASTO 2                                                      1,044,057       212,780       133,004       964,281'];

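var b = [];
a.forEach(function (item) {
    // Split on runs of two or more whitespace characters, so single spaces
    // inside a name (e.g. "PASTO 2") stay together.
    var splitItem = item.split(/\s\s+/),
        len = splitItem.length;
    // Index 0 is the "Total C.O." prefix, index 1 is the location name,
    // and the last element is the fourth number.
    b.push({"name": splitItem[1], "value": splitItem[len - 1]});
});
console.log(b);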

I used the data from your Regex101 link to demonstrate this in a jsFiddle.
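The split above relies on the name sitting at index 1 and on at least two whitespace characters between all columns. If that cannot be guaranteed, a token-based variant might be more forgiving. This is only a sketch, assuming rows shaped like the ones in the original question (no "Total C.O." prefix) whose last four tokens are always the numbers; `nameAndLast` is a hypothetical helper name:

```javascript
// Hypothetical helper: split on any whitespace, treat the last four
// tokens as the numbers and everything before them as the name.
function nameAndLast(line) {
  const tokens = line.trim().split(/\s+/);
  if (tokens.length < 5) return null; // need a name plus four numbers
  return {
    name: tokens.slice(0, -4).join(" "), // re-join multi-word names
    value: tokens[tokens.length - 1],    // fourth (last) number, as a string
  };
}

const sample = nameAndLast(
  "          PEREIRA CRA 6              1188885       157589       165004     1196300"
);
console.log(sample);
```

One trade-off: repeated spaces inside a name are collapsed to a single space when the tokens are joined back together.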
