简体   繁体   中英

How to extract a string using JavaScript Regex?

I'm trying to extract a substring from a file with JavaScript Regex. Here is a slice from the file :

DATE:20091201T220000
SUMMARY:Dad's birthday

the field I want to extract is "Summary". Here is the approach:

extractSummary : function(iCalContent) {
  /*
  input : iCal file content
  return : Event summary
  */
  var arr = iCalContent.match(/^SUMMARY\:(.)*$/g);
  return(arr);
}
function extractSummary(iCalContent) {
  var rx = /\nSUMMARY:(.*)\n/g;
  var arr = rx.exec(iCalContent);
  return arr[1]; 
}

You need these changes:

  • Put the * inside the parenthesis as suggested above. Otherwise your matching group will contain only one character.

  • Get rid of the ^ and $ . With the global option they match on start and end of the full string, rather than on start and end of lines. Match on explicit newlines instead.

  • I suppose you want the matching group (what's inside the parenthesis) rather than the full array? arr[0] is the full match ( "\\nSUMMARY:..." ) and the next indexes contain the group matches.

  • String.match(regexp) is supposed to return an array with the matches. In my browser it doesn't (Safari on Mac returns only the full match, not the groups), but Regexp.exec(string) works.

You need to use the m flag :

multiline; treat beginning and end characters (^ and $) as working over multiple lines (ie, match the beginning or end of each line (delimited by \\n or \\r), not only the very beginning or end of the whole input string)

Also put the * in the right place:

"DATE:20091201T220000\r\nSUMMARY:Dad's birthday".match(/^SUMMARY\:(.*)$/gm);
//------------------------------------------------------------------^    ^
//-----------------------------------------------------------------------|

Your regular expression most likely wants to be

/\nSUMMARY:(.*)$/g

A helpful little trick I like to use is to default assign on match with an array.

var arr = iCalContent.match(/\nSUMMARY:(.*)$/g) || [""]; //could also use null for empty value
return arr[0];

This way you don't get annoying type errors when you go to use arr

(.*) instead of (.)* would be a start. The latter will only capture the last character on the line.

Also, no need to escape the : .

This code works:

 let str = "governance[string_i_want]"; let res = str.match(/[^governance\\[](.*)[^\\]]/g); console.log(res);

res will equal "string_i_want". However, in this example res is still an array, so do not treat res like a string.

By grouping the characters I do not want, using [^string], and matching on what is between the brackets, the code extracts the string I want!

You can try it out here: https://www.w3schools.com/jsref/tryit.asp?filename=tryjsref_match_regexp

Good luck.

You should use this :

var arr = iCalContent.match(/^SUMMARY\:(.)*$/g);
return(arr[0]);

this is how you can parse iCal files with javascript

    function calParse(str) {

        function parse() {
            var obj = {};
            while(str.length) {
                var p = str.shift().split(":");
                var k = p.shift(), p = p.join();
                switch(k) {
                    case "BEGIN":
                        obj[p] = parse();
                        break;
                    case "END":
                        return obj;
                    default:
                        obj[k] = p;
                }
            }
            return obj;
        }
        str = str.replace(/\n /g, " ").split("\n");
        return parse().VCALENDAR;
    }

    example = 
    'BEGIN:VCALENDAR\n'+
    'VERSION:2.0\n'+
    'PRODID:-//hacksw/handcal//NONSGML v1.0//EN\n'+
    'BEGIN:VEVENT\n'+
    'DTSTART:19970714T170000Z\n'+
    'DTEND:19970715T035959Z\n'+
    'SUMMARY:Bastille Day Party\n'+
    'END:VEVENT\n'+
    'END:VCALENDAR\n'


    cal = calParse(example);
    alert(cal.VEVENT.SUMMARY);

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM