
Javascript Regular Expression: Only matching the last pattern

Context: I have some dynamically generated HTML which can have embedded javascript function calls inside. I'm trying to extract the function calls with a regular expression.

Sample HTML string:

 <dynamic html>

   <script language="javascript">
       funcA();
   </script>

 <a little more dynamic html>

   <script language="javascript">
       funcB();
   </script>

My goal is to extract the text "funcA();" and "funcB();" from the above snippet (either as a single string or an array with two elements would be fine). The regular expression I have so far is:
var regexp = /[\s\S]*<script .*>([\s\S]*)<\/script>[\s\S]*/gm;

Using html_str.replace(regexp, "$1") only returns "funcB();".

This regexp works just fine when there is only ONE set of <script> tags in the HTML, but when there are multiple sets, replace() returns only the LAST one. Removing the 'g' modifier makes no difference: it still matches only the last function call. I'm still a novice with regular expressions, so I know I'm missing something fundamental here. Any help pointing me in the right direction would be greatly appreciated. I've done a bit of research already but haven't been able to resolve this.

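Your wildcard matches are all greedy. This means they match not only what you expect, but as much as they possibly can. The leading [\s\S]* first swallows the whole string and then backtracks only as far as it must to find a <script ...> tag, and coming from the end, the first one it finds is the last one in the HTML. Everything before it, including the first script block, is consumed outside the capture group.

Make them all non-greedy (.*? and [\s\S]*?) and it should work.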
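
To illustrate, here is a minimal sketch of both options, assuming the sample HTML from the question is held in a string. The names htmlStr and extractScriptBodies are made up for this example, and the second pattern is a simplified variant rather than the asker's original. The first part applies the non-greedy change directly to the question's pattern; the second collects each script body into an array with an exec() loop.

```javascript
// Sample input modelled on the HTML in the question (htmlStr is a hypothetical name).
const htmlStr = [
  '<dynamic html>',
  '  <script language="javascript">',
  '      funcA();',
  '  </script>',
  '<a little more dynamic html>',
  '  <script language="javascript">',
  '      funcB();',
  '  </script>',
].join('\n');

// Option 1: the question's pattern with every * changed to *? (non-greedy).
// Each match now stops at the nearest </script>, so replace() keeps both
// script bodies, together with their surrounding whitespace.
const lazy = /[\s\S]*?<script .*?>([\s\S]*?)<\/script>[\s\S]*?/g;
const replaced = htmlStr.replace(lazy, '$1');

// Option 2 (a simplified variant, not the asker's pattern): match only the
// script blocks and collect the captured bodies into an array.
// extractScriptBodies is a hypothetical helper name.
function extractScriptBodies(html) {
  // [^>]* keeps the opening tag from running past its own '>'.
  const scriptBody = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
  const bodies = [];
  let match;
  // With the g flag, exec() resumes after the previous match on each call.
  while ((match = scriptBody.exec(html)) !== null) {
    bodies.push(match[1].trim());
  }
  return bodies;
}

const calls = extractScriptBodies(htmlStr);
console.log(calls); // [ 'funcA();', 'funcB();' ]
```

Note that with Option 1 any text after the final </script> is left in the result, since the trailing lazy wildcard matches nothing. If you need just the function calls, the array approach is probably the easier one to work with.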


 