
Regular expression to extract a JavaScript variable in PHP

I have a large HTML file with a lot of content. I want to extract a JavaScript variable, named 'a' for example, from the whole file.

Example (most of the actual content removed):

<html>
    <head>
        <script>
            var a = [{'a': 1, 'b': 2}];
        </script>
    </head>
    <body>
        ....
    </body>
</html>

What should come out of the above is:

[{'a': 1, 'b': 2}]

if (preg_match('#var a = (.*?);\s*$#m', $html, $matches)) {
    echo $matches[1];
}
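
For trying the snippet above in isolation, here is a minimal self-contained sketch. The sample $html string and the helper name extractVarA are assumptions made for illustration. They are not part of the original answer.

```php
<?php
// Sample input standing in for the large HTML file (assumed for illustration).
$html = <<<'EOT'
<html>
    <head>
        <script>
            var a = [{'a': 1, 'b': 2}];
        </script>
    </head>
    <body>
        ....
    </body>
</html>
EOT;

// Hypothetical helper: returns the captured text, or null when nothing matches.
function extractVarA(string $html): ?string
{
    if (preg_match('#var a = (.*?);\s*$#m', $html, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

$value = extractVarA($html);
echo $value ?? 'not found', PHP_EOL; // [{'a': 1, 'b': 2}]
```

Checking the return value of preg_match avoids reading $matches[1] when the variable is not present in the file.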

Explanation: 说明:

  • The regex tries to match any line containing var a = .
  • It then matches everything up to a ; followed by any amount of whitespace (\s*) and then the end of the line ($).
  • The m modifier makes $ match at the end of each line. Without it, $ would only match at the end of the string, which would be of little use here.
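
The effect of the m modifier can be seen by running the same pattern with and without it. This is a small sketch on an assumed three-line input, not on the real file:

```php
<?php
// Assumed sample input: the definition sits on a line in the middle of the string.
$html = "<script>\n    var a = [{'a': 1, 'b': 2}];\n</script>\n";

// With m, $ can match at the end of every line.
$withM = preg_match('#var a = (.*?);\s*$#m', $html, $m1);

// Without m, $ only matches at the very end of the string (or just before a
// final newline), and . does not cross newlines, so the pattern fails here.
$withoutM = preg_match('#var a = (.*?);\s*$#', $html, $m2);

var_dump($withM);    // int(1)
var_dump($withoutM); // int(0)
```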

The \s* is only there in case there is trailing whitespace after the definition (e.g. human error), no other reason. If you're sure that won't happen, you can remove \s*.
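
To illustrate what \s* buys you, the sketch below compares the pattern with and without it on an assumed line that has three trailing spaces after the semicolon:

```php
<?php
// Assumed sample line with three trailing spaces after the semicolon.
$line = "var a = [1, 2];   \n";

// Without \s*, the ; must be the last character on the line.
$strict = preg_match('#var a = (.*?);$#m', $line, $m1);

// With \s*, trailing whitespace after the ; is tolerated.
$lenient = preg_match('#var a = (.*?);\s*$#m', $line, $m2);

var_dump($strict);    // int(0)
var_dump($lenient);   // int(1)
echo $m2[1], PHP_EOL; // [1, 2]
```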

Note that this doesn't replace a full-blown parser. You will need to make modifications if a is defined over more than one line, if a is defined more than once (think about scope: you can have var a in the global scope, then var a within a function), etc.
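
One possible modification for those two cases, offered as a sketch and not as a robust solution: add the s modifier so that . also matches newlines, and use preg_match_all to collect every definition. The sample input is assumed. The pattern still stops at the first ; that ends a line, so a value containing such a semicolon (inside a string, for instance) would be cut short.

```php
<?php
// Assumed sample input: a multi-line global definition plus a second
// definition inside a function.
$js = <<<'EOT'
var a = [
    {'a': 1},
    {'b': 2}
];
function f() {
    var a = 'inner';
}
EOT;

// m: $ matches at each line end; s: . also matches newlines.
$count = preg_match_all('#var a = (.*?);\s*$#ms', $js, $matches);

echo $count, PHP_EOL; // 2
print_r($matches[1]); // the multi-line array literal, then 'inner'
```

Deciding which of the collected definitions is the one you want (global or function scope) is still up to you. The regex has no notion of scope.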

