Parse PHP file variables from Python script

I need to get some data from PHP (WordPress) config files from my Python script. How can I parse the config data? For example, how can I get the value of $wp_version? Config example:

/**
 * The WordPress version string
 *
 * @global string $wp_version
 */
$wp_version = '3.5.1';

/**
 * Holds the WordPress DB revision, increments when changes are made to the WordPress DB schema.
 *
 * @global int $wp_db_version
 */
$wp_db_version = 22441;

/**
 * Holds the TinyMCE version
 *
 * @global string $tinymce_version
 */
$tinymce_version = '358-23224';

/**
 * Holds the required PHP version
 *
 * @global string $required_php_version
 */
$required_php_version = '5.2.4';

/**
 * Holds the required MySQL version
 *
 * @global string $required_mysql_version
 */
$required_mysql_version = '5.0';

$wp_local_package = 'en_EN';

A simple variable assignment in PHP looks like $foo = 'bar'; so let's build a regex that ignores things like $_GET or $foo['bar']:

  1. Start with $ , noting that we need to escape it:
    \$
  2. The first character after $ can't be a digit. PHP allows a letter or an underscore there, but we accept only letters so that names like $_GET are skipped:
    \$[a-z]
  3. It may be followed by any number of letters, digits or underscores:
    \$[a-z]\w*
  4. Add parentheses to capture the variable name:
    \$([a-z]\w*)
  5. Next comes the equals sign; to be more tolerant, make the whitespace around it optional:
    \$([a-z]\w*)\s*=\s*
  6. After that comes the value, which ends with a ; at the end of the line:
    \$([a-z]\w*)\s*=\s*(.*?);$
  7. Use the m modifier, which makes ^ and $ match the start and end of each line respectively.
  8. You can then use a trimming function to get rid of the single and double quotes around the value.
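
The steps above can be sketched in Python roughly as follows. This is an illustration rather than part of the original answer: the `parse_php_vars` helper and the inline sample are assumptions, and `re.M` / `re.I` stand in for the m and i modifiers.

```python
import re

# Pattern built in steps 1-6. re.M makes $ match at the end of each line,
# re.I makes [a-z] case insensitive.
VAR_PATTERN = re.compile(r"\$([a-z]\w*)\s*=\s*(.*?);$", re.M | re.I)

def parse_php_vars(source):
    """Return {name: value} for simple `$name = value;` lines (hypothetical helper)."""
    result = {}
    for name, value in VAR_PATTERN.findall(source):
        # Step 8: trim surrounding single or double quotes.
        result[name] = value.strip("'\"")
    return result

sample = """
$wp_version = '3.5.1';
$wp_db_version = 22441;
$wp_local_package = 'en_EN';
"""

php_vars = parse_php_vars(sample)
print(php_vars["wp_version"])
```

Every value comes back as a string, so a numeric setting such as $wp_db_version would need an explicit `int(...)` conversion.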

Note 1: This regex fails when several assignments share one line, e.g. $fail = 'en_EN'; $fail2 = 'en_EN'; because the value group then runs up to the last ; on the line.
Note 2: Don't forget to use the i modifier to make it case insensitive.
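
The limitation in Note 1 can be reproduced with a short check. This is only a sketch using Python's `re` module; the variable names `pattern`, `line` and `matches` are chosen here for illustration.

```python
import re

# Same pattern as above, with the m and i modifiers.
pattern = re.compile(r"\$([a-z]\w*)\s*=\s*(.*?);$", re.M | re.I)

# Two assignments on a single line.
line = "$fail = 'en_EN'; $fail2 = 'en_EN';"
matches = pattern.findall(line)

# Only one match is found: the lazy value group has to extend to the
# final ';' because that is the only one followed by the end of the line.
print(matches)
```

The single match captures `fail` as the name and swallows the second assignment into the value, so $fail2 is never reported on its own.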

I've written a little Python script to pull database login information from WordPress's wp-config.php file for doing automatic site backups.

Here is the relevant part of my code (GitHub's syntax highlighting has trouble with Python's triple-quoted strings):

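#!/usr/bin/env python3
import re

# define('NAME', 'value'); with matching single or double quotes
define_pattern = re.compile(r"""\bdefine\(\s*('|")(.*)\1\s*,\s*('|")(.*)\3\s*\)\s*;""")
# $name = 'value'; at the start of a line or right after a previous statement
assign_pattern = re.compile(r"""(^|;)\s*\$([a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*)\s*=\s*('|")(.*)\3\s*;""")

php_vars = {}
# Read the config line by line; the with block closes the file afterwards.
with open("wp-config.php") as config_file:
    for line in config_file:
        for match in define_pattern.finditer(line):
            php_vars[match.group(2)] = match.group(4)
        for match in assign_pattern.finditer(line):
            php_vars[match.group(2)] = match.group(4)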
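
To try the two patterns without a real wp-config.php, something like the following sketch can be used. The `parse_config_lines` helper and the sample lines (including the `example_db` and `example_user` values) are made up for illustration; the patterns are repeated so the block runs on its own, here allowing optional whitespace before the closing parenthesis of define().

```python
import re

define_pattern = re.compile(r"""\bdefine\(\s*('|")(.*)\1\s*,\s*('|")(.*)\3\s*\)\s*;""")
assign_pattern = re.compile(r"""(^|;)\s*\$([a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*)\s*=\s*('|")(.*)\3\s*;""")

def parse_config_lines(lines):
    """Collect quoted define() constants and quoted $variables (hypothetical helper)."""
    php_vars = {}
    for line in lines:
        for pattern in (define_pattern, assign_pattern):
            for match in pattern.finditer(line):
                # Group 2 is the name, group 4 the value without its quotes.
                php_vars[match.group(2)] = match.group(4)
    return php_vars

# Hypothetical config lines, not taken from a real installation.
sample = [
    "define('DB_NAME', 'example_db');",
    'define( "DB_USER", "example_user" );',
    "$table_prefix = 'wp_';",
    "$wp_db_version = 22441;",  # unquoted value: matched by neither pattern
]

config = parse_config_lines(sample)
print(config["DB_NAME"], config["DB_USER"], config["table_prefix"])
```

Both patterns require a quoted value, so unquoted settings such as integers or booleans are skipped by this approach.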
