This question is related to "RegEx: Grabbing values between quotation marks", which I tried to adapt to my actual code without success.
What I'd like to accomplish is to parse PHP code and grab the literal double-quoted strings inside it.
Solutions using token_get_all() are not an option, as the PHP code may not parse correctly (invalid, broken, old PHP 4 code).
The regular expression should:
As an example of what the regexp should match, consider these pieces of (ugly, old and insecure) PHP code:
header("Last-Modified: ".gmdate("D, d M Y H:i:s")." GMT");
$sql = "UPDATE $table_name SET
password = password('$newpass'), pchange = '1'
WHERE email = '$email'";
$var = '"' . $something . '"';
$msg = "<p><a href=\"login.html\">Login</a></p>";
echo "<label for=\"whatever\">LABEL</label><select class='".$style."'>";
The regular expression should match:
"Last-Modified: "
"D, d M Y H:i:s"
" GMT"
"UPDATE $table_name SET
password = password('$newpass'), pchange = '1'
WHERE email = '$email'"
"<p><a href=\"login.html\">Login</a></p>"
"<label for=\"whatever\">LABEL</label><select class='"
"'>"
The regexp will be used in a preg_match() call with PREG_OFFSET_CAPTURE, so that the search can restart where the last match occurred, like this:
$string_match = preg_match(**REGEXP_HERE**, $php_code, $text_in_double_quotes, PREG_OFFSET_CAPTURE, $last_pos);
if ($string_match) {
list($text_in_double_quotes, $last_pos) = $text_in_double_quotes[0];
}
Thank you!
PS: For those asking why I'm bothering with this: the goal is to match unquoted array accesses inside these literal double-quoted strings and have them corrected.
For example (don't use this code, it has severe security flaws):
$sql = "SELECT * FROM table1 WHERE userid = '$_SESSION[id]'";
$sql2 = "SELECT * FROM table2 WHERE userid = '$array[key]' AND id = ".$other_array[whatever];
will be transformed into:
$sql = "SELECT * FROM table1 WHERE userid = '" . $_SESSION['id'] . "'";
$sql2 = "SELECT * FROM table2 WHERE userid = '" . $array['key'] . "' AND id = " . $other_array['whatever'];
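You could use the verbs (*SKIP)(*F) to exclude single-quoted substrings. The first alternative consumes a whole single-quoted string and then fails on purpose, and (*SKIP) prevents the engine from retrying inside the text it just consumed; the second alternative matches the double-quoted strings you want. The pattern below writes (*F) as (?!), which is equivalent.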
$regex = '/\'[^\'\\\]*(?:\\\.[^\'\\\]*)*\'(*SKIP)(?!)|"[^"\\\]*(?:\\\.[^"\\\]*)*"/';
See this demo at regex101. The underlying pattern is from this answer.
To extract multiple items, use this regex with preg_match_all like this:
if (preg_match_all($regex, $str, $out) > 0) {
    print_r($out[0]);
}
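As a rough check, here is a minimal sketch that runs the pattern against three of the sample lines from the question. The variable name $php_code and the nowdoc wrapper are only there for illustration; the nowdoc keeps the sample source exactly as written, backslashes included.

```php
<?php
// Pattern from above: single-quoted strings are consumed and discarded
// via (*SKIP)(?!); double-quoted strings are the actual matches.
$regex = '/\'[^\'\\\]*(?:\\\.[^\'\\\]*)*\'(*SKIP)(?!)|"[^"\\\]*(?:\\\.[^"\\\]*)*"/';

// Nowdoc: no interpolation and no escape processing, so \" stays \".
$php_code = <<<'PHP'
header("Last-Modified: ".gmdate("D, d M Y H:i:s")." GMT");
$var = '"' . $something . '"';
$msg = "<p><a href=\"login.html\">Login</a></p>";
PHP;

preg_match_all($regex, $php_code, $out);

// $out[0] holds the full matches, one per double-quoted literal.
foreach ($out[0] as $literal) {
    echo $literal, PHP_EOL;
}
```

This should print the three literals from the header() line followed by the $msg literal, while the two '"' strings on the $var line are skipped.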
Here is a PHP demo at tio.run; the matches will be in $out[0] (the full pattern match).
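If you prefer to keep the preg_match() loop with PREG_OFFSET_CAPTURE from the question, something along these lines might work. This is only a sketch: $found and $m are names made up for the example, and it assumes the search should resume at the end of each match, since resuming at the match's start offset would find the same match again.

```php
<?php
$regex = '/\'[^\'\\\]*(?:\\\.[^\'\\\]*)*\'(*SKIP)(?!)|"[^"\\\]*(?:\\\.[^"\\\]*)*"/';

// Sample lines from the question, kept verbatim in a nowdoc.
$php_code = <<<'PHP'
$var = '"' . $something . '"';
echo "<label for=\"whatever\">LABEL</label><select class='".$style."'>";
PHP;

$found = [];     // hypothetical collector: [matched text, byte offset] pairs
$last_pos = 0;

while (preg_match($regex, $php_code, $m, PREG_OFFSET_CAPTURE, $last_pos) === 1) {
    // With PREG_OFFSET_CAPTURE, $m[0] is [matched text, byte offset].
    list($text, $start) = $m[0];
    $found[] = [$text, $start];

    // Resume right after this match. Offsets and strlen() are both in bytes.
    $last_pos = $start + strlen($text);
}

foreach ($found as list($text, $start)) {
    echo $start, ': ', $text, PHP_EOL;
}
```

Because every search resumes just past a complete double-quoted literal, the offset never lands inside a string, so the single-quote skipping keeps working from one call to the next.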