OK, I need to scan many HTML / XHTML documents to see whether a particular file has been embedded with SWFObject. If so, I need to replace the call with something else.
So far I have extracted the contents of the <script> elements where the calls can be made. Now I need to scan each string to check whether the call is there and, if it is, replace it.
I know this is a bit odd, but the content comes from a third party that we have no control over.
Since the call can be written in many different ways, I will need a regular expression to find and replace the calls.
OK, imagine the following scenario:
I'm checking whether the file test.swf is embedded with SWFObject in the document.
The <script> content looks like this:
alert('test.swf');
//some other random stuff here
swfobject.embedSWF("test.swf",
"The alternative content can screw the regexp with );", "300", "120",
"9.0.0", false, flashvars, params, attributes);
Now I would like to replace swfobject.embedSWF (and all its parameters) with something else.
Is there a not-too-horrible way to do this? Don't forget that the call can be on one or many lines, that the parameters can be wrapped in single quotes (') or double quotes ("), and that whitespace can appear anywhere...
EDIT: OK, since catching every kind of JS syntax is a bit overkill, I will simplify the requirement.
The regular expression can assume only the following, in this order:
swfobject.embedSWF (case sensitive)
(
" or ' (either one, but one of the two is required)
" or ' (ideally the same character as the opening quote; if that can't be ensured, too bad)
,
)
then any whitespace (or none), then ;
then an end of line.
It should be much simpler to parse this way (I guess).
EDIT 2: I've cooked up a solution. I think I'm close, but it's not working. Can anyone help? Test case 0 should match, but it doesn't...
<?php
$myFilename = 'test.swf';
$testCases = array();
$testCases[] = 'swfobject.embedSWF("test.swf", "The alternative content can screw the regexp with );", "300", "120", "9.0.0", false, flashvars, params, attributes);';
foreach ($testCases as $i => $currTest)
{
$currResult = preg_match('/\s*swfobject\.embedSWF\s*\(\s*(["\'])(' . preg_quote($myFilename) . ')[^"\']+\1\s*,[\s\S]+?\)\s*;\s*$/', $currTest);
if ($currResult === false || $currResult < 1)
echo $i, ' Not matching', PHP_EOL;
else
echo $i, ' Matching', PHP_EOL;
}
?>
Well, somebody had the time to write a basic JavaScript parser in PHP. I'd give the tokenizer a try (possibly using an HTML parser to first find the <script> nodes).
Use 'grep' or similar on the command line to get a list of files that contain the .swf/script/object strings you need. That'll whittle down the number of files you need to process.
Then, use a PHP script to slurp each of those files into the DOM parser of your choice and do the replacing/fixing-up there.
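As a rough illustration of the "slurp into a DOM parser" step, here is a minimal sketch that assumes PHP's bundled DOM extension (DOMDocument) is available. The helper name extractInlineScripts and the sample markup are made up for the example, and the strpos check merely stands in for the grep pre-filter.

```php
<?php
// Hypothetical helper: returns the text of every inline <script> element in an HTML string.
function extractInlineScripts(string $html): array
{
    $dom = new DOMDocument();
    // Third-party markup is often invalid; collect libxml warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $scripts = [];
    foreach ($dom->getElementsByTagName('script') as $node) {
        // Skip external scripts (<script src="...">): they have no inline body to scan.
        if ($node->hasAttribute('src')) {
            continue;
        }
        $scripts[] = $node->textContent;
    }
    return $scripts;
}

// Made-up sample document, for illustration only.
$html = '<html><head><script src="swfobject.js"></script></head><body>'
    . '<script>swfobject.embedSWF("test.swf", "alt", "300", "120", "9.0.0");</script>'
    . '</body></html>';

foreach (extractInlineScripts($html) as $code) {
    // Cheap pre-filter, analogous to the grep step: only keep scripts that name the file.
    if (strpos($code, 'swfobject.embedSWF') !== false && strpos($code, 'test.swf') !== false) {
        echo 'Candidate script: ', $code, PHP_EOL;
    }
}
```

Each candidate string can then be handed to whatever find-and-replace step you settle on; writing the modified script back into the document is left out here.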
Regarding your EDIT 2:
I'm not the best with regular expressions, but you can try:
$currResult = preg_match('/\s*swfobject\.embedSWF\s*\(\s*(["\'])(' . preg_quote($myFilename) . ')\1\s*,[\s\S]+?\)\s*;\s*$/', $currTest);
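It seems to work OK for me. The only change is that the [^"\']+ after the file name is gone: it required at least one extra non-quote character between the file name and the closing quote, so "test.swf" followed directly by its closing quote could never match.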
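To go from matching to actually replacing, something along these lines might work. It is only a sketch under the simplified rules from the question's EDIT, not a tested solution for every syntax. The function name replaceEmbedCall and the replacement text myPlayer.embed(...) are invented for the example. Compared with the pattern above, it adds the m modifier so that $ can match at the end of any line, passes the delimiter to preg_quote, and makes the trailing whitespace match lazy.

```php
<?php
// Hypothetical helper: replaces a simplified swfobject.embedSWF(...) call for one file.
function replaceEmbedCall(string $script, string $filename, string $replacement): string
{
    // Opening quote is captured as \1 and must be closed by the same character.
    // [\s\S]+? lazily skips the remaining arguments, up to a ")" + ";" at the end of a line.
    $pattern = '/swfobject\.embedSWF\s*\(\s*(["\'])'
        . preg_quote($filename, '/')
        . '\1\s*,[\s\S]+?\)\s*;\s*?$/m';

    // A callback is used so that "$" or "\" in $replacement is never read as a backreference.
    $result = preg_replace_callback($pattern, function (array $match) use ($replacement) {
        return $replacement;
    }, $script);

    // preg_replace_callback returns null on a PCRE error; fall back to the untouched input.
    return $result ?? $script;
}

// The script content from the question.
$script = <<<'JS'
alert('test.swf');
//some other random stuff here
swfobject.embedSWF("test.swf",
"The alternative content can screw the regexp with );", "300", "120",
"9.0.0", false, flashvars, params, attributes);
JS;

$result = replaceEmbedCall($script, 'test.swf', 'myPlayer.embed("test.swf");');
echo $result, PHP_EOL;
```

With the sample from the question, the alert line and the comment are left alone and only the three-line embedSWF call is swapped out. The ); inside the alternative-content string does not end the match because it is not followed by an end of line. A string argument that happens to end a line with ); would still fool the pattern, which is the trade-off the simplified requirement accepts.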