
DOM Scrape not working PHP

I was just wondering why this isn't working for me. What I want to do is pull out the URL of the m4v file. I have a similar script working for images on my site that grabs the image, uploads it to a directory, stores it in the database and links to it, but I can't get this to work the same way. Thanks for your help.

<?php

include('simple_html_dom.php');

$html = file_get_html("http://www.mysitesvids.com/m/videos/view/36821");
$element = $html->find("file:");
$result = $element->innertext;

?>

This is the code from the site:

<script type="text/javascript" language="javascript">
jwplayer ('embedFlashPlayer').setup         ({flashplayer:'/swf/jwplayer5.swf',id:'moviePlayer',width:602,height:404,
    file:'http://davesvideos.mysitevids.com/media/b0e9ec18eb567ce41dce906cee7e1c9f/4fcbb164/videos/m/634276.m4v',
image:'/media/80eb2eaca3c58f002be8ab5bda476e91/4fcbb164/videos/p/64/634276.jpg',
provider:'http',controlbar:'bottom',stretching:'uniform',abouttext:'mysite',aboutlink:'http://www.eroprofile.com/'});

glbUpdViews ('0','634276','0','0');
ajaxActive = false;
cmtLoad ('video', '634276', '', '');
ajaxActive = false;
cmtReply ('video', '634276', '0');


</script>

According to the Simple HTML DOM docs, find() only matches HTML elements, so you cannot search for 'file:' with find(). You can probably do this instead:

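// find() without an index returns an array of elements, so pick one by index.
// Index 0 is the first <script> on the page; adjust it to reach the player script.
$script = $html->find('script', 0)->innertext;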

and then apply a regular expression to $script to match the *.m4v file.

Alternatively, you can apply the regex directly to the raw contents of the page.
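As a rough sketch of the first suggestion, the snippet below loops over a list of script bodies and returns the first .m4v URL it finds. The helper name findM4vUrl() is made up for illustration, and the sample array stands in for what you would collect from Simple HTML DOM (for example by looping over $html->find('script') and reading each element's innertext), so treat it as a starting point rather than a drop-in fix.

```php
<?php
// Hypothetical helper (not part of Simple HTML DOM): returns the first
// .m4v URL found after a "file:" key in any of the script bodies, or null.
function findM4vUrl(array $scriptBodies): ?string
{
    foreach ($scriptBodies as $body) {
        // [^']+ keeps the match inside a single quoted value.
        if (preg_match("/file:\s*'([^']+\.m4v)'/", $body, $matches) === 1) {
            return $matches[1];
        }
    }
    return null;
}

// Sample data copied from the page in the question. With Simple HTML DOM
// you would fill this array from the <script> elements instead, e.g.
//   foreach ($html->find('script') as $s) { $scriptBodies[] = $s->innertext; }
$playerScript = <<<'JS'
jwplayer ('embedFlashPlayer').setup ({flashplayer:'/swf/jwplayer5.swf',id:'moviePlayer',width:602,height:404,
    file:'http://davesvideos.mysitevids.com/media/b0e9ec18eb567ce41dce906cee7e1c9f/4fcbb164/videos/m/634276.m4v',
image:'/media/80eb2eaca3c58f002be8ab5bda476e91/4fcbb164/videos/p/64/634276.jpg'});
JS;

$scriptBodies = ['var unrelated = 1;', $playerScript];

echo findM4vUrl($scriptBodies), "\n";
```

Checking every script block avoids having to guess which index the player script sits at.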

This would be more easily solved with a regular expression:

preg_match( "/file:'(.+?)'/", $html, $matches );

if ( $matches ) {
    echo $matches[1];
}

I'm assuming you don't have other instances of this string pattern on the page. If you did, and you only wanted to match the m4v files, you could modify the expression to look for that extension:

preg_match( "/file:'(.+?\.m4v)'/", $html, $matches );

if ( $matches ) {
    echo $matches[1];
}
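If the page might contain several file:'...' entries, one possible approach is to collect them all with preg_match_all() and filter by extension afterwards. This is only a sketch: extractFileUrls() and filterByExtension() are hypothetical helper names, the second player entry in the sample is invented for the demo, and the page source is assumed to be a plain string (for instance from file_get_contents()).

```php
<?php
// Hypothetical helper: collects every file:'...' value in the page source.
function extractFileUrls(string $pageSource): array
{
    preg_match_all("/file:'(.+?)'/", $pageSource, $matches);
    return $matches[1]; // empty array when nothing matched
}

// Hypothetical helper: keeps only URLs whose path ends in the given extension.
function filterByExtension(array $urls, string $extension): array
{
    $keep = function (string $url) use ($extension): bool {
        $path = (string) parse_url($url, PHP_URL_PATH);
        return strtolower(pathinfo($path, PATHINFO_EXTENSION)) === strtolower($extension);
    };
    return array_values(array_filter($urls, $keep));
}

// Sample input: the first entry comes from the question, the second
// ('/media/preview.flv') is made up to show the filtering.
$pageSource = <<<'HTML'
jwplayer ('embedFlashPlayer').setup ({
    file:'http://davesvideos.mysitevids.com/media/b0e9ec18eb567ce41dce906cee7e1c9f/4fcbb164/videos/m/634276.m4v',
image:'/media/80eb2eaca3c58f002be8ab5bda476e91/4fcbb164/videos/p/64/634276.jpg'});
jwplayer ('other').setup ({file:'/media/preview.flv'});
HTML;

$m4vUrls = filterByExtension(extractFileUrls($pageSource), 'm4v');
print_r($m4vUrls);
```

Filtering after the match keeps the regex simple and makes it easy to switch to another extension later.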
