DOM scrape not working in PHP

I was just wondering why this isn't working for me. What I want to do is extract the URL of the .m4v file from the page. I have a similar script working for images on my site: it grabs the image, uploads it to a directory, records it in the database and links to it. But I can't get this to work the same way. Thanks for your help.

<?php

include('simple_html_dom.php');

$html = file_get_html("http://www.mysitesvids.com/m/videos/view/36821");
$element = $html->find("file:");
$result = $element->innertext;

?>

This is the code from the site:

<script type="text/javascript" language="javascript">
jwplayer ('embedFlashPlayer').setup         ({flashplayer:'/swf/jwplayer5.swf',id:'moviePlayer',width:602,height:404,
    file:'http://davesvideos.mysitevids.com/media/b0e9ec18eb567ce41dce906cee7e1c9f/4fcbb164/videos/m/634276.m4v',
image:'/media/80eb2eaca3c58f002be8ab5bda476e91/4fcbb164/videos/p/64/634276.jpg',
provider:'http',controlbar:'bottom',stretching:'uniform',abouttext:'mysite',aboutlink:'http://www.eroprofile.com/'});

glbUpdViews ('0','634276','0','0');
ajaxActive = false;
cmtLoad ('video', '634276', '', '');
ajaxActive = false;
cmtReply ('video', '634276', '0');


</script>

According to the SimpleHtmlDom docs, find() matches only HTML elements, so you cannot search for 'file:' with find(). You can probably do this instead:

// find() without an index returns an array, so pass an index (0 = first <script> on the page)
$script = $html->find('script', 0)->innertext;

and then apply a regular expression to $script to match the *.m4v file.

Alternatively, you can apply the regex directly to the contents of the page.
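The approach above (look inside the script elements, then run a regex on their text) can be sketched roughly as follows. This is only an illustration under some assumptions: it uses PHP's built-in DOMDocument rather than SimpleHtmlDom so that it runs without the extra library, the function name findPlayerFile is made up for the example, and the sample markup is a shortened, hypothetical version of the player snippet from the question.

```php
<?php
// Sketch: scan every <script> body for a file:'...' entry.
// DOMDocument is used here in place of SimpleHtmlDom (a substitution for illustration).

function findPlayerFile(string $pageSource): ?string
{
    $doc = new DOMDocument();
    // Real pages are rarely valid HTML, so keep parser warnings out of the output.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($pageSource);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('script') as $script) {
        // Capture whatever sits between the quotes after "file:".
        if (preg_match("/file:\s*'([^']+)'/", $script->textContent, $m)) {
            return $m[1];
        }
    }
    return null; // no <script> contained a file:'...' entry
}

// Hypothetical sample page, shortened from the snippet in the question.
$sample = <<<'HTML'
<html><body>
<script type="text/javascript">var unrelated = 1;</script>
<script type="text/javascript">
jwplayer('embedFlashPlayer').setup({id:'moviePlayer',
    file:'http://example.com/videos/m/634276.m4v',
    image:'/videos/p/64/634276.jpg'});
</script>
</body></html>
HTML;

echo findPlayerFile($sample), PHP_EOL;
```

Looping over all script elements avoids having to guess which one holds the player setup, which matters because a page like this usually contains more than one.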

This would be more easily solved with a regular expression applied to the page source (assuming $html holds the page as a string, for example from file_get_contents()):

preg_match( "/file:'(.+?)'/", $html, $matches );

if ( $matches ) {
    echo $matches[1];
}

I'm assuming you don't have other instances of this string pattern on the page. If you do, and you only want to match the .m4v files, you could modify the expression to look for that extension:

preg_match( "/file:'(.+?\.m4v)'/", $html, $matches );

if ( $matches ) {
    echo $matches[1];
}
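If the page might contain several file:'...' entries, one possible variation is to collect all of them with preg_match_all() and filter by extension. The sketch below assumes the page source is already available as a string; the function name extractFileUrls and the sample URLs are hypothetical and only serve the illustration.

```php
<?php
// Sketch: return every file:'...' value that ends in the given extension.

function extractFileUrls(string $source, string $extension = 'm4v'): array
{
    // preg_quote() keeps the extension from being read as regex syntax.
    $pattern = "/file:\s*'([^']+\." . preg_quote($extension, '/') . ")'/i";
    preg_match_all($pattern, $source, $matches);
    return $matches[1]; // captured URLs; an empty array when nothing matched
}

// Hypothetical page fragment with two different media entries.
$sample = <<<'JS'
setup({file:'http://example.com/videos/m/634276.m4v', image:'/p/634276.jpg'});
setup({file:'http://example.com/videos/m/trailer.flv', image:'/p/trailer.jpg'});
JS;

foreach (extractFileUrls($sample) as $url) {
    echo $url, PHP_EOL;
}
```

Passing a different extension as the second argument, for example 'flv', selects the other entry instead.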
