How can I parse what I want from this HTML code?

I would like to extract the attributes "href" and "title" from this HTML code:

<span class='ipsType_break ipsContained'>
  <a 
    href='https://www.xxxx/topic/11604/' 
    class='' title='[2002] Le Boulet [xxxx] '
    data-ipsHover data-ipsHover-target='https://www.xxxxx/topic/1160'
    data-ipsHover-timeout='1.5'>
<span>[2002] Le Boulet [xxxxx]</span>
  </a>
</span>

I tried some code in PHP, but it's not working. For example:

$e = $html->find('span[class=ipsType_break ipsContained]');
$value = $e->title;
print_r($value);

You can do this with a single call to find, but the selector has to target the anchor a, because the span here does not have a title attribute.

Without an index, find returns an array of matches, so pass 0 as the second argument to get the first matching element directly.

$e = $html->find('span[class=ipsType_break ipsContained] a', 0);
echo $e->href . PHP_EOL;
echo $e->title;

Output:

https://www.xxxx/topic/11604/
[2002] Le Boulet [xxxx]

If you want to find an HTML element by its classes, you can use dot notation as in CSS:

$e = $html->find('span.ipsType_break.ipsContained');

You actually want those attributes from the a element inside the span. You are correctly finding the span tags, but the statement you use returns an array of elements, and you need only the first one. Then search the children of that span to extract its first a child (again, because find returns an array of elements even if only one element matches your selector):

$a = $html->find('span[class=ipsType_break ipsContained]', 0)->find('a', 0);
print_r (['title' => $a->title, 'href' => $a->href]);

The output is:

 Array ( [title] => [2002] Le Boulet [xxxx] [href] => https://www.xxxx/topic/11604/ )
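
If the parsing library used above is not available, roughly the same lookup can be sketched with PHP's built-in DOMDocument and DOMXPath classes. This is a different technique from the find calls shown above, not a drop-in replacement: the variable names ($markup, $query, $links) are made up for illustration, and the XPath expression is one assumed way to express "a span carrying both classes, then any a inside it".

```php
<?php
// Sample markup copied from the question (placeholder hosts kept as-is).
$markup = <<<'HTML'
<span class='ipsType_break ipsContained'>
  <a href='https://www.xxxx/topic/11604/'
     class='' title='[2002] Le Boulet [xxxx] '
     data-ipsHover data-ipsHover-target='https://www.xxxxx/topic/1160'
     data-ipsHover-timeout='1.5'>
    <span>[2002] Le Boulet [xxxxx]</span>
  </a>
</span>
HTML;

// Parse the fragment; parser warnings about real-world HTML are collected
// internally instead of being printed.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($markup);
libxml_clear_errors();

// Match a span whose class list contains both class names, then every
// anchor below it. The concat/normalize-space idiom avoids matching
// class names that merely contain these strings.
$xpath = new DOMXPath($doc);
$query = "//span[contains(concat(' ', normalize-space(@class), ' '), ' ipsType_break ')]"
       . "[contains(concat(' ', normalize-space(@class), ' '), ' ipsContained ')]//a";

// Collect href and title for each matching anchor.
$links = [];
foreach ($xpath->query($query) as $a) {
    $links[] = [
        'href'  => $a->getAttribute('href'),
        // The title attribute in the sample has a trailing space.
        'title' => trim($a->getAttribute('title')),
    ];
}

print_r($links);
```

Looping over every match, rather than taking only the first element, also covers pages where the same span and anchor pattern repeats once per topic.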
