Extracting DOM HTML with JavaScript

Question

I'm trying to extract information from an HTML page. Specifically, I'm trying to get the "src" URL marked in bold (**) below:

<div class="row d-block align-end" data-v-1a4e2f4c="">
<div index="0" class="row" data-v-744a5232="" data-v-1a4e2f4c="">
    <div class="col-lg-2 col-3" data-v-744a5232=""><a
            href="/libro-el-peligro-de-estar-cuerda/9788432240645/12789134" title="EL PELIGRO DE ESTAR CUERDA"
            class="py-2" data-v-744a5232=""><img title="el peligro de estar cuerda-9788432240645"
                alt="el peligro de estar cuerda-9788432240645"
                **src="https://imagessl5.casadellibro.com/a/l/t1/45/9788432240645.jpg"**
                data-src="https://imagessl5.casadellibro.com/a/l/t1/45/9788432240645.jpg" width="" height=""
                class="show-shadow cdl-img active" style="max-height:undefinedpx;max-width:undefinedpx;"
                data-v-744a5232=""></a></div>

The code I'm using does not seem to do the trick, although it works when extracting the title:

let resultImg = xmlDoc.evaluate('./div/div/a/img[@src]', node, null, XPathResult.FIRST_ORDERED_NODE_TYPE);
let bookImgSrc = resultImg.singleNodeValue.src;
imgCard.src = bookImgSrc.replace('mtiny', 'large');
divCard.appendChild(imgCard);
console.log(imgCard);

Could someone point out what is wrong in the code and how to get the src URL?

I suspect the XPathResult may be wrong.

Answer 1

You can select the element and then read its src property with the following code:

var myElementSrc = mySelectedElement.src;

It is that simple.
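
One caveat worth sketching: the src property is only defined on HTML image elements. If the node comes from a document parsed as XML (the variable name xmlDoc in the question hints at this, but that is an assumption), the property may be undefined, and reading the attribute directly is the safer route. The helper name readImageSrc below is hypothetical, not part of any library.

```javascript
// Hypothetical helper: read an image URL from an element-like node.
// Prefers the raw "src" attribute, which also works on nodes from XML-parsed
// documents, then the src property, then "data-src" (present in the question's markup).
function readImageSrc(node) {
  if (!node) return null; // nothing was selected
  const hasGetAttribute = typeof node.getAttribute === 'function';
  const attr = hasGetAttribute ? node.getAttribute('src') : null;
  if (attr) return attr;
  if (typeof node.src === 'string' && node.src) return node.src;
  return hasGetAttribute ? node.getAttribute('data-src') : null;
}
```

In a browser this could be called as readImageSrc(document.querySelector('img.cdl-img')), using the class name from the markup in the question.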

Answer 2

Your XPath is probably not correct. I tried "//div/div/a/img" and it works.
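
A likely reason for the difference: "./div/div/a/img" matches only when the context node is the direct parent of the outer div, whereas "//" searches from the document root. A middle ground is ".//", which searches every descendant of the context node. The sketch below assumes that doc is the document owning the context node (for example the question's xmlDoc); the helper name findFirstImage is hypothetical.

```javascript
// Numeric value of XPathResult.FIRST_ORDERED_NODE_TYPE, spelled out so the
// sketch does not depend on the XPathResult global being in scope.
const FIRST_ORDERED_NODE_TYPE = 9;

// Hypothetical helper: return the first <img> carrying a src attribute
// anywhere below contextNode, or null when nothing matches.
function findFirstImage(doc, contextNode) {
  // './/' descends from the context node, so the exact nesting depth
  // of the surrounding <div> elements no longer matters.
  const result = doc.evaluate(
    './/a/img[@src]',
    contextNode,
    null, // no namespace resolver
    FIRST_ORDERED_NODE_TYPE,
    null // no existing result object to reuse
  );
  return result.singleNodeValue;
}
```

With the question's variables, this might be used as: const img = findFirstImage(xmlDoc, node); followed by a null check before reading img.getAttribute('src').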
