Loop through divs for nested divs with href inside

<div class="view-content">
    <div class="views-row views-row-1">   
        <div class="views-field">
            <span class="field-content">
                <a href="link1">Name for link1
                    <img src="image1">
                </a>
            </span>
        </div>
        <div class="views-field-title">
            <span class="field-content">
                <a href="link1">
                </a>
            </span>
        </div>
    </div>
    <div class="views-row views-row-2">
        <div class="views-field">
          <span class="field-content">
              <a href="link2">Name for Link2
                  <img src="image2">
              </a>
          </span>
        </div>
        <div class="views-field-title">
            <span class="field-content">
                <a href="link2">
                </a>
            </span>
      </div>
    </div>

I am using Node with request and cheerio to fetch the page and scrape it.

I want the href values from link1 and link2. I got it working for one link, but it does not scale when I try to loop over them.

const data = {
  link: "div.views-field > span > a",
};
const pageData = {};
Object.keys(data).forEach(k => {
  pageData[k] = $(data[k]).attr("href");
});

console.log(pageData);

Your approach with $(data[k]).attr("href") is the right idea, but there's no loop here. Two elements match this selector, but .attr() reads the attribute from only the first matched element, so your code grabs just the first link.

Changing this to [...$(data[k])].map(e => $(e).attr("href")) lets you get the href attributes from all matching elements.

I'm not crazy about pageData being global, or about using forEach when map seems more appropriate, so here's my suggestion:

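const cheerio = require("cheerio");

// `html` is the response body you fetched with request.
const $ = cheerio.load(html);
const data = {
  link: "div.views-field > span > a",
};
// Map each key to an array of hrefs from every element matching its selector.
const pageData = Object.fromEntries(
  Object.entries(data).map(([k, v]) =>
    [k, [...$(v)].map(e => $(e).attr("href"))]
  )
);
console.log(pageData);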
