
Loop through Microdata to extract itemprop and text value

I'm trying to loop over an HTML+Microdata page to get the product info marked up with Schema.org. The HTML can have children nested to an unknown depth. How would I loop over children of unknown depth, or is it best to use find?

So I want to grab all the schema data and put it in an array:

  <span itemprop="name">Product Name</span>

So the above would be saved to an array as [name: "Product Name"].

  function productData(elem) {
    // Get the children
    console.log("elem 1", elem)
    console.log("elem 2", elem[0])

    if (elem[0]) {
      if (elem[0].hasChildNodes()) {
        elem[0].childNodes.forEach(function (item) {
          console.log("item", item)
          console.log("item childNodes", item.childNodes)
          return productData(item);
        });
      }
    }
  }


  // Get All Products on the page
  const product = document.querySelectorAll('[itemtype="http://schema.org/Product"]');

  productData(product)

While this question is missing some detail, one powerful tool for traversing unknown levels of a tree-like structure is recursion:

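// Takes a single node, not a NodeList: call it as processData(product[0]),
// or product.forEach(processData) to handle every match.
function processData(node) {
  // process this node

  // Recurse into each child; a node with no children ends the recursion.
  node.childNodes.forEach(function (child) {
    processData(child);
  });
}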

Through repeated function calls on each child, you will eventually process all of them.
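
As a minimal sketch of that idea applied to the question, the function below walks an element's children recursively and records the text of every element that carries an itemprop attribute. The name collectProps and the flat result object are assumptions made for illustration. It relies only on the standard Element members getAttribute, children and textContent, and it deliberately ignores nested itemscope items and repeated properties (a later value overwrites an earlier one).

```javascript
// Hypothetical helper: recursively collect itemprop -> text pairs.
function collectProps(elem, result = {}) {
  const prop = elem.getAttribute('itemprop');
  if (prop !== null) {
    result[prop] = elem.textContent.trim();
  }
  // `children` holds element nodes only, so text and comment nodes are skipped.
  for (const child of Array.from(elem.children)) {
    collectProps(child, result);
  }
  return result;
}
```

In a browser it could be called on a single product element, for example collectProps(document.querySelector('[itemtype="http://schema.org/Product"]')). If that element contains the span from the question, the result would include name: "Product Name".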

If you want your own Microdata parser, you can start from something like this. Of course, you need to elaborate on it a lot. For example, some properties are arrays, and so on.

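function getItem(elem) {
  var item = {
    '@type': elem.getAttribute('itemtype')
  };
  elem.querySelectorAll('[itemprop]')
    .forEach(function(el) {
      // skip properties that belong to a nested item rather than to elem
      // (assumes elem itself carries the itemscope attribute)
      if (el.parentElement.closest('[itemscope]') !== elem) return;
      var prop = el.getAttribute('itemprop');
      //special cases
      if (el.hasAttribute('itemscope'))
        // concat works whether item[prop] is a single item or already an array
        item[prop] = item[prop] ? [].concat(item[prop], getItem(el)) : getItem(el); //recursion here
      else if (prop == 'url')
        item[prop] = el.getAttribute('href');
      else if (prop == 'image')
        item[prop] = el.getAttribute('src');
      else
        item[prop] = el.innerText;
    });
  return item;
}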
var products = [];

document.querySelectorAll('[itemtype*="http://schema.org/Product"]') //*= for multiple types
  .forEach(function(prod) {
    products.push(getItem(prod));
  });
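
The caveats mentioned above (repeated properties becoming arrays, and values living in attributes rather than in text) could be handled by two small helpers. This is a sketch and not part of the original answer: addProp, propertyValue and VALUE_ATTRIBUTE are hypothetical names, and the tag-to-attribute table is a simplified reading of the property value rules in the HTML Microdata specification (relative URLs are not resolved, and text is trimmed for convenience).

```javascript
// Hypothetical helper: store a value, switching to an array on repeats.
function addProp(item, name, value) {
  if (!(name in item)) {
    item[name] = value;
  } else if (Array.isArray(item[name])) {
    item[name].push(value);
  } else {
    item[name] = [item[name], value];
  }
  return item;
}

// Attribute that carries the property value for selected tags.
const VALUE_ATTRIBUTE = {
  META: 'content',
  A: 'href', AREA: 'href', LINK: 'href',
  IMG: 'src', AUDIO: 'src', VIDEO: 'src', SOURCE: 'src',
  EMBED: 'src', IFRAME: 'src', TRACK: 'src',
  OBJECT: 'data',
  DATA: 'value', METER: 'value'
};

// Hypothetical helper: pick the value by element type, not by property name.
function propertyValue(el) {
  const tag = el.tagName.toUpperCase();
  if (tag === 'TIME' && el.hasAttribute('datetime')) {
    return el.getAttribute('datetime');
  }
  const attr = VALUE_ATTRIBUTE[tag];
  if (attr) {
    // A missing attribute yields an empty string.
    return el.getAttribute(attr) || '';
  }
  return el.textContent.trim();
}
```

Inside getItem, the non-itemscope branches could then collapse to something like addProp(item, prop, propertyValue(el)), which would also cover markup such as a meta element holding a price in its content attribute.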
