
Extracting text from page td element with Cheerio

This Meteor server code uses Cheerio (with its jQuery-style API) to try to get the value "44 years" from the sixth td element of a web page that contains the following HTML.
It logs undefined. Any idea how to do it? Thanks.

<tr>
  <td class="label" style="white-space:nowrap">Nmae:</td>
  <td>&nbsp;</td>
  <td colspan="2" class="bodyText">male</td>
  <td colspan="2" class="label">Age:</td>
  <td class="bodyText" width="1%">&nbsp;</td>
  <td colspan="2" class="bodyText">44 years</td>  <--------------
</tr>

$('td[class=label]').each((i, elem) => { //<------ $ is cheerio object
  let str = elem.innerHTML;
  console.log(str);    //<---------- undefined
  if (str === '44 years') {
    console.log('found it');
    let age = elem.nextSibling.nextSibling.innerHTML;
    console.log(age);
    return false;
  }
});

If you want to retrieve the last column's value (i.e. "44 years") and check it, try this code. You can put your own logic inside the if block.

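// `table` is assumed to be a Cheerio selection of the table, e.g. $('table')
table.find('tr').each(function (i, elem) {
  var $tds = $(this).find('td');
  var str = $tds.eq(5).text(); // sixth cell (index 5), the last column in this row
  console.log(str);            //<-- last column value
  if (str === '44 years') {
    console.log('found it');
    // write your code here
  }
});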

This selector:

$('td[class=label]').each((i, elem) => {

is actually saying "loop over every td element whose class is label". In your HTML, the only cells it will visit are "Nmae:" and "Age:":

<td class="label" style="white-space:nowrap">Nmae:</td>
<td colspan="2" class="label">Age:</td>

So when this code runs:

let str = elem.innerHTML;
if (str === '44 years') {

The body of the if statement never runs, because the only cells being visited contain "Nmae:" and "Age:", never '44 years'.

Also, I noticed that the class attribute comes first on the first td but after the colspan attribute on the second one. That inconsistency can be confusing when you are writing your code.

So the solution is to change the selector so that it loops over every td element, like this:

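//Select all "td" within "tr"
// vvv 
$('tr td').each((i, elem) => { //<------ $ is cheerio object
  // elem is a parsed node, not a browser DOM element, so elem.innerHTML is
  // undefined; wrap the node with $() and read its text instead.
  let str = $(elem).text();
  console.log(str);    //<---------- text of each column
  if (str === '44 years') {
    console.log('found it');
    let age = elem.nextSibling.nextSibling.innerHTML; // still a problem, see below
    console.log(age);
    return false;
  }
});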

If you leave it like that, it will find the "44 years" cell, but it will also throw an error: that cell is the last td in the row, so there is no td after it for nextSibling.nextSibling to reach.

Since the matching cell is itself the value you want, just use it once it is found, like this:

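//Select all "td" within "tr"
// vvv 
$('tr td').each((i, elem) => { 
  // elem is a parsed node, not a browser DOM element, so read its text via $(elem)
  let str = $(elem).text();
  console.log(str);    //<---------- String for each column
  if (str === '44 years') {   
    console.log('found it');
    let age = str;     // the matching cell already holds the value
    console.log(age);
    return false;      // stop iterating
  }
});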

Hope it helps.

Leo.
