I have an HTML string in Node.js as follows:
var htmlText = `<div class="X7NTVe"> <a class="tHmfQe" href="/link1"> <div class="am3QBf"> <div> <span> <div class="BNeawe deIvCb AP7Wnd"> <span dir="rtl">My First Text</span> </div> </span> </div> </div> </a> <div class="HBTM6d XS7yGd"> <a href="/anotherLink1"> <div class="BNeawe mAdjQc uEec3 AP7Wnd">></div> </a> </div> </div> <div class="x54gtf"></div> <div class="X7NTVe"> <a class="tHmfQe" href="/link2"> <div class="am3QBf"> <div> <span> <div class="BNeawe deIvCb AP7Wnd"> <span dir="rtl">My Second Text</span> </div> </span> </div> </div> </a> <div class="HBTM6d XS7yGd"> <a href="/anotherLink2"> <div class="BNeawe mAdjQc uEec3 AP7Wnd">></div> </a> </div> </div> <div class="x54gtf"></div>`
Now I want to extract the text from it as an array. In the above example it must return My First Text and My Second Text. How can I do it?
Note: I want to do it in Node.js, not in browser JavaScript.
With cheerio:
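const cheerio = require('cheerio')
// Load the HTML string from the question, then collect the text of each matching span.
let $ = cheerio.load(htmlText)
let strings = $('div[class="BNeawe deIvCb AP7Wnd"]>span[dir]')
.get().map(span => $(span).text())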
1. Replace all tags with the regex /<[^>]*>/g.
2. Parse the HTML with jsdom and access the HTML nodes via the JS document API.
Method #2 is much more flexible.
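The regex option can be sketched without any dependency. This is only a rough illustration: `extractTexts` and `sample` are hypothetical names, the markup is a shortened version of the HTML in the question, and the extra filter for ">" is an assumption added because that character appears as the label of the second link in each block. Stripping tags with a regex is fragile on real-world HTML, which is one reason a real parser is usually the safer choice.

```javascript
// Hypothetical helper (not from the original answers): strip tags, keep text runs.
function extractTexts(html) {
  return html
    .replace(/<[^>]*>/g, '\n')  // turn every tag into a separator
    .split('\n')
    .map(s => s.trim())
    // Drop empty pieces and the stray ">" link labels (assumption for this markup).
    .filter(s => s.length > 0 && s !== '>');
}

// Shortened version of the markup from the question.
const sample = '<div class="X7NTVe"> <a class="tHmfQe" href="/link1"> ' +
  '<div class="BNeawe deIvCb AP7Wnd"> <span dir="rtl">My First Text</span> </div> </a> ' +
  '<div class="HBTM6d XS7yGd"> <a href="/anotherLink1"> ' +
  '<div class="BNeawe mAdjQc uEec3 AP7Wnd">></div> </a> </div> </div> ' +
  '<div class="X7NTVe"> <a class="tHmfQe" href="/link2"> ' +
  '<div class="BNeawe deIvCb AP7Wnd"> <span dir="rtl">My Second Text</span> </div> </a> </div>';

const texts = extractTexts(sample);
console.log(texts); // [ 'My First Text', 'My Second Text' ]
```

Note that the filter has to know about the ">" labels in advance; a selector-based approach picks only the wanted spans and needs no such special case.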