
How to scrape data from website with cheerio

I am trying to scrape the Ark: Survival Evolved wiki with no success. Nested elements and elements that share the same class name are throwing me off. https://ark.gamepedia.com/Pteranodon

I have tried searching the forums and cannot find an answer to my problem.

const $ = cheerio.load(html);
const $dossier = $('.info-framework');
const $domestication = $dossier.find('div:nth-child(4)');

I manage to grab the div that contains the content I need, but everything I try from there ends in undefined. Specifically, I am trying to grab the "tameable", "rideable" and "breedable" elements. If someone could point me in the right direction, or show me how to grab the data so that I can learn and hopefully grab the rest of the data I need, that would be great.

Here is an example you can build on:

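 // Cheerio nodes have no innerText property (that is a browser DOM feature), so read the text with $(el).text()
 const abilities = $('.info-unit').eq(6).find('.info-X3-33').map((i, el) => $(el).text().trim()).get()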

