
How to scrape a webpage using JavaScript?

I want to scrape the following parts of the DOM. First there is a drop-down list; after an option is selected, a second drop-down list appears. Once an option is selected there as well, a button is displayed which, when pressed, reveals a div with the data I need to scrape.

DOM:

<div class="input-field">
   <select name="Ingredients" id="Ingredients">
     <option value="" disabled selected>Select an Ingredient</option>
     <option value="01">Ingredient1</option>
     <option value="02">Ingredient2</option>
     <option value="03">Ingredient3</option>
   </select>
</div>

<div class="input-field">
   <select name="methods" id="methods">
     <option value="04" disabled selected>Select method</option>
     <option value="05">method1</option>
     <option value="06">method2</option>
     <option value="07">method3</option>
   </select>
</div>

<div class="col s4">
   <button style="display: none;" id="check">Display Recipe</button>
</div>

<div id="recipe">
  First recipe
  <br>
  Second recipe
  <br>
  Third recipe
  <br>
</div>

To fetch the HTML I use axios, and I use jsdom to manipulate the DOM, as follows:

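import axios from "axios";
import { JSDOM } from "jsdom";

// `url` is the address of the page being scraped
const { data } = await axios.get(url);

// Selectors for the ingredient option, the method option, and the button
const ingredientSelector = 'option[value^="01"]';
const methodSelector = 'option[value^="05"]';
const buttonSelector = 'button[id^="check"]';

const dom = new JSDOM(data, {
  runScripts: "dangerously",
  resources: "usable",
});

const { document } = dom.window;

// Try to pick an ingredient, then a method, then press the button
const input1 = document.querySelector(ingredientSelector);
input1.click();
console.log(input1);
const input2 = document.querySelector(methodSelector);
input2.click();
console.log(input2);
const button = document.querySelector(buttonSelector);
button.click();
console.log(button);
console.log(data);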

I know that retrieving the HTML works (I can display it), and I can also display the textContent of each element (2 inputs, 1 button). However, I can't figure out how to select each input and then display and scrape the recipe by clicking the button.

As @nlta mentioned, I had to check the Network tab in dev tools, where I found the right query. I then used it in a GET request, which retrieved the data I wanted.
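
For anyone with a similar page, here is a rough sketch of that approach. The endpoint (`https://example.com/recipes`) and the query parameter names (`ingredient`, `method`) are placeholders I made up for illustration; the real ones have to be copied from the request shown in the Network tab. It also assumes the response is an HTML fragment with recipes separated by `<br>`, like the `#recipe` div above, and that Node 18+ is used so the global `fetch` is available (`axios.get` should work just as well).

```javascript
// Hypothetical endpoint: replace with the request URL seen in the Network tab.
const BASE_URL = "https://example.com/recipes";

// Build the query URL from the two <select> values (e.g. "01" and "05").
// The parameter names "ingredient" and "method" are assumptions.
function buildRecipeUrl(baseUrl, ingredient, method) {
  const url = new URL(baseUrl);
  url.searchParams.set("ingredient", ingredient);
  url.searchParams.set("method", method);
  return url.toString();
}

// Split an HTML fragment such as "First recipe<br>Second recipe<br>"
// into an array of non-empty, trimmed lines.
function parseRecipes(html) {
  return html
    .split(/<br\s*\/?>/i)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Send the GET request directly instead of simulating clicks in jsdom.
async function fetchRecipes(ingredient, method) {
  const response = await fetch(buildRecipeUrl(BASE_URL, ingredient, method));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return parseRecipes(await response.text());
}
```

Calling `fetchRecipes("01", "05")` would then request the recipes for Ingredient1 and method1. If the real endpoint returns JSON rather than HTML, `parseRecipes` would be replaced by `response.json()`.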

