
How can I send a URL from the browser to a server to be scraped while using React?

To avoid CORS errors when scraping from inside the browser, I want to run the scraper on a server. How can I send a URL generated in the browser to the server, have the server scrape and organize the content, and then send the data back, preferably as an object?

Try the curl-request package ("cURL Request for Node.js") if you are running Node.js as your server.

const curl = new (require('curl-request'))();

curl.get('https://www.google.com')
  .then(({ statusCode, body, headers }) => {
    console.log(statusCode, body, headers);
  })
  .catch((e) => {
    console.log(e);
  });
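
The receiving side could look roughly like this. It is a minimal sketch, assuming Express for routing and cheerio for parsing the scraped HTML; neither is part of the original answer, and the /your-path route name is only illustrative so that it matches the client examples below.

const express = require('express');
const cheerio = require('cheerio');
const curl = new (require('curl-request'))();

const app = express();
app.use(express.json()); // parse JSON request bodies

app.post('/your-path', (req, res) => {
  const { url } = req.body; // URL generated in the browser
  curl.get(url)
    .then(({ statusCode, body }) => {
      const $ = cheerio.load(body); // parse the scraped HTML
      // send the organized data back to the browser as an object
      res.json({
        statusCode,
        title: $('title').text(),
      });
    })
    .catch((e) => res.status(500).json({ error: String(e) }));
});

app.listen(3001);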

As in any other JavaScript web page or web app, a common way to communicate with the server is AJAX. For instance, you can use Axios or the fetch() method of the Fetch API.

In React, Axios is one of the most widely used AJAX libraries. It is a standalone library built specifically for client-side HTTP requests. Example using Axios:

const axios = require('axios'); // or: import axios from 'axios';

axios.post('https://your-server.com/your-path', {
    url: 'https://url-input-by-user.com'
  })
  .then(function (response) {
    console.log(response);
    // handle the data returned by the server
  })
  .catch(function (error) {
    console.log(error);
    // error handling goes here
  });
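
Since the question is about React, here is a hedged sketch of wiring that request into a component. The ScrapeForm component, its state shape, and the endpoint URL are illustrative, not from the original answer.

import React, { useState } from 'react';
import axios from 'axios';

function ScrapeForm() {
  const [url, setUrl] = useState('');
  const [result, setResult] = useState(null);

  const handleSubmit = (e) => {
    e.preventDefault();
    axios.post('https://your-server.com/your-path', { url })
      .then((response) => setResult(response.data)) // the object sent back by the server
      .catch((error) => console.log(error));
  };

  return (
    <form onSubmit={handleSubmit}>
      <input value={url} onChange={(e) => setUrl(e.target.value)} />
      <button type="submit">Scrape</button>
      <pre>{result && JSON.stringify(result, null, 2)}</pre>
    </form>
  );
}

export default ScrapeForm;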

Example using the fetch() API (a polyfill is required for IE and older Edge):

fetch('https://your-server.com/your-path', {
    method: "POST", // *GET, POST, PUT, DELETE, etc.
    mode: "cors", // no-cors, cors, *same-origin
    cache: "no-cache", // *default, no-cache, reload, force-cache, only-if-cached
    credentials: "same-origin", // include, *same-origin, omit
    headers: {
        "Content-Type": "application/json",
    },
    redirect: "follow", // manual, *follow, error
    referrer: "no-referrer", // no-referrer, *client
    body: JSON.stringify({ url: 'https://url-input-by-user.com' }), // body data type must match the "Content-Type" header
    })
    .then(response => response.json()) // parses the response body as JSON
    .then(data => console.log(data)); // handle the data returned by the server



Using "browser push data":

If the scraping process takes a significant amount of time, a "push data" mechanism may be a better fit. AJAX requests tend to time out when the server takes too long to respond, whereas push mechanisms give you asynchronous communication between client and server. You can research WebSockets and Server-Sent Events (SSE) for this.
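
If you go the SSE route, a minimal sketch might look like the following. It assumes Express on the server; the /scrape-events route name and the progress payload are illustrative, not from the original answer.

// --- server (Node.js / Express) ---
const express = require('express');
const app = express();

app.get('/scrape-events', (req, res) => {
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');
  res.setHeader('Access-Control-Allow-Origin', '*'); // allow the browser app's origin

  // pretend the long-running scrape reports progress periodically
  let progress = 0;
  const timer = setInterval(() => {
    progress += 25;
    res.write(`data: ${JSON.stringify({ progress })}\n\n`); // SSE message format
    if (progress >= 100) {
      clearInterval(timer);
      res.end();
    }
  }, 1000);

  req.on('close', () => clearInterval(timer)); // stop if the browser disconnects
});

app.listen(3001);

// --- client (browser) ---
const source = new EventSource('https://your-server.com/scrape-events');
source.onmessage = (event) => {
  const data = JSON.parse(event.data);      // e.g. { progress: 50 }
  if (data.progress >= 100) source.close(); // scrape finished
};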
