I am trying to access the following Japanese site and scrape data from a table, but I am struggling to log in using Google Apps Script. I need a solution that does not rely on a desktop and can be done completely online. I am not very experienced with web development or web scraping, so I'm learning as I go.
I have the username and password, but:
2. The login page uses CORS and an AWS API to authenticate, so there are no cookies until I have successfully logged in and sent a GET request via the browser.
3. There are multiple tokens: an x-logview-token, which is returned in the response to the login POST request, and a page token that is generated for each page.
Response to the login POST request:
{"username":"user@gmail.com","token":"this-is-the-token-value","enableDigits":true}
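A minimal sketch of how the token in a response like this might be pulled out and reused in later requests. It assumes the API wants the token sent back in an `x-logview-token` request header, which I have not confirmed; `buildAuthHeaders`, `fetchWithToken`, and `dataUrl` are hypothetical names used only for illustration.

```javascript
// Parse the login response text and build headers for follow-up requests.
// ASSUMPTION: the token is sent back in an 'x-logview-token' request header.
function buildAuthHeaders(loginResponseText) {
  var data = JSON.parse(loginResponseText);
  if (!data.token) {
    throw new Error('Login response did not contain a token');
  }
  return { 'x-logview-token': data.token };
}

// Hypothetical usage inside Apps Script. dataUrl is a placeholder,
// not a known endpoint of this site.
function fetchWithToken(loginResponseText, dataUrl) {
  var options = {
    method: 'get',
    headers: buildAuthHeaders(loginResponseText),
    muteHttpExceptions: true // return error responses instead of throwing
  };
  return UrlFetchApp.fetch(dataUrl, options).getContentText('UTF-8');
}
```

If the site also needs the per-page token, that would have to be handled separately; this sketch covers only the login token.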
I am thinking of copying the cookies from the browser's GET request and sending them through Google Apps Script. Is there some way to bypass the login or use the cookies to log in?
<!DOCTYPE html>
<html lang=en>
<head>
<meta charset=utf-8>
<meta http-equiv=X-UA-Compatible content="IE=edge">
<meta name=viewport content="width=device-width,initial-scale=1">
<link rel=icon href=/favicon.ico>
<link rel=stylesheet href=//cdn.materialdesignicons.com/3.4.93/css/materialdesignicons.min.css>
<title>123ROBO 通話履歴</title>
<link href=/css/app.5339eed8.css rel=preload as=style>
<link href=/css/chunk-vendors.8b9ade74.css rel=preload as=style>
<link href=/js/app.32f2c21e.js rel=preload as=script>
<link href=/js/chunk-vendors.cd62bd72.js rel=preload as=script>
<link href=/css/chunk-vendors.8b9ade74.css rel=stylesheet>
<link href=/css/app.5339eed8.css rel=stylesheet>
</head>
<body><noscript><strong>We're sorry but logview doesn't work properly without JavaScript enabled. Please enable it to continue.</strong></noscript>
<div id=app></div>
<script src=/js/chunk-vendors.cd62bd72.js></script>
<script src=/js/app.32f2c21e.js></script>
</body>
</html>
Here is the website: https://calllog-dev.123robo.com/#/login
Here is the code I have been trying to use:
function loginTest() {
  // Login credentials
  var userID = 'user@gmail.com';
  var userPW = 'password';
  var url = 'https://dbp3xa4g5g.execute-api.us-west-2.amazonaws.com/dev/users/authenticate';
  // Added a body as pointed out by Mark. Added request headers as suggested by pguardiario.
  // UrlFetchApp takes the request body as `payload` (not `body`).
  const requestOptions = {
    method: 'post',
    contentType: 'application/json',
    // 'authority', 'path' and 'scheme' are HTTP/2 pseudo-headers shown by the
    // browser's dev tools, not real request headers, so they are left out.
    headers: {
      'accept': '*/*',
      'accept-encoding': 'gzip, deflate, br',
      'accept-language': 'en-US,en;q=0.9,ja;q=0.8',
      'origin': 'https://calllog-dev.123robo.com',
      'referer': 'https://calllog-dev.123robo.com/',
      'sec-fetch-dest': 'empty',
      'sec-fetch-mode': 'cors',
      'sec-fetch-site': 'cross-site',
      'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.125 Safari/537.36'
    },
    // The field names 'username' and 'password' are assumed.
    payload: JSON.stringify({ username: userID, password: userPW }),
    muteHttpExceptions: true // log error responses instead of throwing
  };
  var response = UrlFetchApp.fetch(url, requestOptions);
  Logger.log(response.getResponseCode());
  Logger.log(response.getContentText('UTF-8'));
}
You can't just send an Authorization header when it expects a body:
const requestOptions = {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ username, password })
}
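The snippet above is written in the style of the browser `fetch` API. In Google Apps Script, `UrlFetchApp.fetch` takes the request body as `payload` and the content type as `contentType`, so a rough equivalent might look like the sketch below. The field names `username` and `password` and the `token` field in the response are taken from the question and are not verified against the API; `buildLoginOptions` and `login` are hypothetical helper names.

```javascript
// Build UrlFetchApp options for the login POST.
// UrlFetchApp uses `payload` (not `body`) for the request body.
function buildLoginOptions(username, password) {
  return {
    method: 'post',
    contentType: 'application/json',
    // ASSUMPTION: the API expects these two field names.
    payload: JSON.stringify({ username: username, password: password }),
    muteHttpExceptions: true // inspect error responses instead of throwing
  };
}

// Send the login request and return the token from the JSON response.
// ASSUMPTION: a successful login answers with HTTP 200 and a `token` field.
function login(url, username, password) {
  var response = UrlFetchApp.fetch(url, buildLoginOptions(username, password));
  if (response.getResponseCode() !== 200) {
    throw new Error('Login failed with HTTP ' + response.getResponseCode());
  }
  return JSON.parse(response.getContentText('UTF-8')).token;
}
```

Keeping the option building in its own function makes it easy to log and inspect the exact request before sending it.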
And this page has one general DOM issue, along with some typos:
[DOM] Password field is not contained in a form: (More info: https://www.chromium.org/developers/design-documents/create-amazing-password-forms )
<input type="password" autocomplete="on" class="input">