While requesting multiple URL after getting couple of responses it starts giving me 403 error for the other urls.
i tried using user agents and proxy still problem exist. i also tried a delay of 0.5 sec .
im using - requests version = 2.22.0
Here is what (r.status_code, r.headers, r.text) looks like:
403 {'Allow': 'GET, POST, HEAD, PUT, PATCH, DELETE, OPTIONS', 'Content-Encoding': 'gzip', 'Content-Type': 'text/html; charset=UTF-8', 'Accept-Ranges': 'bytes, bytes, bytes, bytes', 'Content-Length': '1519', 'Date': 'Thu, 06 Feb 2020 10:34:40 GMT', 'Connection': 'keep-alive', 'set-cookie': 'machine_cookie=9581501972230; expires=Wed, 05 Feb 2025 10:34:40 GMT; path=/;', 'X-Served-By': 'cache-sea4466-SEA, cache-maa18327-MAA', 'X-Cache': 'MISS, MISS', 'X-Cache-Hits': '0, 0', 'X-Timer': 'S1580985280.913451,VS0,VE312', 'Vary': 'User-Agent, Accept-Encoding'} <!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Access to this page has been denied.</title>
<link href="https://fonts.googleapis.com/css?family=Open+Sans:300" rel="stylesheet">
<style>
html, body {
margin: 0;
padding: 0;
font-family: 'Open Sans', sans-serif;
color: #000;
}
.container {
align-items: center;
display: flex;
flex: 1;
justify-content: space-between;
flex-direction: column;
height: 100%;
}
.container > div {
width: 100%;
display: flex;
justify-content: center;
}
.container > div > div {
display: flex;
width: 80%;
}
.customer-logo-wrapper {
padding-top: 2rem;
flex-grow: 0;
background-color: #fff;
}
.customer-logo {
border-bottom: 1px solid #000;
}
.customer-logo > img {
padding-bottom: 1rem;
max-height: 50px;
max-width: 100%;
}
.page-title-wrapper {
flex-grow: 0; /* was 2, but that pushed it too far down the page */
}
.page-title {
flex-direction: column-reverse;
}
.content-wrapper {
flex-grow: 5;
}
.content {
flex-direction: column;
}
@media (min-width: 768px) {
html, body {
height: 100%;
}
}
</style>
<script>
window._pxAppId = 'PXxgCxM9By';
window._pxJsClientSrc = '/xgCxM9By/init.js';
window._pxHostUrl = '/xgCxM9By/xhr';
startTime = Date.now();
window._pxOnCaptchaSuccess = function(isValid){
var solutionTime = Math.floor((Date.now() - startTime) / 1000);
var reload = function(){ top.location.reload(); };
sendEvent("captcha/solved?px_uuid=" + window._pxUuid + "&time_to_solution=" + solutionTime + '&isValid=' + isValid, reload);
setTimeout(reload, 700);
};
function sendEvent(event, onload){
var xhr = new XMLHttpRequest();
xhr.open("GET", "/_sa_track/" + event);
if (onload) xhr.addEventListener("load", onload);
xhr.send();
}
</script>
<script type="text/javascript">window._pxVid = "";window._pxUuid = "47a70d80-48cc-11ea-860b-c96869955a6b";</script></head>
<body>
<section class="container">
<div class="page-title-wrapper">
<div class="page-title">
<h1>Please click “I am not a robot” to continue</h1>
</div>
</div>
<div class="content-wrapper">
<div class="content">
<div id="px-captcha"></div>
<p></p>
<p>
To ensure this doesn’t happen in the future, please enable Javascript and cookies in your browser.<br/>
Is this happening to you frequently? Please <a href="https://seekingalpha.userecho.com?source=captcha">report it on our feedback forum</a>.
</p>
<p>
If you have an ad-blocker enabled you may be blocked from proceeding. Please disable your ad-blocker and refresh.
</p>
<p>Reference ID: <span id="refid"></span></p>
</div>
</div>
<script>
document.getElementById("refid").innerHTML = window._pxUuid;
sendEvent("captcha/shown?px_uuid=" + window._pxUuid);
</script>
</section>
<script src="/xgCxM9By/captcha/PXxgCxM9By/captcha.js?a=c&m=0"></script>
</body>
</html>
Server prevents you from obtaining desired information by showing a 403 Forbidden
HTTP status code and a captcha to ensure that request is initiated by a human, not by a Python script. It is likely that remote served temporarily banned your session or your IP address.
There are some workarounds to avoid such ban from server, but there is no guarantee that you can overcome that restriction .
So I can only give you some advices:
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.