Expecting 404 but getting 200 HTTP status code
I've been using the following code to get the status code of a Tweet:
import requests
response = requests.get("https://twitter.com/jack/38373837")
status_code = response.status_code
print(status_code)
----------------------
200
I expected 404. However, I got 200.
Is there another command, or perhaps even a Python package, that accurately determines a page's HTTP status code?
This happens because the request does reach Twitter; it is Twitter that then fails to load the post. The response is therefore 200, which means OK, because you were able to reach Twitter.
If you try it with an API or with a website that has no protection, you can get a 404 error.
Try it with "https://cidqu.net/thisisnotexist.html".
This page doesn't load anything, so it will give you a 404 error.
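To make the distinction above concrete, here is a minimal sketch of how you might tell a real 404 apart from a page that answers 200 but only shows a "not found" message. The function name `classify_response` and the phrases in `DEFAULT_MARKERS` are hypothetical and not taken from any site; you would have to adjust them for the pages you check.

```python
# Hypothetical phrases a "not found" page might contain; adjust per site.
DEFAULT_MARKERS = ("page doesn't exist", "page does not exist", "not found")


def classify_response(status_code, body, markers=DEFAULT_MARKERS):
    """Classify a response as 'missing', 'soft-404', 'ok', or 'other'."""
    if status_code in (404, 410):
        # The server itself says the resource is gone.
        return "missing"
    if status_code == 200:
        # The server answered OK; look for an error message in the body.
        text = body.lower()
        if any(marker in text for marker in markers):
            return "soft-404"
        return "ok"
    return "other"
```

With requests you could call it as `classify_response(response.status_code, response.text)`. If a site builds its error message with JavaScript after the page has loaded, the raw HTML may not contain any marker, and this check would not catch it.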
I tried it on my laptop and I got 200 as well. I used the -I option to get only the headers.
-I, --head Show document info only
nabil@LAPTOP:~$ curl -I https://twitter.com/jack/38373837
HTTP/2 200
date: Mon, 31 Jan 2022 20:08:45 GMT
expiry: Tue, 31 Mar 1981 05:00:00 GMT
pragma: no-cache
server: tsa_f
set-cookie: guest_id=v1%3A164365972506197722; Max-Age=34214400; Expires=Fri, 03 Mar 2023 20:08:45 GMT; Path=/; Domain=.twitter.com; Secure; SameSite=None
content-type: text/html; charset=utf-8
x-powered-by: Express
cache-control: no-cache, no-store, must-revalidate, pre-check=0, post-check=0
last-modified: Mon, 31 Jan 2022 20:08:45 GMT
x-frame-options: DENY
x-xss-protection: 0
x-content-type-options: nosniff
content-security-policy: connect-src 'self' blob: https://*.giphy.com https://*.pscp.tv https://*.video.pscp.tv https://*.twimg.com https://api.twitter.com https://api-stream.twitter.com https://ads-api.twitter.com https://aa.twitter.com https://caps.twitter.com https://media.riffsy.com https://pay.twitter.com https://sentry.io https://ton.twitter.com https://twitter.com https://upload.twitter.com https://www.google-analytics.com https://accounts.google.com/gsi/status https://accounts.google.com/gsi/log https://app.link https://api2.branch.io https://bnc.lt wss://*.pscp.tv https://vmap.snappytv.com https://vmapstage.snappytv.com https://vmaprel.snappytv.com https://vmap.grabyo.com https://dhdsnappytv-vh.akamaihd.net https://pdhdsnappytv-vh.akamaihd.net https://mdhdsnappytv-vh.akamaihd.net https://mdhdsnappytv-vh.akamaihd.net https://mpdhdsnappytv-vh.akamaihd.net https://mmdhdsnappytv-vh.akamaihd.net https://mdhdsnappytv-vh.akamaihd.net https://mpdhdsnappytv-vh.akamaihd.net https://mmdhdsnappytv-vh.akamaihd.net https://dwo3ckksxlb0v.cloudfront.net ; default-src 'self'; form-action 'self' https://twitter.com https://*.twitter.com; font-src 'self' https://*.twimg.com; frame-src 'self' https://twitter.com https://mobile.twitter.com https://pay.twitter.com https://cards-frame.twitter.com https://accounts.google.com/ https://recaptcha.net/recaptcha/ https://www.google.com/recaptcha/ https://www.gstatic.com/recaptcha/; img-src 'self' blob: data: https://*.cdn.twitter.com https://ton.twitter.com https://*.twimg.com https://analytics.twitter.com https://cm.g.doubleclick.net https://www.google-analytics.com https://www.periscope.tv https://www.pscp.tv https://media.riffsy.com https://*.giphy.com https://*.pscp.tv https://*.periscope.tv https://prod-periscope-profile.s3-us-west-2.amazonaws.com https://platform-lookaside.fbsbx.com https://scontent.xx.fbcdn.net https://scontent-sea1-1.xx.fbcdn.net https://*.googleusercontent.com https://imgix.revue.co; manifest-src 'self'; 
media-src 'self' blob: https://twitter.com https://*.twimg.com https://*.vine.co https://*.pscp.tv https://*.video.pscp.tv https://*.giphy.com https://media.riffsy.com https://dhdsnappytv-vh.akamaihd.net https://pdhdsnappytv-vh.akamaihd.net https://mdhdsnappytv-vh.akamaihd.net https://mdhdsnappytv-vh.akamaihd.net https://mpdhdsnappytv-vh.akamaihd.net https://mmdhdsnappytv-vh.akamaihd.net https://mdhdsnappytv-vh.akamaihd.net https://mpdhdsnappytv-vh.akamaihd.net https://mmdhdsnappytv-vh.akamaihd.net https://dwo3ckksxlb0v.cloudfront.net; object-src 'none'; script-src 'self' 'unsafe-inline' https://*.twimg.com https://recaptcha.net/recaptcha/ https://www.google.com/recaptcha/ https://www.gstatic.com/recaptcha/ https://www.google-analytics.com https://twitter.com https://app.link https://accounts.google.com/gsi/client https://appleid.cdn-apple.com/appleauth/static/jsapi/appleid/1/en_US/appleid.auth.js 'nonce-NzE1OGUzMjgtYWVkZS00ZGNkLWI4ZjctNDQwYmU1ODA2NjJh'; style-src 'self' 'unsafe-inline' https://accounts.google.com/gsi/style https://*.twimg.com; worker-src 'self' blob:; report-uri https://twitter.com/i/csp_report?a=O5RXE%3D%3D%3D&ro=false
strict-transport-security: max-age=631138519
cross-origin-opener-policy: same-origin-allow-popups
cross-origin-embedder-policy: unsafe-none
x-response-time: 185
x-connection-hash: d1535e6f6d60a343d5d9adfbe574b67f65b771b35fcc93c7ea887705bffb2ba8
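If you want to script around `curl -I` output like the dump above, a small sketch such as the following could pull out the status code. The helper name `status_from_headers` is hypothetical, and the function only looks for a line shaped like `HTTP/2 200` or `HTTP/1.1 404 Not Found`.

```python
import re

# Matches status lines such as "HTTP/2 200" or "HTTP/1.1 404 Not Found".
STATUS_LINE = re.compile(r"^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b")


def status_from_headers(header_text):
    """Return the status code from the first HTTP status line, or None."""
    for line in header_text.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match:
            return int(match.group(1))
    return None
```

If curl follows redirects, the dump contains several status lines and this sketch returns only the first one.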
Try this endpoint to check whether a tweet exists:
import requests
import json
# https://twitter.com/jack/status/1247616214769086465
tweet_id = 1247616214769086465
url = 'https://twitter.com/i/api/graphql/_iJccJ-mHcyaV0nq_odmBA/TweetDetail'
# Request Headers
headers = {'Host': 'twitter.com',
           'sec-ch-ua': '',
           'x-twitter-client-language': 'en',
           'x-csrf-token': '9d2d0361bd589118ff41e56619327537',
           'sec-ch-ua-mobile': '?0',
           'authorization': 'Bearer AAAAAAAAAAAAAAAAAAAAANRILgAAAAAAnNwIzUejRCOuH5E6I8xnZz4puTs'
                            '%3D1Zv7ttfk8LF81IUq16cHjhLTvJu4FA33AGWWjCpTnA',
           'content-type': 'application/json',
           'x-guest-token': '1488257541251469319',
           'x-twitter-active-user': 'yes',
           'sec-ch-ua-platform': '',
           'accept': '*/*',
           'sec-fetch-site': 'same-origin',
           'sec-fetch-mode': 'cors',
           'sec-fetch-dest': 'empty',
           'referer': 'https://twitter.com/GioCellRed/status/1488257200195842048',
           'accept-language': 'en-US,en;q=0.9',
           'cookie': 'guest_id_ads=v1%3A164069712670696178; guest_id=v1%3A164069712670696178; '
                     'guest_id_marketing=v1%3A164069712670696178; personalization_id=',
           'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '
                         'Chrome/97.0.4692.99 Safari/537.36 Edg/97.0.1072.76'}
# Request Parameters
variables = {"focalTweetId": tweet_id, "referrer": "search",
             "controller_data": "DAACDAAFDAABDAABDAABCgABAAAAAAAAgEAAAAwAAgoAAQAAAAAAAAAICgACTOJ7aVQ"
                                "/L38LAAMAAAAFU29uaWEMAAQMAAELAAEAAAAFU29uaWELAAIAAAAkOTUxYmYyZjItMDl"
                                "hNC00ZTlmLWJkZWItMTBhYTFjMmU5YjBhAAAKAAUbtNSIOd+CdQAAAAAA",
             "with_rux_injections": False,
             "includePromotedContent": True,
             "withCommunity": True,
             "withQuickPromoteEligibilityTweetFields": True,
             "withBirdwatchNotes": False,
             "withSuperFollowsUserFields": True,
             "withDownvotePerspective": False,
             "withReactionsMetadata": False,
             "withReactionsPerspective": False,
             "withSuperFollowsTweetFields": True,
             "withVoice": True, "withV2Timeline": False,
             "__fs_interactive_text": False,
             "__fs_responsive_web_uc_gql_enabled": False,
             "__fs_dont_mention_me_view_api_enabled": False}
params = {'variables': json.dumps(variables)}
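# Send the request and parse the JSON body
with requests.get(url, headers=headers, params=params) as resp:
    result = resp.json()
# The response contains an "errors" key on failure and a "data" key on success
print('Error:', ("errors" in result),
      '\nSuccess:', ("data" in result))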
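The two checks printed by the snippet can be folded into one small helper, sketched below. The name `tweet_lookup_outcome` is hypothetical, and the only assumption about the response shape is the one the snippet already makes: a top-level "errors" key means failure and a top-level "data" key means success. Treating a response that has both keys as an error is my own choice, not something the endpoint documents.

```python
def tweet_lookup_outcome(result):
    """Map a parsed JSON response to 'found', 'error', or 'unknown'."""
    if not isinstance(result, dict):
        return "unknown"
    if "errors" in result:
        # Assumption: any reported error counts as a failed lookup.
        return "error"
    if "data" in result:
        return "found"
    return "unknown"
```

You could then write `print(tweet_lookup_outcome(result))` instead of printing the two booleans separately.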