
Expecting 404 but getting 200 HTTP status code

I've been using the following code to get the status code of a Tweet:

import requests

response = requests.get("https://twitter.com/jack/38373837")

status_code = response.status_code

print(status_code)
----------------------
200

I expected 404. However, I got 200.

Is there another command, or perhaps even a Python package, that accurately determines a page's HTTP status code?

This happens because the request actually loads Twitter, and it is Twitter that then can't load the post. So the response is 200, which means OK, because you were able to reach Twitter.

If you try it with an API, or with a website that has no protection, you can get a 404 error!

Try with "https://cidqu.net/thisisnotexist.html"

This page doesn't load anything, so it will give you a 404 error.
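To make the difference concrete, here is a minimal local sketch of the two behaviours described above. It does not talk to Twitter or cidqu.net. `AppShellHandler`, `PlainHandler`, `KNOWN_PATHS` and `status_for` are hypothetical names made up for this illustration, and the assumption is simply that one server answers every path with the same page shell while the other only serves paths it knows about.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

KNOWN_PATHS = {"/"}  # hypothetical: the only page that really exists


class QuietHandler(BaseHTTPRequestHandler):
    def _send(self, status, body):
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


class AppShellHandler(QuietHandler):
    """Always serves the same page shell, whatever the path."""

    def do_GET(self):
        self._send(200, b"<html><body><div id='app'></div></body></html>")


class PlainHandler(QuietHandler):
    """Serves known paths and answers 404 for everything else."""

    def do_GET(self):
        if self.path in KNOWN_PATHS:
            self._send(200, b"<html><body>home</body></html>")
        else:
            self._send(404, b"<html><body>not found</body></html>")


def status_for(handler_cls, path):
    """Start a throwaway local server, request path once, return the status."""
    server = HTTPServer(("127.0.0.1", 0), handler_cls)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", path)
        response = conn.getresponse()
        response.read()  # drain the body before closing
        conn.close()
        return response.status
    finally:
        server.shutdown()
        server.server_close()


shell_status = status_for(AppShellHandler, "/jack/38373837")
plain_status = status_for(PlainHandler, "/thisisnotexist.html")
print(shell_status, plain_status)
```

The first server reports 200 for a path that has no real content behind it, and the second reports 404. The status code only tells you what the server chose to send back, so a client library can't "fix" this on its own.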

I tried on my laptop and I got 200 as well. I used curl's -I option to get only the headers.

 -I, --head          Show document info only
nabil@LAPTOP:~$ curl -I https://twitter.com/jack/38373837
HTTP/2 200
date: Mon, 31 Jan 2022 20:08:45 GMT
expiry: Tue, 31 Mar 1981 05:00:00 GMT
pragma: no-cache
server: tsa_f
set-cookie: guest_id=v1%3A164365972506197722; Max-Age=34214400; Expires=Fri, 03 Mar 2023 20:08:45 GMT; Path=/; Domain=.twitter.com; Secure; SameSite=None
content-type: text/html; charset=utf-8
x-powered-by: Express
cache-control: no-cache, no-store, must-revalidate, pre-check=0, post-check=0
last-modified: Mon, 31 Jan 2022 20:08:45 GMT
x-frame-options: DENY
x-xss-protection: 0
x-content-type-options: nosniff
content-security-policy: connect-src 'self' blob: https://*.giphy.com https://*.pscp.tv https://*.video.pscp.tv https://*.twimg.com https://api.twitter.com https://api-stream.twitter.com https://ads-api.twitter.com https://aa.twitter.com https://caps.twitter.com https://media.riffsy.com https://pay.twitter.com https://sentry.io https://ton.twitter.com https://twitter.com https://upload.twitter.com https://www.google-analytics.com https://accounts.google.com/gsi/status https://accounts.google.com/gsi/log https://app.link https://api2.branch.io https://bnc.lt wss://*.pscp.tv https://vmap.snappytv.com https://vmapstage.snappytv.com https://vmaprel.snappytv.com https://vmap.grabyo.com https://dhdsnappytv-vh.akamaihd.net https://pdhdsnappytv-vh.akamaihd.net https://mdhdsnappytv-vh.akamaihd.net https://mdhdsnappytv-vh.akamaihd.net https://mpdhdsnappytv-vh.akamaihd.net https://mmdhdsnappytv-vh.akamaihd.net https://mdhdsnappytv-vh.akamaihd.net https://mpdhdsnappytv-vh.akamaihd.net https://mmdhdsnappytv-vh.akamaihd.net https://dwo3ckksxlb0v.cloudfront.net ; default-src 'self'; form-action 'self' https://twitter.com https://*.twitter.com; font-src 'self' https://*.twimg.com; frame-src 'self' https://twitter.com https://mobile.twitter.com https://pay.twitter.com https://cards-frame.twitter.com https://accounts.google.com/  https://recaptcha.net/recaptcha/ https://www.google.com/recaptcha/ https://www.gstatic.com/recaptcha/; img-src 'self' blob: data: https://*.cdn.twitter.com https://ton.twitter.com https://*.twimg.com https://analytics.twitter.com https://cm.g.doubleclick.net https://www.google-analytics.com https://www.periscope.tv https://www.pscp.tv https://media.riffsy.com https://*.giphy.com https://*.pscp.tv https://*.periscope.tv https://prod-periscope-profile.s3-us-west-2.amazonaws.com https://platform-lookaside.fbsbx.com https://scontent.xx.fbcdn.net https://scontent-sea1-1.xx.fbcdn.net https://*.googleusercontent.com https://imgix.revue.co; manifest-src 'self'; 
media-src 'self' blob: https://twitter.com https://*.twimg.com https://*.vine.co https://*.pscp.tv https://*.video.pscp.tv https://*.giphy.com https://media.riffsy.com https://dhdsnappytv-vh.akamaihd.net https://pdhdsnappytv-vh.akamaihd.net https://mdhdsnappytv-vh.akamaihd.net https://mdhdsnappytv-vh.akamaihd.net https://mpdhdsnappytv-vh.akamaihd.net https://mmdhdsnappytv-vh.akamaihd.net https://mdhdsnappytv-vh.akamaihd.net https://mpdhdsnappytv-vh.akamaihd.net https://mmdhdsnappytv-vh.akamaihd.net https://dwo3ckksxlb0v.cloudfront.net; object-src 'none'; script-src 'self' 'unsafe-inline' https://*.twimg.com https://recaptcha.net/recaptcha/ https://www.google.com/recaptcha/ https://www.gstatic.com/recaptcha/ https://www.google-analytics.com https://twitter.com https://app.link https://accounts.google.com/gsi/client https://appleid.cdn-apple.com/appleauth/static/jsapi/appleid/1/en_US/appleid.auth.js  'nonce-NzE1OGUzMjgtYWVkZS00ZGNkLWI4ZjctNDQwYmU1ODA2NjJh'; style-src 'self' 'unsafe-inline' https://accounts.google.com/gsi/style https://*.twimg.com; worker-src 'self' blob:; report-uri https://twitter.com/i/csp_report?a=O5RXE%3D%3D%3D&ro=false
strict-transport-security: max-age=631138519
cross-origin-opener-policy: same-origin-allow-popups
cross-origin-embedder-policy: unsafe-none
x-response-time: 185
x-connection-hash: d1535e6f6d60a343d5d9adfbe574b67f65b771b35fcc93c7ea887705bffb2ba8

Try this endpoint to check whether a tweet exists or not:

import requests
import json

# https://twitter.com/jack/status/1247616214769086465
tweet_id = 1247616214769086465

url = 'https://twitter.com/i/api/graphql/_iJccJ-mHcyaV0nq_odmBA/TweetDetail'

# Request Headers
headers = {'Host': 'twitter.com',
           'sec-ch-ua': '',
           'x-twitter-client-language': 'en',
           'x-csrf-token': '9d2d0361bd589118ff41e56619327537',
           'sec-ch-ua-mobile': '?0',
           'authorization': 'Bearer AAAAAAAAAAAAAAAAAAAAANRILgAAAAAAnNwIzUejRCOuH5E6I8xnZz4puTs'
                            '%3D1Zv7ttfk8LF81IUq16cHjhLTvJu4FA33AGWWjCpTnA',
           'content-type': 'application/json',
           'x-guest-token': '1488257541251469319',
           'x-twitter-active-user': 'yes',
           'sec-ch-ua-platform': '',
           'accept': '*/*',
           'sec-fetch-site': 'same-origin',
           'sec-fetch-mode': 'cors',
           'sec-fetch-dest': 'empty',
           'referer': 'https://twitter.com/GioCellRed/status/1488257200195842048',
           'accept-language': 'en-US,en;q=0.9',
           'cookie': 'guest_id_ads=v1%3A164069712670696178; guest_id=v1%3A164069712670696178; '
                     'guest_id_marketing=v1%3A164069712670696178; personalization_id=',
           'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '
                         'Chrome/97.0.4692.99 Safari/537.36 Edg/97.0.1072.76'}

# Request Parameters
variables = {"focalTweetId": tweet_id, "referrer": "search",
             "controller_data": "DAACDAAFDAABDAABDAABCgABAAAAAAAAgEAAAAwAAgoAAQAAAAAAAAAICgACTOJ7aVQ"
                                "/L38LAAMAAAAFU29uaWEMAAQMAAELAAEAAAAFU29uaWELAAIAAAAkOTUxYmYyZjItMDl"
                                "hNC00ZTlmLWJkZWItMTBhYTFjMmU5YjBhAAAKAAUbtNSIOd+CdQAAAAAA",
             "with_rux_injections": False,
             "includePromotedContent": True,
             "withCommunity": True,
             "withQuickPromoteEligibilityTweetFields": True,
             "withBirdwatchNotes": False,
             "withSuperFollowsUserFields": True,
             "withDownvotePerspective": False,
             "withReactionsMetadata": False,
             "withReactionsPerspective": False,
             "withSuperFollowsTweetFields": True,
             "withVoice": True, "withV2Timeline": False,
             "__fs_interactive_text": False,
             "__fs_responsive_web_uc_gql_enabled": False,
             "__fs_dont_mention_me_view_api_enabled": False}

params = {'variables': json.dumps(variables)}

with requests.get(url, headers=headers, params=params) as resp:
    result = resp.json()
    print('Error:', ("errors" in result),
          '\nSuccess:', ("data" in result))
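If you want a single answer instead of two printed flags, the same check can be wrapped in a small helper. This is only a sketch: `classify_tweet_response` is a hypothetical name, the sample payloads below are made up for illustration, and treating "errors" as taking priority over "data" is my assumption rather than documented behaviour of the endpoint.

```python
def classify_tweet_response(result):
    """Map a parsed JSON payload to 'error', 'success', or 'unknown'."""
    if "errors" in result:
        return "error"    # assumption: an "errors" key wins over "data"
    if "data" in result:
        return "success"
    return "unknown"      # neither key present


# Hypothetical payloads, shaped only by the two keys checked above.
missing = {"errors": [{"message": "hypothetical error"}]}
found = {"data": {}}

print(classify_tweet_response(missing))
print(classify_tweet_response(found))
```

You would pass it the `result` from `resp.json()` in the snippet above.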
