Get “<Response [403]>” using request.post in Python

Question

I am trying to get search results from a website, but I get a "<Response [403]>" message. I found similar posts that fix a 403 error by adding headers to requests.post, but that did not work for my problem. What should I do to get the result I want?

from urllib.request import urlopen
import urllib.parse
import urllib.request
import requests
from bs4 import BeautifulSoup 

url="https://www.metal-archives.com/"
html= urlopen(url)
print("The keyword you entered to search is: %s\n" % 'Bathory')
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}
result=requests.post(url, data='Bathory', headers=headers)
print(result.content)

Answer 1

First of all, you don't need the headers. A plain GET request already returns status code 200:

>>> r = requests.get('https://www.metal-archives.com')
>>> r.status_code
200

If you run a search on the site, you can see that the URL changes to

https://www.metal-archives.com/search?searchString=bathory

That means you can build the search URL directly from the keyword:

>>> keyword = 'bathory'
>>> r = requests.get('https://www.metal-archives.com/search?searchString='+keyword)
>>> r.status_code
200
>>> 'bathory' in r.text
True
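Plain string concatenation works for a single word like 'bathory', but a keyword with spaces or special characters has to be URL-encoded first. Here is a minimal sketch of that step using only the standard library. The helper name build_search_url and the constant SEARCH_URL are made up for illustration; the searchString and type parameter names are taken from the URLs shown in this thread.

```python
from urllib.parse import urlencode

# Search endpoint as it appears in the URLs quoted in this thread.
SEARCH_URL = "https://www.metal-archives.com/search"


def build_search_url(keyword, search_type=None):
    """Return the search URL for a keyword (hypothetical helper).

    urlencode() escapes spaces and special characters, so multi-word
    keywords produce a valid query string.
    """
    params = {"searchString": keyword}
    if search_type is not None:
        # e.g. "band_name" or "band_genre"
        params["type"] = search_type
    return SEARCH_URL + "?" + urlencode(params)


print(build_search_url("bathory"))
print(build_search_url("black metal", "band_genre"))
```

The resulting string can be passed straight to requests.get(). Alternatively, requests can do the encoding itself if you pass the dictionary through its params argument, as in requests.get(SEARCH_URL, params={'searchString': keyword}).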

Answer 2

If you check the HTML, you'll find that the search form's method is GET (maybe that's why you get a 403 error when you POST):

<form id="search_form" action="https://www.metal-archives.com/search" method="get">

So all you need to do is construct the search URL:

# Music genre search
result = requests.get("https://www.metal-archives.com/search?searchString={0}&type=band_genre".format("Bathory"))
# Band name search
result = requests.get("https://www.metal-archives.com/search?searchString={0}&type=band_name".format("Bathory"))
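If you would rather check the form's method and action in code than by reading the page source, something along these lines should work. It is only a sketch: it parses the single form tag quoted above with the standard library's html.parser instead of downloading the live page, and the class name FormFinder is made up for illustration.

```python
from html.parser import HTMLParser


class FormFinder(HTMLParser):
    """Collect the attributes of every <form> start tag (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.forms = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        if tag == "form":
            self.forms.append(dict(attrs))


# The form tag quoted in this answer; on a real page you would feed
# the full HTML text of the response instead.
snippet = '<form id="search_form" action="https://www.metal-archives.com/search" method="get">'

finder = FormFinder()
finder.feed(snippet)
finder.close()

form = finder.forms[0]
print(form["method"])  # the HTTP method the form submits with
print(form["action"])  # the URL the form submits to
```

The method attribute tells you which requests function to call (requests.get here), and the action attribute gives the URL to send the query to.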
