Extract specific value from HTML with bs4

Question

I am trying to extract the value attribute of an HTML tag. The HTML is returned in the response from a site after I make a POST request to it.

The HTML snippet I want to parse looks like this:

<input name=\"secret\" type=\"hidden\" value=\"eyJ0aW1lc3RhbXAiOjE1NTQ2NjIyMzksImFjdGlvbiI6IlwvY2FydFwvcGx1c1wvMWNlNzUtMTEzNzYzIn0=\">\n    <input name=\"product_id\" type=\"hidden\" value=\"156863\">\n    <input name=\"product_bs_id\"  type=\"hidden\" value=\"113763\">\n    <input type=\"hidden\" name=\"amount\" type=\"text\" value=\"1\">\n

I want the value of the input whose name is secret.

I tried solving it like this:

soup=bs(req.text, 'lxml')
secret=soup.find('input',{'name':'secret'})['value']

Because of those backslashes I also tried it like this:

secret=soup.find('input',{'name':'secret'})['value']

But I always get the error TypeError: 'NoneType' object is not subscriptable. In other words, find() returned None because it did not find the tag. Any clue? Thanks a lot.

Answer 1

Use a CSS selector to retrieve the value.

from bs4 import BeautifulSoup as bs

html='''<input name=\"secret\" type=\"hidden\" value=\"eyJ0aW1lc3RhbXAiOjE1NTQ2NjIyMzksImFjdGlvbiI6IlwvY2FydFwvcGx1c1wvMWNlNzUtMTEzNzYzIn0=\">\n
<input name=\"product_id\" type=\"hidden\" value=\"156863\">\n
<input name=\"product_bs_id\"  type=\"hidden\" value=\"113763\">\n
<input type=\"hidden\" name=\"amount\" type=\"text\" value=\"1\">\n    '''

soup = bs(html, 'lxml')
# name^= matches the first input whose name attribute starts with the given text
secret = soup.select_one('input[name^=\\secret]')
print(secret['value'])

Output:

eyJ0aW1lc3RhbXAiOjE1NTQ2NjIyMzksImFjdGlvbiI6IlwvY2FydFwvcGx1c1wvMWNlNzUtMTEzNzYzIn0=
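
If the backslashes are literal characters in req.text (for example because the markup arrives embedded in a JSON string), the parser sees the attribute value as \"secret\" rather than secret, which would explain why find() returns None. The following is a minimal sketch under that assumption, not a definitive fix: it unescapes the text before parsing and guards against a missing tag. The helper names unescape_markup and find_input_value are hypothetical, the secret value is a shortened placeholder, and html.parser is used only to avoid the extra lxml dependency ('lxml' should work the same way here).

```python
from bs4 import BeautifulSoup

# Assumption: the response body contains literal backslashes, as shown in the question.
# The raw string keeps them; "abc123=" is a made-up stand-in for the real secret.
raw = r'<input name=\"secret\" type=\"hidden\" value=\"abc123=\">\n    <input name=\"product_id\" type=\"hidden\" value=\"156863\">\n'


def unescape_markup(text):
    # Turn backslash-quote and backslash-n sequences into real quotes and newlines.
    return text.replace('\\"', '"').replace('\\n', '\n')


def find_input_value(markup, name):
    # Return the value attribute of the matching <input>, or None if it is absent.
    soup = BeautifulSoup(markup, 'html.parser')
    tag = soup.find('input', {'name': name})
    return tag.get('value') if tag is not None else None


secret = find_input_value(unescape_markup(raw), 'secret')
print(secret)
```

If the whole response is valid JSON, decoding it with the response object's json() method and parsing the relevant field is likely cleaner than replacing characters by hand.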

1 answer. Solution 1: accepted, 2019-04-07 21:21:25
