Extract token from within <script> tags BeautifulSoup4, Requests

Question

I'm trying to isolate the securityToken from an HTML response. The securityToken sits inside a <script> tag, though, rather than in regular markup.

I've been able to isolate the tag with the code below:

import requests
from bs4 import BeautifulSoup
import re

url = 'https://obe.sandals.com/read-land-availability/'
r = requests.get(url, headers={"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.103 Safari/537.36"})
soup = BeautifulSoup(r.text, 'html.parser')
mytext = soup.find('script', text=re.compile('securityToken:'))

print(mytext)

Here is the output, but I cannot figure out the last step, extracting the securityToken itself:

<script> window._app.page = { jsView: './views/step1/Vacation', securityToken: "BF8394B1DD5481AF43BE2AF02243903F121D26327E83ADC13785F6EF739B5870", subSessionId: "6D71C585C7F51CF105B3100A473635ACF3637329F2C1ABAADB1F2827832562D8", step: 1 }; </script>

Process finished with exit code 0

Answer 1

If you use 'html5lib' instead of 'html.parser', and the security token is always in the same position:

# mytext is a bs4 Tag, not a str, so convert it before splitting
str(mytext).split('securityToken: "')[1].split('", subSessionId:')[0]
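The split approach above only works if subSessionId comes right after the token. A minimal sketch of a variant that only looks for the closing quote is shown here. SAMPLE is copied from the output in the question and stands in for str(mytext), and token_by_split is a hypothetical helper name, not part of any library.

```python
# SAMPLE copies the <script> output shown in the question; in the real
# script this text would come from str(mytext).
SAMPLE = (
    "<script> window._app.page = { jsView: './views/step1/Vacation', "
    'securityToken: "BF8394B1DD5481AF43BE2AF02243903F121D26327E83ADC13785F6EF739B5870", '
    'subSessionId: "6D71C585C7F51CF105B3100A473635ACF3637329F2C1ABAADB1F2827832562D8", '
    "step: 1 }; </script>"
)


def token_by_split(script_text, key="securityToken"):
    """Hypothetical helper: return the quoted value that follows `key: "`."""
    marker = f'{key}: "'
    # partition() returns (before, separator, after); the separator is an
    # empty string when the marker does not occur in the text.
    _, found, rest = script_text.partition(marker)
    if not found:
        raise ValueError(f"{key} not found in script text")
    # Everything up to the next double quote is the value.
    return rest.partition('"')[0]


print(token_by_split(SAMPLE))
```

Because the value ends at the next double quote, this keeps working if the page reorders the keys, and it raises a clear error instead of an IndexError when the key is missing.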

Answer 2

To extract the value of securityToken try the following:

import re
import requests
from bs4 import BeautifulSoup


url = 'https://obe.sandals.com/read-land-availability/'
r = requests.get(url, headers={"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.103 Safari/537.36"})
soup = BeautifulSoup(r.text, 'html.parser')
mytext = soup.find('script', text=re.compile('securityToken:'))


print(re.search(r'securityToken: "(.*?)"', str(mytext)).group(1))

Output:

5EFDCE1D62C5F1C1369EF3629F921B0F90301ACB51C5FD24321D7FB58D04DE50
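Note that re.search returns None when nothing matches, so calling .group(1) on it directly would raise an AttributeError if the page layout changed. A hedged sketch of a more defensive variant follows. It runs offline against the script text from the question (SAMPLE stands in for str(mytext)). The FIELD pattern and the extract_fields helper are assumptions made for illustration, and they also pick up subSessionId.

```python
import re

# SAMPLE copies the <script> output shown in the question; in the real
# script this text would come from str(mytext).
SAMPLE = (
    "<script> window._app.page = { jsView: './views/step1/Vacation', "
    'securityToken: "BF8394B1DD5481AF43BE2AF02243903F121D26327E83ADC13785F6EF739B5870", '
    'subSessionId: "6D71C585C7F51CF105B3100A473635ACF3637329F2C1ABAADB1F2827832562D8", '
    "step: 1 }; </script>"
)

# Assumed pattern: the key, a colon, optional whitespace, then a
# double-quoted value that contains no double quotes.
FIELD = re.compile(r'(securityToken|subSessionId):\s*"([^"]*)"')


def extract_fields(script_text):
    """Hypothetical helper: map each matched key to its quoted value."""
    # findall() yields (key, value) tuples because the pattern has two groups.
    return dict(FIELD.findall(script_text))


fields = extract_fields(SAMPLE)
token = fields.get("securityToken")
if token is None:
    print("securityToken not found")
else:
    print(token)
```

Using dict.get means a missing key yields None rather than an exception, so the caller can decide how to handle a page that no longer contains the token.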

solution1
0 2021-01-17 20:43:33

solution2
0 ACCEPTED 2021-01-17 20:51:51
