I'm trying to isolate the securityToken from an HTML response. The securityToken is not in an element of its own, though; it sits inside the JavaScript of a <script> tag.
I've been able to isolate the tag with the code below:
import requests
from bs4 import BeautifulSoup
import re
url = 'https://obe.sandals.com/read-land-availability/'
r = requests.get(url, headers={"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.103 Safari/537.36"})
soup= BeautifulSoup(r.text, 'html.parser')
mytext = soup.find('script', text = re.compile('securityToken:'))
print(mytext)
Here is the output, but I cannot figure out the last step to extract the securityToken:
<script> window._app.page = { jsView: './views/step1/Vacation', securityToken: "BF8394B1DD5481AF43BE2AF02243903F121D26327E83ADC13785F6EF739B5870", subSessionId: "6D71C585C7F51CF105B3100A473635ACF3637329F2C1ABAADB1F2827832562D8", step: 1 }; </script>
Process finished with exit code 0
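If you use 'html5lib' instead of 'html.parser', and the security token is always in the same position, you can split the tag's text. Note that mytext is a Tag, not a string, so convert it with str() first:
str(mytext).split('securityToken: "')[1].split('", subSessionId:')[0]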
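The split approach can be made a little more forgiving with str.partition, which does not raise when a marker is missing. The following is only a minimal offline sketch: script_text stands in for str(mytext) and is copied from the output in the question, and extract_between is a hypothetical helper name introduced here for illustration.

```python
# Stand-in for str(mytext); copied from the output shown in the question.
script_text = '''<script> window._app.page = { jsView: './views/step1/Vacation', securityToken: "BF8394B1DD5481AF43BE2AF02243903F121D26327E83ADC13785F6EF739B5870", subSessionId: "6D71C585C7F51CF105B3100A473635ACF3637329F2C1ABAADB1F2827832562D8", step: 1 }; </script>'''


def extract_between(text, start, end):
    """Hypothetical helper: return the text between start and end, or None."""
    # partition returns (before, separator, after); the separator is ''
    # when the marker does not occur in the text.
    _, found_start, rest = text.partition(start)
    if not found_start:
        return None
    value, found_end, _ = rest.partition(end)
    return value if found_end else None


# Stop at the closing quote rather than at '", subSessionId:', so the
# result does not depend on which key happens to follow the token.
token = extract_between(script_text, 'securityToken: "', '"')
print(token)
```

Because the end marker is just the closing quote, this should keep working if the order of the keys in the page object changes, which the original split on '", subSessionId:' would not.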
To extract the value of securityToken, try the following:
import re
import requests
from bs4 import BeautifulSoup
url = 'https://obe.sandals.com/read-land-availability/'
r = requests.get(url, headers={"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.103 Safari/537.36"})
soup = BeautifulSoup(r.text, 'html.parser')
mytext = soup.find('script', text = re.compile('securityToken:'))
print(re.search(r'securityToken: "(.*?)"', str(mytext)).group(1))
Output:
5EFDCE1D62C5F1C1369EF3629F921B0F90301ACB51C5FD24321D7FB58D04DE50
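If the page layout changes, soup.find can return None or the pattern can fail to match, and the one-liner above would then raise an AttributeError. Here is a hedged sketch of the same regex idea with a guard, run against the sample text from the question instead of a live request. The names TOKEN_RE and extract_security_token are made up for this example, and the pattern assumes the token is always wrapped in double quotes.

```python
import re

# Assumption: the token is always written as securityToken: "<value>".
TOKEN_RE = re.compile(r'securityToken:\s*"([^"]+)"')


def extract_security_token(script_text):
    """Return the securityToken value from the script text, or None if absent."""
    match = TOKEN_RE.search(script_text)
    return match.group(1) if match else None


# Stand-in for str(mytext); copied from the output shown in the question.
sample = '''<script> window._app.page = { jsView: './views/step1/Vacation', securityToken: "BF8394B1DD5481AF43BE2AF02243903F121D26327E83ADC13785F6EF739B5870", subSessionId: "6D71C585C7F51CF105B3100A473635ACF3637329F2C1ABAADB1F2827832562D8", step: 1 }; </script>'''

token = extract_security_token(sample)
if token is None:
    print("securityToken not found")
else:
    print(token)
```

With the real page you would pass str(mytext) after checking that mytext is not None. The token printed here differs from the one in the question's output, which suggests the value is generated per request, so it probably needs to be read fresh each time rather than cached.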