Python Scrapy: get the content of an HTML <script> tag
I have a project where I need to extract a value from a script embedded in the HTML code.
<script>
(function() {
... / More Code
Level.grade = "2";
Level.level = "1";
Level.max_line = "5";
Level.cozum = 'adım 12\ndön sağ\nadım 13\ndön sol\nadım 11';
... / More Code
</script>
How can I get only "adım 12\ndön sağ\nadım 13\ndön sol\nadım 11" from this code?
Thanks for the help.
Use a regex to do that.
First, grab the content of that script tag, like this:
response.css("script").extract_first()
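Note that extract_first() returns only the first script element on the page, which may not be the one that defines Level.cozum. A minimal sketch of one way to pick the right element, assuming you call response.css("script").extract() to get all script elements as a list of strings (the helper name pick_script and the sample list are hypothetical):

```python
def pick_script(scripts, marker="Level.cozum"):
    """Return the first script string that contains marker, or None."""
    return next((s for s in scripts if marker in s), None)


# Stand-in for the list returned by response.css("script").extract()
scripts = [
    "<script>var unrelated = 1;</script>",
    "<script>Level.cozum = 'adım 12\\ndön sağ';</script>",
]

script = pick_script(scripts)
print(script)
```

If no script contains the marker, the helper returns None, so check for that before applying the regex.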
And then use this regex:
(Level\.cozum = )(.*?)(\;)
See the demo here: https://regex101.com/r/YxHRmR/1
Here is the code:
import re
regex = r"(Level\.cozum = )(.*?)(\;)"
test_str = ("<script>\n"
" (function() {\n"
" ... / More Code\n"
" Level.grade = \"2\";\n\n"
" Level.level = \"1\";\n\n"
" Level.max_line = \"5\";\n\n"
" Level.cozum = 'adım 12\\ndön sağ\\nadım 13\\ndön sol\\nadım 11'; \n"
"... / More Code\n"
"</script>")
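# findall returns one tuple per match, with one item per capture group;
# the quoted value is the second item, matches[0][1]
matches = re.findall(regex, test_str, re.MULTILINE)
print(matches)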
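Since the question asks for only the value, here is a hedged variant that uses a single capture group around the text between the quotes. It is a sketch, not the answer's exact pattern: it assumes the value is wrapped in single quotes and contains no escaped quote characters, and the names COZUM_RE and extract_cozum are made up for illustration.

```python
import re

# One capture group around the text between the single quotes
COZUM_RE = re.compile(r"Level\.cozum\s*=\s*'(.*?)'\s*;")


def extract_cozum(script_text):
    """Return the raw string assigned to Level.cozum, or None if absent."""
    match = COZUM_RE.search(script_text)
    return match.group(1) if match else None


script_text = (
    "Level.max_line = \"5\";\n"
    "Level.cozum = 'adım 12\\ndön sağ\\nadım 13\\ndön sol\\nadım 11';"
)

raw = extract_cozum(script_text)
print(raw)

# The script stores line breaks as a literal backslash followed by "n",
# so split on that two-character sequence to get the individual steps
steps = raw.split("\\n")
print(steps)
```

Using re.search with one group avoids indexing into the tuples that findall produces when the pattern has several groups.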