How do I scrape constantly updated, JavaScript-rendered values after login using Python?
I know there are many similar questions, but each one covers only a piece of my problem, and I haven't managed to put the pieces together.
I am using a FLIR AX8 thermal camera, which has a web interface that can be accessed over Ethernet. Long story short, temperature values are continuously displayed and updated, and I would like to scrape them. I want to do this without opening a browser with a GUI, and simply poll for the values every so often.
The first step is a simple login page, located at "cameraIP"/login. It's very basic, but I need a solution that gets me past it and maintains the login session. After that, it's just the interface. Attached are two images: the first shows the interface as seen in Chrome, and the second is the terminal output of what I scraped using Python's Requests module.
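For the login step, a minimal sketch of keeping a session alive might look like the following. It uses only the standard library (`http.cookiejar` plus `urllib.request`) so it runs without extra packages; a `requests.Session` object persists cookies across requests in the same way. The form field names `username` and `password`, the example address, and the helper names are all assumptions for illustration. The real field names have to be read from the camera's login form or from the browser's network tab.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session():
    """Return (opener, jar); the jar keeps cookies so the login persists."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(base_url, username, password):
    """Build the POST request for <base_url>/login.

    ASSUMPTION: the field names "username" and "password" are placeholders;
    check the real names in the login form's HTML.
    """
    form = urllib.parse.urlencode({"username": username, "password": password})
    return urllib.request.Request(
        base_url.rstrip("/") + "/login",
        data=form.encode("ascii"),
        method="POST",
    )


def login(opener, base_url, username, password, timeout=5.0):
    """Send the login form; cookies set by the server land in the opener's jar."""
    request = build_login_request(base_url, username, password)
    with opener.open(request, timeout=timeout) as response:
        return response.status
```

After `login(opener, "http://<cameraIP>", user, password)` succeeds, later calls to `opener.open(...)` send the stored cookies back automatically, which is what keeps the session logged in.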
As you can see, the numbers are not there, because they are rendered by JavaScript. This is essentially all I have to work with. If someone could advise on how to retrieve those temperature values every so often, that would be great.
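The "every so often" part can be sketched independently of how the values are obtained. In the example below, `fetch` is a hypothetical zero-argument callable standing in for whatever request ends up returning the temperatures; the question does not show how the camera's JavaScript gets its data, so that part is left open. The `sleep` parameter exists only so the pause can be replaced when trying the function out.

```python
import time


def poll(fetch, interval_s, count, sleep=time.sleep):
    """Call fetch() `count` times, pausing interval_s seconds between calls.

    `fetch` is any zero-argument callable (hypothetical here), for example
    one that requests a page from the camera with a logged-in session.
    """
    readings = []
    for i in range(count):
        readings.append(fetch())
        if i < count - 1:  # no pause after the final reading
            sleep(interval_s)
    return readings
```

For example, `poll(read_temperature, 5.0, 12)` would collect twelve readings five seconds apart, where `read_temperature` is a function you supply.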
If there are any questions, leave a comment below and I can provide more information, such as the JS files listed under the web interface, if they are needed.
When scraping with Scrapy, I personally use Scrapy Splash to render JavaScript: http://splash.readthedocs.io/en/stable/
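As a rough illustration of what Splash does, the sketch below skips Scrapy and calls Splash's HTTP API directly through its `render.html` endpoint, which returns the page's HTML after JavaScript has run. It assumes a Splash instance is already running at `http://localhost:8050` (its default port). The camera address, the `wait` value, and the helper names are placeholders, and the sketch does not handle the login step.

```python
import urllib.parse
import urllib.request


def splash_render_url(target_url, wait_s=1.0, splash_base="http://localhost:8050"):
    """Build a render.html URL asking Splash to load target_url and wait wait_s seconds."""
    query = urllib.parse.urlencode({"url": target_url, "wait": wait_s})
    return f"{splash_base.rstrip('/')}/render.html?{query}"


def fetch_rendered_html(target_url, wait_s=1.0,
                        splash_base="http://localhost:8050", timeout=30.0):
    """Return the JavaScript-rendered HTML of target_url as text.

    ASSUMPTION: a Splash server is reachable at splash_base.
    """
    url = splash_render_url(target_url, wait_s, splash_base)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

The returned HTML can then be parsed like any static page. Whether a short `wait` is long enough for the temperature values to appear would need to be checked against the actual interface.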