How do I get the text/HTML of a web page that is currently open in a browser on my machine?

Question

I want to do something like this:

from lxml import html
import requests

page = requests.get('https://a-website.com/')

But instead of passing a hard-coded URL, I would like to get the page that I currently have open in my web browser, i.e. page = requests.get(whateverisopeninmychrome). For what it's worth, it's the text contents of a div that I am specifically looking for.

Is there any way to do this, or is it even possible? I could not find any other information about pulling the HTML/contents from a browser that is currently open on your machine.

Answer 1

No, you can't do it like this: requests makes its own HTTP request and has no access to the tabs open in your browser. One way is to open the Developer Console in Google Chrome or Firefox with your website open and use JavaScript like this:

When searching by class name, the call returns a list-like collection and you have to pick one element from it:

var html = document.getElementsByClassName('htmlClassHere')[0].innerText

When searching by id, the call returns the element itself and you can use it like this:

var html = document.getElementById('htmlIdHere').innerText
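If you would rather end up back in Python, as in the question, one option is to get the page's HTML out of the browser by hand (for example by saving the page or copying the markup from the console) and parse that string. The sketch below is only an illustration under that assumption: the names DivTextExtractor, div_text and sample are hypothetical, and it uses the standard library's html.parser so it runs without extra packages (lxml, which the question imports, could do the same job). Note that it simply concatenates the text nodes, so the result may differ from the browser's innerText, which takes styling into account.

```python
from html.parser import HTMLParser


class DivTextExtractor(HTMLParser):
    """Collects the text inside the first <div> whose class list contains target_class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.depth = 0      # nesting depth of <div> tags, counted from the matched div
        self.done = False   # True once the matched div has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "div" or self.done:
            return
        if self.depth > 0:
            self.depth += 1  # a nested div inside the matched one
        elif self.target_class in (dict(attrs).get("class") or "").split():
            self.depth = 1   # entered the matching div

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def div_text(page_html, target_class):
    """Return the stripped text of the first div carrying target_class, or ''."""
    parser = DivTextExtractor(target_class)
    parser.feed(page_html)
    parser.close()
    return "".join(parser.chunks).strip()


# Hypothetical markup standing in for HTML copied out of the browser.
sample = (
    '<html><body><div class="htmlClassHere">Hello <b>world</b>'
    '<div>!</div></div><div>other</div></body></html>'
)
print(div_text(sample, "htmlClassHere"))
```

The class name htmlClassHere is the same placeholder used in the console snippet above; replace it with the class of the div you are after.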

Answer 1: score 0, accepted, 2019-04-14 18:46:08
