Extracting all HTML elements from the Telegram Web page using Node.js

Question

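I want to extract all the HTML elements from the Telegram Web site. I have tried everything I could think of, such as GET and POST requests, jQuery's get(), and approaches in Python and JavaScript, ...

But the result they return is incomplete, and some parts of it are missing. How can I do this properly?

Here is a snippet that returns incomplete elements: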

fetch("https://web.telegram.org/k/")
  .then(x => x.text())
  .then(y => console.log(y));

Answer 1

Try this approach:

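// First install jsdom:
//   npm i jsdom
// fetch is available as a global in Node 18 and later.

const jsdom = require("jsdom");
const { JSDOM } = jsdom;

fetch("https://web.telegram.org/k/")
    .then(x => x.text())
    .then(y => {
        // Parse the fetched HTML string into a DOM tree.
        // Note: by default jsdom does not execute the page's scripts.
        const { document } = (new JSDOM(y)).window;
        // Print the serialized markup rather than the Document object itself.
        console.log(document.documentElement.outerHTML);
    });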

Check out the jsdom documentation: https://github.com/jsdom/jsdom

Answer 2

Have you tried adding the header "Application-Type": "text/html"?
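For reference, here is a minimal sketch of how a header is passed to fetch in Node 18 or later. It uses the standard Accept header, since "Application-Type" is not a standard HTTP header. The names buildHtmlRequestOptions and fetchHtml are illustrative and do not come from this thread. A header only changes what the server sends back, so this alone will probably not recover elements that the page creates with JavaScript.

```javascript
// Sketch: passing request headers to fetch (Node 18+, where fetch is global).
// buildHtmlRequestOptions and fetchHtml are illustrative names, not from the thread.
function buildHtmlRequestOptions() {
  return {
    headers: {
      // "Accept" is the standard header for stating the preferred response type.
      Accept: "text/html",
    },
  };
}

async function fetchHtml(url) {
  const response = await fetch(url, buildHtmlRequestOptions());
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  // Resolves to the raw HTML exactly as the server sent it.
  return response.text();
}

// Usage (performs a real network request):
// fetchHtml("https://web.telegram.org/k/").then(html => console.log(html.length));
```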

Answer 3

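I have learned that for scraping Telegram Web we cannot use traditional JavaScript code or simple Python libraries. In this case we have to use Selenium and WebDriver, which I am looking into now. Any better suggestions would be much appreciated.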
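A rough sketch of the Selenium route in Node.js, assuming the selenium-webdriver package (npm i selenium-webdriver) and a local Chrome installation. The idea is to let a real browser run the page's scripts and then read the resulting DOM. READY_SELECTOR is a hypothetical placeholder and the function names are illustrative. You would need to replace the selector with one that matches an element on the page you actually want. Telegram Web also requires logging in, so without a session the rendered page is likely to be the login screen.

```javascript
// Sketch: read the DOM after the page's scripts have run, using selenium-webdriver.
// Assumes `npm i selenium-webdriver` and a local Chrome installation.

// Hypothetical placeholder: replace with a selector for an element that only
// exists once the part of the page you need has rendered.
const READY_SELECTOR = "#replace-with-a-real-selector";

// Works with any object exposing the WebDriver methods used below.
async function getRenderedHtml(driver, url, readySelector, timeoutMs) {
  await driver.get(url);
  // Poll until at least one element matches the selector, or time out.
  await driver.wait(async () => {
    const matches = await driver.findElements({ css: readySelector });
    return matches.length > 0;
  }, timeoutMs);
  // Serialize the live DOM, including elements added by JavaScript.
  return driver.executeScript("return document.documentElement.outerHTML;");
}

async function scrapeTelegramWeb() {
  // Required lazily so the helper above can be used without the package.
  const { Builder } = require("selenium-webdriver");
  const driver = await new Builder().forBrowser("chrome").build();
  try {
    return await getRenderedHtml(
      driver,
      "https://web.telegram.org/k/",
      READY_SELECTOR,
      30000
    );
  } finally {
    // Always close the browser, even if the wait times out.
    await driver.quit();
  }
}

// Run with: node scrape.js --run
if (process.argv.includes("--run")) {
  scrapeTelegramWeb()
    .then(html => console.log(html))
    .catch(console.error);
}
```

Keeping getRenderedHtml separate from the driver setup means the same helper can be reused with a different browser or with a driver that has already been logged in.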

3 answers

Answer 1
0 2022-08-28 16:12:48

Answer 2
0 2022-08-28 19:38:16

Answer 3
0 Accepted 2022-09-01 01:22:55
