
Content-script extracting data from HTML?

I am trying to parse and extract email data from my Gmail account using a content script, but because Gmail uses dynamically generated DOM paths, the script fails on reload when the paths change. Here is my code:

function extractData() {
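    // XPath expressions need document.evaluate; querySelector only accepts CSS selectors.
    var email = document.evaluate('//*[@id=":1u"]/div[2]/table/tbody/tr/td[2]/table[1]/tbody/tr/td/table/tbody/tr[3]/td/table[2]/tbody/tr/td[2]/div/span[3]/span/a', document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;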
    email = email.textContent;
    var amount = document.querySelector("#\\:2k > div:nth-child(2) > table > tbody > tr > td:nth-child(2) > table:nth-child(1) > tbody > tr > td > table > tbody > tr:nth-child(3) > td > table:nth-child(2) > tbody > tr > td:nth-child(2) > div > span:nth-child(4)");
    amount = amount.textContent;
    const regex = /(\$[0-9,]+(\.[0-9]{2})?)/;
    amount = amount.match(regex);
    amount = amount[0].replace('$', '');

    var date = document.querySelector("#\\:2k > div:nth-child(2) > table > tbody > tr > td:nth-child(2) > table:nth-child(1) > tbody > tr > td > table > tbody > tr:nth-child(3) > td > table:nth-child(1) > tbody > tr > td:nth-child(4) > span > span:nth-child(1)");
    date = date.textContent;
    date = date.split(' ');
    date = date[0];

    console.log(email + amount + date);

} 

How can I overcome this? I suspect that using a regex to extract the relevant data from the HTML could be an answer, but regex is out of my league. The data I need to extract looks like this:

You received a payment of {$10.00} USD from {NameHere} ({emailHere})

I need to extract the data between the curly braces.

If the email text always has exactly the structure you showed, you can do this:

const text = "You received a payment of $10.00 USD from Some Person (mail@mail.com)";
const regex = /You received a payment of (.*) USD from (.*) \((.*)\)/;
const matches = text.match(regex);
const amount = matches[1];
const name = matches[2];
const email = matches[3];
console.log(amount, name, email);
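The snippet above throws if the text does not match, because `text.match` returns `null`. A minimal sketch of a more defensive variant follows. The helper name `parsePayment` is hypothetical, and the tighter pattern assumes that the amount is a dollar figure and that the email contains no spaces or parentheses:

```javascript
// Hypothetical helper (not from the original answer): parses one payment notice.
// Assumes the wording "You received a payment of $X USD from NAME (EMAIL)".
function parsePayment(text) {
  const pattern =
    /You received a payment of \$([\d,]+(?:\.\d{2})?) USD from (.+?) \(([^()\s]+@[^()\s]+)\)/;
  const match = text.match(pattern);
  if (match === null) {
    return null; // wording not found, so the caller can skip this message
  }
  return {
    amount: match[1].replace(/,/g, ""), // drop thousands separators
    name: match[2],
    email: match[3],
  };
}

const sample = "You received a payment of $10.00 USD from Some Person (mail@mail.com)";
console.log(parsePayment(sample));
console.log(parsePayment("Unrelated text"));
```

In a content script, the input could be the visible page text (for example `document.body.innerText`) rather than a node reached through a fixed DOM path. That avoids depending on Gmail's generated ids, though it is an untested assumption that the notice appears in the page text in exactly this form.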

If the text varies, you can find the individual parts this way (although I wouldn't recommend it, especially for the name):

let text = "You received a payment of $10.00 USD from Some Person (mail@mail.com). But somemail@hotmail.com also sent $500.00 and then $15 more.";
const priceRegex = /\$\d+\.?\d*/g;
const nameRegex = /([A-Z][a-z]+ [A-Z][a-z]+)/g;
const emailRegex = /([a-zA-Z0-9._-]+@[a-zA-Z0-9]+\.[a-zA-Z0-9._-]{2,4}(\.[a-zA-Z0-9._-]+)?)/g;
console.log( text.match(priceRegex) );
console.log( text.match(nameRegex) );
console.log( text.match(emailRegex) );
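The independent patterns pick up every amount and email in the text, including ones that do not belong to a payment notice. If the notice sentence itself is stable and only the surrounding text varies, one possible middle ground is to keep the sentence as an anchor and scan for every occurrence. This is only a sketch: `findPayments` is a hypothetical name, and the pattern assumes the same wording as in the question.

```javascript
// Hypothetical helper: collects every payment notice found in a longer text.
function findPayments(text) {
  const pattern =
    /You received a payment of (\$[\d,]+(?:\.\d{2})?) USD from (.+?) \(([^()\s]+@[^()\s]+)\)/g;
  // matchAll requires the g flag and yields one match object per occurrence.
  return Array.from(text.matchAll(pattern), (m) => ({
    amount: m[1],
    name: m[2],
    email: m[3],
  }));
}

const longText =
  "You received a payment of $10.00 USD from Some Person (mail@mail.com). " +
  "But somemail@hotmail.com also sent $500.00 and then $15 more.";
console.log(findPayments(longText));
```

Here the stray address and amounts in the second sentence are ignored because they are not part of a full notice, and the name is taken from its position in the sentence instead of being guessed from capitalization.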
