Content-script extracting data from HTML?
I am trying to parse and extract email data from my Gmail account using a content script, but since Gmail uses dynamically generated DOM paths, the script fails on reload because the paths change. Here is my code:
function extractData() {
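var email = document.evaluate('//*[@id=":1u"]/div[2]/table/tbody/tr/td[2]/table[1]/tbody/tr/td/table/tbody/tr[3]/td/table[2]/tbody/tr/td[2]/div/span[3]/span/a', document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue; // an XPath expression needs document.evaluate, not querySelector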
email = email.textContent;
var amount = document.querySelector("#\\:2k > div:nth-child(2) > table > tbody > tr > td:nth-child(2) > table:nth-child(1) > tbody > tr > td > table > tbody > tr:nth-child(3) > td > table:nth-child(2) > tbody > tr > td:nth-child(2) > div > span:nth-child(4)");
amount = amount.textContent;
const regex = /(\$[0-9,]+(\.[0-9]{2})?)/;
amount = amount.match(regex);
amount = amount[0].replace('$', '');
var date = document.querySelector("#\\:2k > div:nth-child(2) > table > tbody > tr > td:nth-child(2) > table:nth-child(1) > tbody > tr > td > table > tbody > tr:nth-child(3) > td > table:nth-child(1) > tbody > tr > td:nth-child(4) > span > span:nth-child(1)");
date = date.textContent;
date = date.split(' ');
date = date[0];
console.log(email + amount + date);
}
How can I overcome this? I guess using a regex to extract the relevant data from the HTML could be an answer, but regex is out of my league. The data I need to extract looks like this:
You received a payment of {$10.00} USD from {NameHere} ({emailHere})
I need to extract the data between the curly braces.
If the email text always has exactly the structure you showed, you can do this:
const text = "You received a payment of $10.00 USD from Some Person (mail@mail.com)";
const regex = /You received a payment of (.*) USD from (.*) \((.*)\)/;
const matches = text.match(regex);
const amount = matches[1];
const name = matches[2];
const email = matches[3];
console.log(amount, name, email);
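To connect this back to the content script: instead of addressing elements by their generated ids, the same kind of pattern can be run over the page's visible text. The following is only a minimal sketch under some assumptions: `parsePayments` is a hypothetical helper name, the sentence is assumed to appear on one line with single spaces in `document.body.innerText`, and the amount is assumed to be in dollars. It uses named capture groups and returns an empty array when nothing matches, so there is no `null` to index into.

```javascript
// Hypothetical helper: collect every "You received a payment of ..." sentence in a text.
function parsePayments(text) {
  // [^()]+? keeps the name from swallowing the parenthesised email address.
  const regex = /You received a payment of \$(?<amount>[\d,]+(?:\.\d{2})?) USD from (?<name>[^()]+?) \((?<email>[^\s()]+@[^\s()]+)\)/g;
  const results = [];
  for (const match of text.matchAll(regex)) {
    const { amount, name, email } = match.groups;
    // Drop thousands separators so the amount is easy to convert to a number.
    results.push({ amount: amount.replace(/,/g, ''), name, email });
  }
  return results;
}

// In a content script the text can come from the whole page, so no generated
// element id is involved (guarded so the snippet also runs outside a browser).
if (typeof document !== 'undefined' && document.body) {
  console.log(parsePayments(document.body.innerText));
}
```

Because the lookup depends only on the wording of the sentence, it should keep working when Gmail regenerates its ids on reload, as long as the message is actually rendered when the script runs.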
If the text varies, you can find the individual parts this way (although I wouldn't recommend doing this, especially for the name):
let text = "You received a payment of $10.00 USD from Some Person (mail@mail.com). But somemail@hotmail.com also sent $500.00 and then $15 more.";
const priceRegex = /\$\d+\.?\d*/g;
const nameRegex = /([A-Z][a-z]+ [A-Z][a-z]+)/g;
const emailRegex = /([a-zA-Z0-9._-]+@[a-zA-Z0-9]+\.[a-zA-Z0-9._-]{2,4}(\.[a-zA-Z0-9._-]+)?)/g;
console.log(text.match(priceRegex));
console.log(text.match(nameRegex));
console.log(text.match(emailRegex));