
Extract data from Gmail and add to a spreadsheet - Google Apps Script

I have searched, copied and modified code, and tried to break down what others have done, and I still can't get this right.

I have email receipts for an ecommerce website, and I am trying to harvest particular details from each email and save them to a spreadsheet with a script.

Here is the entire script as I have it now.

function menu(e) {
  var ui = SpreadsheetApp.getUi();
  ui.createMenu('programs')
      .addItem('parse mail', 'grabreceipt')
      .addToUi();
}

function grabreceipt() {

  var ss = SpreadsheetApp.getActiveSheet();
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var s = ss.getSheetByName("Sheet1");
  var threads = GmailApp.search("(subject:order receipt) and (after:2016/12/01)");
  var a=[];
  for (var i = 0; i<threads.length; i++)
  {
    var messages = threads[i].getMessages();

    for (var j=0; j<messages.length; j++)
    {
    var messages = GmailApp.getMessagesForThread(threads[i]);
    for (var j = 0; j < messages.length; j++) {
      a[j]=parseMail(messages[j].getPlainBody());
    }
  }
  var nextRow=s.getDataRange().getLastRow()+1;
  var numRows=a.length;
  var numCols=a[0].length;
  s.getRange(nextRow,1,numRows,numCols).setValues(a);
}

function parseMail(body) {
  var a=[];
  var keystr="Order #,Subtotal:,Shipping:,Total:";
  var keys=keystr.split(",");
  var i,p,r;
  for (i in keys)  {
    //p=keys[i]+(/-?\d+(,\d+)*(\.\d+(e\d+)?)?/);
    p=keys[i]+"[\r\n]*([^\r^\n]*)[\r\n]";
    //p=keys[i]+"[\$]?[\d]+[\.]?[\d]+$";
    r=new RegExp(p,"m");
    try {a[i]=body.match(p)[1];}
    catch (err) {a[i]="no match";}
  }
  return a;
}
}

The email data to pluck from arrives as plain text like this:

Order #89076
(body content, omitted)
Subtotal: $528.31
Shipping: $42.66 via Priority Mail®
Payment Method: Check Payment- Money order
Total: $570.97

Note: mywebsite order 456. Customer asked about this and that... etc.

The regex in the original code was designed to grab the content following each keystr value, which was easy when each value sat on its own line. So this made sense:

p=keys[i]+"[\r\n]*([^\r^\n]*)[\r\n]";

This works okay, but the results include the extra data that follows on lines such as Shipping: $42.66 via Priority Mail®.

My data is more blended, and I only wish to take numbers, or numbers and decimals. So I have this instead, which validates on regex101.com:

p=keys[i]+"[\$]?[\d]+[\.]?\d+$";

The expression on its own, [\$]?[\d]+[.]?\d+$, works great, but I still get "no match" for each row.

Additionally, this search returns 22 threads, yet it populates 39 rows in the spreadsheet. I cannot figure out why 39.

The reason your regex does not work as expected is that you are not escaping the "\" in the string you use to create the regex.

So a regex like this

"\s?\$?(\d+\.?\d+)"

needs to be escaped like so:

"\\s?\\$?(\\d+\\.?\\d+)"

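To see the effect concretely, here is a minimal sketch using the pattern from the question. It assumes an ordinary JavaScript runtime; the variable names and the sample string are made up for illustration.

```javascript
// In a string literal, an escape JavaScript does not recognize (such as "\d")
// collapses to the plain character, so the backslash never reaches RegExp.
var unescaped = "[\$]?[\d]+[\.]?\d+$";      // string value: [$]?[d]+[.]?d+$
var escaped = "[\\$]?[\\d]+[\\.]?\\d+$";    // string value: [\$]?[\d]+[\.]?\d+$

var sample = "$570.97"; // illustrative input, not taken from a real email

console.log(unescaped);                           // [$]?[d]+[.]?d+$
console.log(escaped);                             // [\$]?[\d]+[\.]?\d+$
console.log(new RegExp(unescaped).test(sample));  // false: it looks for literal "d" characters
console.log(new RegExp(escaped).test(sample));    // true
```

A regex literal such as /[\$]?[\d]+[\.]?\d+$/ needs no doubling, which is why the pattern validates on a regex tester. The doubling only applies when the pattern is written as a string and passed to new RegExp, as it is here because each key is concatenated onto it.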
The code below is your parseMail(), modified only so that it runs as a snippet here. If you copy it into your Apps Script code, delete the document.getElementById() lines.

You can try your example in the snippet below; it will give you only the numbers.

function parseMail(body) {
  // Browser snippet only: fall back to the textarea when no body is passed in.
  if (body == "" || body == undefined) {
    body = document.getElementById("input").value;
  }
  var a = [];
  var keystr = "Order #,Subtotal:,Shipping:,Total:";
  var keys = keystr.split(",");
  var i, p, r;
  for (i in keys) {
    // Backslashes are doubled because the pattern is built from a string.
    p = keys[i] + "\\s?\\$?(\\d+\\.?\\d+)";
    r = new RegExp(p, "m");
    try { a[i] = body.match(r)[1]; }
    catch (err) { a[i] = "no match"; }
  }
  // Browser snippet only: show the result on the page.
  document.getElementById("output").innerHTML = a.join(";");
  return a;
}

<textarea id="input"></textarea>
<div id="output"></div>
<input type="button" value="Parse" onclick="parseMail()">

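For use outside the browser, a DOM-free variant might look like the sketch below. It is not the answerer's code: parseMailPlain and escapeRegExp are hypothetical names, the number pattern is loosened slightly to \d+(?:\.\d+)? so that single-digit values also match, and the sample body is abridged from the receipt in the question.

```javascript
// Hypothetical helper: escape regex metacharacters so a key is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical DOM-free variant of parseMail: returns one value per key.
function parseMailPlain(body) {
  var keys = ["Order #", "Subtotal:", "Shipping:", "Total:"];
  var row = [];
  for (var i = 0; i < keys.length; i++) {
    // Skip optional whitespace and "$", then capture digits with an optional decimal part.
    var re = new RegExp(escapeRegExp(keys[i]) + "\\s*\\$?(\\d+(?:\\.\\d+)?)");
    var m = body.match(re);
    row.push(m ? m[1] : "no match");
  }
  return row;
}

// Sample text abridged from the receipt shown in the question.
var sampleBody = [
  "Order #89076",
  "Subtotal: $528.31",
  "Shipping: $42.66 via Priority Mail",
  "Payment Method: Check Payment- Money order",
  "Total: $570.97"
].join("\n");

console.log(parseMailPlain(sampleBody)); // [ '89076', '528.31', '42.66', '570.97' ]
```

Checking the match result for null replaces the try/catch. Note that "Total:" does not match inside "Subtotal:" only because the search is case-sensitive; anchoring each key to the start of a line would be a more robust choice if the email layout allows it.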
Hope that helps
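On the row count, which the answer above does not address: reading the braces exactly as posted, the setValues call sits inside the outer thread loop, and the array a is never reset between threads. If that reading is right, the script appends rows once per thread rather than once per message, and a keeps the length of the largest thread seen so far. The sketch below simulates that behaviour and shows one possible restructuring; both function names are hypothetical, and plain arrays stand in for Gmail threads.

```javascript
// Hypothetical simulation of the posted loop: given the number of messages in
// each thread, count how many rows would be appended to the sheet.
function rowsWrittenAsPosted(messageCounts) {
  var a = [];       // shared across threads and never reset, as in the post
  var written = 0;
  for (var i = 0; i < messageCounts.length; i++) {
    for (var j = 0; j < messageCounts[i]; j++) {
      a[j] = "row"; // a[j] overwrites, so a only grows to the largest thread so far
    }
    written += a.length; // stands in for setValues(a) running once per thread
  }
  return written;
}

// Hypothetical restructuring: collect one row per message, to be written once
// after both loops. threadBodies is an array of arrays of plain-text bodies.
function collectRows(threadBodies, parse) {
  var rows = [];
  for (var i = 0; i < threadBodies.length; i++) {
    for (var j = 0; j < threadBodies[i].length; j++) {
      rows.push(parse(threadBodies[i][j]));
    }
  }
  return rows;
}
```

As an illustration with made-up counts, 22 threads in which the sixth has two messages and the rest have one would give 5 + 2 + 16 × 2 = 39 rows under this simulation; the real mailbox may well differ. In Apps Script, threadBodies would be built from threads[i].getMessages() and getPlainBody(), and the collected rows written with a single setValues call.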
