
How can I write a simple pegjs grammar for this text file?

I just want to split this text file into lines and classify each line. If a line starts with "Qty", the lines that follow are order items, up to the line that starts with "GST".

If a line starts with "Total Amount", it is the total amount line.

Business me . ' l
Address "rwqagePnnter Pro DemcRafifilp
Address "mfgr Eva|uat|on Only
Contact line 1
Transaction Number 10006
Issue Date 27/02/201
Time 10:36:55
Salesperson orsa orsa
Qty Description Unit Price Total
1 test $120.00 $120.00
GST $10.91
Total Amount $120.00
Cash $120.00
Please contact us for more information about
this receipt.
Thank you for your business.
d
.
test

Please show me how to do this with PegJS: http://pegjs.majda.cz/

Here's a quick and dirty sample solution:

{
  var in_quantity = false // Track whether or not we are in a quantity block
  var quantity    = []
  var gst         = null
  var total       = null
}

start =
  // look for a quantity, then GST, then a total and finally anything else
  (quantity / gst / total / line)+
  {
    return {quantity: quantity, gst: gst, total: total}
  }

chr = [^\n]
eol = "\n"?

quantity   = "Qty" chr+ eol        { in_quantity = true; }
gst        = "GST" g:chr+ eol      { in_quantity = false; gst = g.join('').trim(); }
total      = "Total Amount" t:line { in_quantity = false; total = t.trim(); }

line =
  a:chr+ eol
  {
    if( in_quantity ){
      // break quantities into columns based on tabs
      quantity.push( a.join('').split(/[\t]/) );
    }
    return a.join('');
  }
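
The grammar above splits each order line on tabs. If the receipt text turns out to be space-separated, as the sample in the question appears to be, the columns could instead be recovered in a post-processing step. A minimal sketch in plain JavaScript, assuming every order line has the form `<qty> <description> $<unit price> $<total>`; the helper name `parseOrderLine` is hypothetical and not part of PEG.js:

```javascript
// Hypothetical helper: split one order line into named columns.
// Assumes the layout "<qty> <description> $<unit price> $<total>".
function parseOrderLine(line) {
  // qty = leading digits, description = anything up to the first "$" amount,
  // then two "$"-prefixed amounts separated by whitespace.
  var m = /^\s*(\d+)\s+(.+?)\s+\$([\d.,]+)\s+\$([\d.,]+)\s*$/.exec(line);
  if (!m) {
    return null; // the line does not look like an order item
  }
  return {
    qty: parseInt(m[1], 10),
    description: m[2],
    unitPrice: m[3], // kept as strings to avoid floating-point rounding
    total: m[4]
  };
}
```

Each string returned by the `line` rule could be passed through such a helper; lines that do not match the assumed layout yield `null` and can be kept as raw text.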

How about the following code as another solution?

{
  var result = [];
}

start
  = (!QTY AnyLine /
      set:(Quantities TotalAmount)
        {result.push({orders:set[0], total:set[1]})}
    )+ (Chr+)?
  {return result;}

QTY = "Qty"
GST = "GST"

Quantities
  = QtyLine order:(OrderLine*) GSTLine {return order;}

QtyLine
  = QTY Chr* _

OrderLine
  = !GST ch:(Chr+) _ {return ch.join('');}

GSTLine
  = GST Chr* _

TotalAmount
  = "Total Amount" total:(Chr*) _ {return total.join('');}

AnyLine
  = ch:(Chr*) _ {return ch.join('');}

Chr
  = [^\n]
_
  = "\n"

You could use XML, or you could end every line with a "/" and then split the text on that character using the split function.

mytext = mytext.split("/");

And then work with that. I don't know why you wouldn't just use SQL or something similar.
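
If a split-based approach is preferred over a grammar, the classification described in the question can be sketched in plain JavaScript by splitting on newlines, without adding a "/" delimiter first. The function name `classifyReceipt` and the shape of the returned object are assumptions made for illustration:

```javascript
// Hypothetical sketch: classify receipt lines without a parser generator.
function classifyReceipt(text) {
  var result = { items: [], gst: null, total: null, other: [] };
  var inItems = false; // true between the "Qty" header and the "GST" line

  text.split(/\r?\n/).forEach(function (line) {
    if (line.trim() === "") {
      return; // ignore blank lines
    }
    if (line.indexOf("Qty") === 0) {
      inItems = true; // the header line opens the item block
    } else if (line.indexOf("GST") === 0) {
      inItems = false; // the "GST" line closes the item block
      result.gst = line.slice("GST".length).trim();
    } else if (line.indexOf("Total Amount") === 0) {
      inItems = false;
      result.total = line.slice("Total Amount".length).trim();
    } else if (inItems) {
      result.items.push(line);
    } else {
      result.other.push(line);
    }
  });
  return result;
}
```

This trades the grammar's structure for a single pass with one flag, which may be enough when the only goal is to pick out the order items, the GST line and the total.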
