
Regex to read Invoice Line Details from line with space delimiter and spaces in description

To any regex gurus: I am trying to write a regex that reads the values in an invoice line and returns them in named groups, as described below.

The invoice lines look like this:

ABC08-388 THIS IS DECSCRIPTION WITH SPACES AND APOSTROPIES 80’s ctn 1 1 0 99.90 99.90 9.99 109.89
1233 ANOTHERLINE W/O APOSTROPHEIES each 100 100 0 1.05 105.00 10.50 115.50
XYZ-1234 ANOTEHR LINE WITH APOSTROPHE’S AND SLASH/S box 1 1 0 8.60 8.60 0.00 8.60

The separation is:

Part Number - From Start of line until the first space 
Description - Everything between Part Number and Box Description
Box Description - From end of Description to next group (Space separator)
Qty Ordered - Integer (Space separator)
Qty Delivered - Integer (Space separator)
Qty Back Order - Integer (Space separator)
Box Cost - Decimal number (Space separator)
Line Total Ex Tax - Decimal number (Space separator)
Line Tax - Decimal number (Space separator)
Line Total Incl Tax - Decimal number (up to end of line)

I am looking for something along the lines of the following, but I just can't get it all working. Any help will be greatly appreciated.

^(?<SupplierPartNumber>([A-Za-z0-9-_]+)) (?<SupplierDescription>([.])).(?<BoxQty>([0-9]+([\,\.][0-9]+)){1}(?<DeliveredQty>([0-9]+([\,\.][0-9]+)){1}(?<OnBackOrder>([0-9]+([\,\.][0-9]+)){1} (?<BoxCost>([0-9]+([\,\.][0-9]+)){1}(?<LineTotalEx>([0-9]+([\,\.][0-9]+)){1}(?<GSTAmount>([0-9]+([\,\.][0-9]+)){1} (?<LineTotalInc>([0-9]+([\,\.][0-9]+)){1}

Take a look at this; hopefully it helps. You may need to adjust the individual groups to match the exact format of each field, but it should show the idea.

(?<SupplierPartNumber>^[A-Za-z\d_-]+)\s(?<Description>[a-zA-Z\s\d’\/]+[a-zA-Z])\s(?<BoxQty>\d+)\s(?<DeliveredQty>\d+)\s(?<OnBackOrder>\d+)\s(?<BoxCost>\d+\.\d+)\s(?<LineTotalExTax>\d+\.\d+)\s(?<LineTaxDecimal>\d+\.\d+)\s(?<LineTotal>\d+\.\d+$)

Breaking the regex above down by requirement, so it is easier to follow:

(?<SupplierPartNumber>^[A-Za-z\d_-]+)\s
(?<Description>[a-zA-Z\s\d’\/]+[a-zA-Z])\s
(?<BoxQty>\d+)\s
(?<DeliveredQty>\d+)\s
(?<OnBackOrder>\d+)\s
(?<BoxCost>\d+\.\d+)\s
(?<LineTotalExTax>\d+\.\d+)\s
(?<LineTaxDecimal>\d+\.\d+)\s
(?<LineTotal>\d+\.\d+$)

Regex Demo to see in action.
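To try the combined pattern outside a demo page, here is a minimal Python sketch run against the three sample lines from the question. Two things are assumptions rather than part of the answer: Python's `re` module writes named groups as `(?P<name>...)` instead of `(?<name>...)`, so the syntax is adapted, and the names `COMBINED_PATTERN` and `parse_combined` are made up for illustration. In this sketch every decimal point is escaped (`\.`) and the hyphen sits at the end of the character class so it is read as a literal.

```python
import re

# Python spells named groups (?P<name>...); (?<name>...) is not accepted by `re`.
COMBINED_PATTERN = re.compile(
    r"^(?P<SupplierPartNumber>[A-Za-z\d_-]+)\s"
    r"(?P<Description>[a-zA-Z\s\d’/]+[a-zA-Z])\s"  # description plus box description
    r"(?P<BoxQty>\d+)\s"
    r"(?P<DeliveredQty>\d+)\s"
    r"(?P<OnBackOrder>\d+)\s"
    r"(?P<BoxCost>\d+\.\d+)\s"
    r"(?P<LineTotalExTax>\d+\.\d+)\s"
    r"(?P<LineTaxDecimal>\d+\.\d+)\s"
    r"(?P<LineTotal>\d+\.\d+)$"
)

# The three sample invoice lines from the question.
COMBINED_SAMPLES = [
    "ABC08-388 THIS IS DECSCRIPTION WITH SPACES AND APOSTROPIES 80’s ctn 1 1 0 99.90 99.90 9.99 109.89",
    "1233 ANOTHERLINE W/O APOSTROPHEIES each 100 100 0 1.05 105.00 10.50 115.50",
    "XYZ-1234 ANOTEHR LINE WITH APOSTROPHE’S AND SLASH/S box 1 1 0 8.60 8.60 0.00 8.60",
]


def parse_combined(line):
    """Return the named groups as a dict of strings, or None if the line does not match."""
    match = COMBINED_PATTERN.match(line)
    return match.groupdict() if match else None


for sample in COMBINED_SAMPLES:
    print(parse_combined(sample))
```

With this pattern the box description (`ctn`, `each`, `box`) stays at the tail of `Description`; the second line, for example, yields `ANOTHERLINE W/O APOSTROPHEIES each`.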

You'll notice I've combined the two descriptions into one in the solution above, because it wasn't clear to me where Description ends and Box Description starts. Assuming from your examples that Description contains only capital letters, the regex could look like:

(?<SupplierPartNumber>^[A-Za-z\d-]+)\s(?<Description>[A-Z\s\d’\/]+[A-Z])\s(?<BoxDescription>[a-zA-Z\s\d’\/]+[a-zA-Z])\s(?<BoxQty>\d+)\s(?<DeliveredQty>\d+)\s(?<OnBackOrder>\d+)\s(?<BoxCost>\d+\.\d+)\s(?<LineTotalExTax>\d+\.\d+)\s(?<LineTaxDecimal>\d+\.\d+)\s(?<LineTotal>\d+\.\d+$)

(?<SupplierPartNumber>^[A-Za-z\d-]+)\s
(?<Description>[A-Z\s\d’\/]+[A-Z])\s
(?<BoxDescription>[a-zA-Z\s\d’\/]+[a-zA-Z])\s
(?<BoxQty>\d+)\s
(?<DeliveredQty>\d+)\s
(?<OnBackOrder>\d+)\s
(?<BoxCost>\d+\.\d+)\s
(?<LineTotalExTax>\d+\.\d+)\s
(?<LineTaxDecimal>\d+\.\d+)\s
(?<LineTotal>\d+\.\d+$)

Regex Demo for the above case.
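The split version can be checked the same way. This is a hedged Python sketch, again using `(?P<name>...)` because that is how Python's `re` spells named groups; `SPLIT_PATTERN` and `parse_split` are illustrative names, not something from the answer.

```python
import re

# Same structure as the split regex, adapted to Python's (?P<name>...) syntax.
SPLIT_PATTERN = re.compile(
    r"^(?P<SupplierPartNumber>[A-Za-z\d-]+)\s"
    r"(?P<Description>[A-Z\s\d’/]+[A-Z])\s"            # must end in a capital letter
    r"(?P<BoxDescription>[a-zA-Z\s\d’/]+[a-zA-Z])\s"   # whatever is left before the numbers
    r"(?P<BoxQty>\d+)\s"
    r"(?P<DeliveredQty>\d+)\s"
    r"(?P<OnBackOrder>\d+)\s"
    r"(?P<BoxCost>\d+\.\d+)\s"
    r"(?P<LineTotalExTax>\d+\.\d+)\s"
    r"(?P<LineTaxDecimal>\d+\.\d+)\s"
    r"(?P<LineTotal>\d+\.\d+)$"
)

# The three sample invoice lines from the question.
SPLIT_SAMPLES = [
    "ABC08-388 THIS IS DECSCRIPTION WITH SPACES AND APOSTROPIES 80’s ctn 1 1 0 99.90 99.90 9.99 109.89",
    "1233 ANOTHERLINE W/O APOSTROPHEIES each 100 100 0 1.05 105.00 10.50 115.50",
    "XYZ-1234 ANOTEHR LINE WITH APOSTROPHE’S AND SLASH/S box 1 1 0 8.60 8.60 0.00 8.60",
]


def parse_split(line):
    """Return the named groups as a dict of strings, or None if the line does not match."""
    match = SPLIT_PATTERN.match(line)
    return match.groupdict() if match else None


for sample in SPLIT_SAMPLES:
    row = parse_split(sample)
    print(repr(row["Description"]), "|", repr(row["BoxDescription"]))
```

On the second and third sample lines the box description comes out as `each` and `box`. On the first line the lowercase `s` in `80’s` ends the capitals-only `Description` early, so `BoxDescription` becomes `80’s ctn`; that is the kind of case where the groups need adjusting to the real data.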

You'll know better what the separation is between Description and Box Description, so edit the corresponding groups as required. Let me know if you need any more help with this.
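As one possible adjustment, the sketch below follows the separation list in the question literally: the part number runs to the first space, Box Description is assumed to be a single token without spaces, and Description is whatever lies between them (matched lazily). That single-token assumption is mine and may not hold for every invoice. The names `SPEC_PATTERN` and `parse_by_spec` are hypothetical, the pattern is written in Python's `(?P<name>...)` syntax, and the numeric conversion at the end is an optional extra.

```python
import re
from decimal import Decimal

# Assumption: Box Description is exactly one whitespace-free token (ctn, each, box).
SPEC_PATTERN = re.compile(
    r"^(?P<SupplierPartNumber>\S+)\s+"   # start of line up to the first space
    r"(?P<Description>.+?)\s+"           # lazy: stops as soon as the rest can match
    r"(?P<BoxDescription>\S+)\s+"
    r"(?P<BoxQty>\d+)\s+"
    r"(?P<DeliveredQty>\d+)\s+"
    r"(?P<OnBackOrder>\d+)\s+"
    r"(?P<BoxCost>\d+\.\d+)\s+"
    r"(?P<LineTotalExTax>\d+\.\d+)\s+"
    r"(?P<LineTaxDecimal>\d+\.\d+)\s+"
    r"(?P<LineTotal>\d+\.\d+)$"
)

INT_FIELDS = ("BoxQty", "DeliveredQty", "OnBackOrder")
MONEY_FIELDS = ("BoxCost", "LineTotalExTax", "LineTaxDecimal", "LineTotal")


def parse_by_spec(line):
    """Parse one invoice line; quantities become int, amounts become Decimal."""
    match = SPEC_PATTERN.match(line)
    if match is None:
        return None
    row = match.groupdict()
    for name in INT_FIELDS:
        row[name] = int(row[name])
    for name in MONEY_FIELDS:
        row[name] = Decimal(row[name])
    return row


first = parse_by_spec(
    "ABC08-388 THIS IS DECSCRIPTION WITH SPACES AND APOSTROPIES 80’s ctn 1 1 0 99.90 99.90 9.99 109.89"
)
print(first["Description"], "|", first["BoxDescription"], "|", first["LineTotal"])
```

Under that assumption the first sample line keeps `80’s` inside `Description` and yields `ctn` as the box description. `Decimal` is used for the amounts so that values such as `99.90` are not subject to binary floating-point rounding.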


 