Matching 3 strings with a regular expression

Question

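I have the following text: Invoice n.ro per 006390 BENETTON RUSSIA OOO 2019 0051035408

I need to check whether the text contains Invoice and 2019 (4 digits), with n more digits after those 4 digits. I would like to read the Invoice name, skip the first line, and then get the element on the second line, as shown below: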


    File file = new File(this.fileName); // create a File object from the String path
    final Pattern invoice = Pattern.compile("^Invoice n ([0-9])+$"); // regular expression for the text we are looking for

    PDDocument pdDocument = PDDocument.load(file); // load the PDF from that file
    Splitter splitter = new Splitter(); // takes care of splitting the document into pages
    PDFTextStripper stripper = new PDFTextStripper(); // strips the text and ignores all formatting
    Matcher matcher;
    String resultInvoiceNumber = "";

    List<PDDocument> split = splitter.split(pdDocument); // one PDDocument per page

    for (PDDocument pd : split) { // loop over the list of split pages
        String s = stripper.getText(pd); // text of a single page, kept in a String for further processing

Answer 1

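The question has since been edited, but for the original string, which contained a line break, you can match n\. and then everything up to the end of the line. Then use \R to match a Unicode newline sequence, \h+ to match one or more horizontal whitespace characters, and finally match the digits.

The digits at the end of the second line are in capture group 1.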

^Invoice n\..*\R\h+[0-9]{4} ([0-9]+)$

Regex demo | Java demo

In Java

String regex = "^Invoice n\\..*\\R\\h+[0-9]{4} ([0-9]+)$";
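
As a rough sketch of how this pattern could be applied, the snippet below runs it against a two-line sample. The sample string (including the indentation on the second line), the class name `InvoiceNumberSketch`, and the helper `extractNumber` are assumptions made for illustration. `Pattern.MULTILINE` is added here so that `^` and `$` also match at line boundaries when the invoice lines sit inside a longer page of text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InvoiceNumberSketch {

    // Same regex as above; MULTILINE lets ^ and $ match at each line boundary.
    private static final Pattern INVOICE = Pattern.compile(
            "^Invoice n\\..*\\R\\h+[0-9]{4} ([0-9]+)$", Pattern.MULTILINE);

    // Hypothetical helper: returns the digits from capture group 1, or null if nothing matches.
    static String extractNumber(String text) {
        Matcher matcher = INVOICE.matcher(text);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        // Assumed layout: the year and number are on an indented second line.
        String sample = "Invoice n.ro per 006390 BENETTON RUSSIA OOO\n"
                + "    2019 0051035408";
        System.out.println(extractNumber(sample)); // prints 0051035408
    }
}
```

Note that `\h+` requires at least one space or tab at the start of the second line; if the extracted text has no indentation there, `\h*` may be the better fit.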

Answer 2

You can try something like this, based on groups:

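import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexpTest {

    public static void main(String[] args) {
        final String input = "Invoice n.ro per 006390 BENETTON RUSSIA OOO 2019 0051035408";
        // Group 1 (Invoice) is optional because of the *; group 2 is a 4-digit number followed by more digits.
        final Pattern pattern = Pattern.compile("(Invoice)*(\\s*\\d{4}\\s+\\d+\\s*)");

        final Matcher matcher = pattern.matcher(input);
        System.out.println(matcher.find());  // true if the pattern occurs anywhere in the input
        System.out.println(matcher.group()); // the whole matched text; valid only after a successful find()
    }
}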

Output:

true
 2019 0051035408
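
Because `(Invoice)*` may match zero times, the pattern above also succeeds on text that does not contain Invoice at all. A possible variation, sketched here under the assumption that the whole text is on one line, requires the word Invoice and captures the 4-digit number and the digits that follow it in separate groups. The class name, the `parse` helper, and the pattern itself are illustrative and not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InvoiceGroupsSketch {

    // "Invoice" is mandatory; group 1 is a standalone 4-digit number, group 2 the digits after it.
    private static final Pattern PATTERN =
            Pattern.compile("\\bInvoice\\b.*?\\b(\\d{4})\\s+(\\d+)\\b");

    // Hypothetical helper: returns {year, number}, or null when the text does not match.
    static String[] parse(String input) {
        Matcher matcher = PATTERN.matcher(input);
        if (!matcher.find()) {
            return null;
        }
        return new String[] { matcher.group(1), matcher.group(2) };
    }

    public static void main(String[] args) {
        String input = "Invoice n.ro per 006390 BENETTON RUSSIA OOO 2019 0051035408";
        String[] parts = parse(input);
        if (parts != null) {
            System.out.println(parts[0]); // prints 2019
            System.out.println(parts[1]); // prints 0051035408
        }
    }
}
```

The word boundaries keep `\d{4}` from matching inside a longer number such as 006390, which is why the first hit is 2019.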

Solution 1
2 2019-12-06 09:47:44

Solution 2
2 2019-12-06 09:48:01
