
Java regular expression to fetch a particular string from a line in a file

I need to read a file and fetch only the file names ending with .csv. The file will contain several lines like the ones below:

-dataFileName ABC.csv -command ii
-dataFileName EFG.csv -command ii
-dataFileName HIJ.csv -command ii
-dataFileName MNPQR.csv -command ii
-dataFileName UVXYZ.csv -command ii

We can see that the -dataFileName [ XXXX ] -command ii part is repetitive.

I want ABC.csv, EFG.csv, HIJ.csv, MNPQR.csv, UVXYZ.csv as my console output.

If you simply want to leverage the repetition of -dataFileName and -command ii in your strings, you can do this in Java with

replaceAll("-dataFileName| -command ii", "")

and write code something like this:

import java.util.Arrays;
import java.util.List;

public class Main {
    public static void main(String[] args) {
        List<String> list = Arrays.asList(
                "-dataFileName ABC.csv -command ii",
                "-dataFileName EFG.csv -command ii",
                "-dataFileName HIJ.csv -command ii",
                "-dataFileName MNPQR.csv -command ii",
                "-dataFileName UVXYZ.csv -command ii"
        );

        // Strip the two fixed flags; the space after -dataFileName stays in the result.
        list.forEach(x -> System.out.println(x + " --> " + x.replaceAll("-dataFileName| -command ii", "")));
    }
}

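This gives the following output (each file name keeps a leading space, because the pattern does not consume the space after -dataFileName; call trim() on the result to drop it):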

-dataFileName ABC.csv -command ii -->  ABC.csv
-dataFileName EFG.csv -command ii -->  EFG.csv
-dataFileName HIJ.csv -command ii -->  HIJ.csv
-dataFileName MNPQR.csv -command ii -->  MNPQR.csv
-dataFileName UVXYZ.csv -command ii -->  UVXYZ.csv
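
As a minimal sketch of the same replaceAll idea with the leading space trimmed and the names joined by commas, as the question asks for. The class name StripFlags and the method fileName are made up for illustration; the input lines are the ones from the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class StripFlags {

    /** Removes the two fixed flags, then trims the space left in front of the name. */
    public static String fileName(String line) {
        return line.replaceAll("-dataFileName| -command ii", "").trim();
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "-dataFileName ABC.csv -command ii",
                "-dataFileName EFG.csv -command ii",
                "-dataFileName HIJ.csv -command ii",
                "-dataFileName MNPQR.csv -command ii",
                "-dataFileName UVXYZ.csv -command ii"
        );

        // Join the cleaned names into one comma-separated line.
        String joined = lines.stream()
                .map(StripFlags::fileName)
                .collect(Collectors.joining(", "));

        System.out.println(joined); // prints: ABC.csv, EFG.csv, HIJ.csv, MNPQR.csv, UVXYZ.csv
    }
}
```

This only works while every line has exactly this shape; any other text on the line would survive the replacement.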

If you don't like that, you can use this simple regex to do the job:

-dataFileName (.*?) -command ii

and take capture group 1.
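
A rough sketch of how that pattern could be applied with java.util.regex, reading group 1 from each match. The class name CsvNameExtractor and its method names are invented for this example, and the file path is whatever you pass as the first argument; with no argument the sample lines from the question are used instead.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;

public class CsvNameExtractor {

    // Group 1 is the lazily matched text between the two fixed flags.
    private static final Pattern DATA_FILE =
            Pattern.compile("-dataFileName (.*?) -command ii");

    /** Returns every group-1 capture found in a single line. */
    public static List<String> extract(String line) {
        List<String> names = new ArrayList<>();
        Matcher matcher = DATA_FILE.matcher(line);
        while (matcher.find()) {
            names.add(matcher.group(1));
        }
        return names;
    }

    /** Reads the file line by line and collects the captures in file order. */
    public static List<String> extractFromFile(Path file) throws IOException {
        List<String> names = new ArrayList<>();
        try (Stream<String> lines = Files.lines(file)) {
            lines.forEach(line -> names.addAll(extract(line)));
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0) {
            // Assumption: the first argument is the path of the input file.
            extractFromFile(Paths.get(args[0])).forEach(System.out::println);
        } else {
            // No file given: fall back to two sample lines from the question.
            Arrays.asList(
                    "-dataFileName ABC.csv -command ii",
                    "-dataFileName EFG.csv -command ii"
            ).forEach(line -> extract(line).forEach(System.out::println));
        }
    }
}
```

Because the pattern requires the trailing " -command ii", a -dataFileName that is not followed by that flag is skipped.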


I don't see why you want to use a regex for this. You can easily write a simple parser for it that won't cause problems when your requirements change (need to handle quotes? Easy enough with a parser, messy with a regex).

An example program that would do this:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.stream.Stream;

class Scratch {

    private static final String INPUT = "-dataFileName ABC.csv -command ii\n" +
        "-dataFileName EFG.csv -command ii -dataFileName OAZE.csv\n" +
        "-dataFileName HIJ.csv -command ii\n" +
        "-dataFileName MNPQR.csv -command ii\n" +
        "-dataFileName UVXYZ.csv -command ii";

    public static void main(String[] args) throws IOException {
        try (BufferedReader reader = new BufferedReader(new StringReader(INPUT))) {
            reader.lines()
                .flatMap(line -> fetchFilenamesFromArgumentLine(line, "dataFileName", "csv"))
                .forEach(System.out::println);
        }
    }

    public static Stream<String> fetchFilenamesFromArgumentLine(String line, String argumentName, String extension) {
        Stream.Builder<String> resultBuilder = Stream.builder();

        int index = 0;
        String actualArgumentName = "-" + argumentName + " ";

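        while ((index = line.indexOf(actualArgumentName, index)) >= 0) {
            int start = index + actualArgumentName.length();
            int extensionIndex = line.indexOf(extension, start);
            if (extensionIndex < 0) {
                // No extension after this argument: stop instead of building a bogus substring.
                break;
            }
            int end = extensionIndex + extension.length();

            resultBuilder.add(line.substring(start, end));
            index = end;
        }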
        return resultBuilder.build();
    }
}
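
The remark above about quotes can be sketched roughly as follows. This is only an illustration under assumptions that are not in the original post: values may be wrapped in double quotes, there are no escape sequences, and the names QuotedArgs, tokenize and dataFileNames are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class QuotedArgs {

    /** Splits a line on whitespace, keeping double-quoted runs together (quotes are dropped). */
    public static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        boolean hasToken = false;
        for (char c : line.toCharArray()) {
            if (c == '"') {
                inQuotes = !inQuotes;
                hasToken = true; // an empty pair of quotes still counts as a token
            } else if (Character.isWhitespace(c) && !inQuotes) {
                if (hasToken) {
                    tokens.add(current.toString());
                    current.setLength(0);
                    hasToken = false;
                }
            } else {
                current.append(c);
                hasToken = true;
            }
        }
        if (hasToken) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    /** Returns each value that follows a -dataFileName token and ends with the given extension. */
    public static List<String> dataFileNames(String line, String extension) {
        List<String> tokens = tokenize(line);
        List<String> names = new ArrayList<>();
        for (int i = 0; i + 1 < tokens.size(); i++) {
            if (tokens.get(i).equals("-dataFileName") && tokens.get(i + 1).endsWith(extension)) {
                names.add(tokens.get(i + 1));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // The quoted value contains a space, which a plain split on spaces would break apart.
        String line = "-dataFileName \"my data.csv\" -command ii -dataFileName OAZE.csv";
        dataFileNames(line, ".csv").forEach(System.out::println);
    }
}
```

Matching on ".csv" with endsWith, rather than searching for "csv" anywhere after the flag, also avoids cutting a name short when "csv" appears in the middle of it.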


 