
How to separate strings in a text file into different arrays (Java)

I have a text file that consists of strings. I want to separate the strings marked with "[ham]" from the strings marked with "[spam]" and put them into different arrays. How can I do that? I was thinking of using a regex to recognize the pattern (ham and spam), but I have no idea where to start.

Strings in the text file:

good [ham]
very good [ham]
bad [spam]
very bad [spam]
very bad, very bad [spam]

I want the output to look like this:

Ham array:

good
very good

Spam array:

bad
very bad
very bad, very bad

Please help.

Instead of an array, I think you should use an ArrayList:

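// Assumes `line` holds one line read from the text file
List<String> ham = new ArrayList<String>();
List<String> spam = new ArrayList<String>();
if (line.contains("[ham]")) {
    // keep the text before the label; trim() drops the trailing space
    ham.add(line.substring(0, line.indexOf("[ham]")).trim());
}
if (line.contains("[spam]")) {
    spam.add(line.substring(0, line.indexOf("[spam]")).trim());
}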
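To show the fragment above in context, here is a minimal self-contained sketch. The class name `HamSpamSplitter`, the helper `extract`, and the hard-coded sample lines are assumptions made for illustration; in a real program the lines would be read from the text file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HamSpamSplitter {

    // Returns the text of every line that contains "[label]", with the label removed.
    // Hypothetical helper, not part of the original answer.
    static List<String> extract(List<String> lines, String label) {
        String tag = "[" + label + "]";
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            int idx = line.indexOf(tag);
            if (idx >= 0) {
                // trim() drops the space that separates the text from the label
                result.add(line.substring(0, idx).trim());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample data taken from the question
        List<String> lines = Arrays.asList(
                "good [ham]", "very good [ham]", "bad [spam]",
                "very bad [spam]", "very bad, very bad [spam]");

        // Convert to arrays only at the end, if arrays are really required
        String[] hamArray = extract(lines, "ham").toArray(new String[0]);
        String[] spamArray = extract(lines, "spam").toArray(new String[0]);

        System.out.println("Ham array:");
        for (String s : hamArray) {
            System.out.println(s);
        }
        System.out.println();
        System.out.println("Spam array:");
        for (String s : spamArray) {
            System.out.println(s);
        }
    }
}
```

Collecting into lists first and converting with `toArray` at the end avoids having to know the number of ham and spam lines in advance.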

If you really need to do it that way (with a regex and arrays as output), write code like this:

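import java.io.IOException;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StringResolve {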

    public static void main(String[] args) {
        try {
            // read data from some source
            URL exampleTxt = StringResolve.class.getClassLoader().getResource("me/markoutte/sandbox/_25989334/example.txt");
            Path path = Paths.get(exampleTxt.toURI());
            List<String> strings = Files.readAllLines(path, Charset.forName("UTF8"));

            // init all my patterns & arrays
            Pattern ham = getPatternFor("ham");
            List<String> hams = new LinkedList<>();

            Pattern spam = getPatternFor("spam");
            List<String> spams = new LinkedList<>();

            // check all of them
            for (String string : strings) {
                Matcher hamMatcher = ham.matcher(string);
                if (hamMatcher.matches()) {
                    // we choose only text without label here
                    hams.add(hamMatcher.group(1));
                }
                Matcher spamMatcher = spam.matcher(string);
                if (spamMatcher.matches()) {
                    // we choose only text without label here
                    spams.add(spamMatcher.group(1));
                }
            }

            // output data through arrays
            String[] hamArray = hams.toArray(new String[hams.size()]);
            System.out.println("Ham array");
            for (String s : hamArray) {
                System.out.println(s);
            }
            System.out.println();

            String[] spamArray = spams.toArray(new String[spams.size()]);
            System.out.println("Spam array");
            for (String s : spamArray) {
                System.out.println(s);
            }

        } catch (URISyntaxException | IOException e) {
            e.printStackTrace();
        }
    }

    private static Pattern getPatternFor(String label) {
        // Regex pattern for string with same kind: some text [label]
        return Pattern.compile(String.format("(.+?)\\s(\\[%s\\])", label));
    }

}
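As a variation on the regex approach above, a single pattern can capture both the text and the label, so no separate matcher is needed per label. This is only a sketch: the class name `LabelGrouper`, the method `group`, and the choice of a map keyed by label are assumptions for illustration, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LabelGrouper {

    // group(1) = the text, group(2) = the label inside the square brackets
    private static final Pattern LINE = Pattern.compile("(.+?)\\s*\\[(\\w+)\\]\\s*");

    // Groups the text of each line under its label; lines without a label are skipped
    static Map<String, List<String>> group(List<String> lines) {
        Map<String, List<String>> byLabel = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (m.matches()) {
                byLabel.computeIfAbsent(m.group(2), k -> new ArrayList<>()).add(m.group(1));
            }
        }
        return byLabel;
    }

    public static void main(String[] args) {
        // Sample data taken from the question
        List<String> lines = Arrays.asList(
                "good [ham]", "very good [ham]", "bad [spam]",
                "very bad [spam]", "very bad, very bad [spam]");

        Map<String, List<String>> byLabel = group(lines);
        // Arrays are produced only at the very end, as in the code above
        String[] hamArray = byLabel.get("ham").toArray(new String[0]);
        String[] spamArray = byLabel.get("spam").toArray(new String[0]);
        System.out.println(hamArray.length + " ham lines, " + spamArray.length + " spam lines");
    }
}
```

The trade-off is that any word in square brackets at the end of a line becomes a label, which may or may not be what you want.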

You can use Paths.get("some/path/to/file") if you need to read the file from somewhere on your drive.
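A minimal sketch of reading the lines from a location on disk rather than from the classpath. The class name `ReadFromDisk` and the helper `readLines` are hypothetical, and a temporary file stands in for the real file so that the example runs anywhere.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

class ReadFromDisk {

    // Reads all lines of a UTF-8 text file at the given location
    static List<String> readLines(String location) throws IOException {
        Path path = Paths.get(location);
        return Files.readAllLines(path, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Temporary file used only for this demonstration
        Path tmp = Files.createTempFile("example", ".txt");
        try {
            Files.write(tmp, Arrays.asList("good [ham]", "bad [spam]"), StandardCharsets.UTF_8);
            for (String line : readLines(tmp.toString())) {
                System.out.println(line);
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The resulting list can be passed to the matching loop in place of the `strings` variable used above.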


 