Text file - multiline string into one line

I'm parsing a txt file and performing some editing tasks. I'm stuck on converting a multi-line string into a single-line string.

Workflow: 1) join the multiple lines into one line; 2) extract specific lines that contain certain characters or start with a given prefix (startsWith).

I have already tried some methods, but without the desired results.

The goal is to get this line:

Jrn.Directive "WindowSize"  , "[A.rvt]", "Floor Plan: Level 1" , 1912, 849

based on:

 Jrn.Directive "WindowSize"  _
         , "[A.rvt]", "Floor Plan: Level 1" _
         , 1912, 849

I tried:

line.lines().collect(Collectors.joining("_"+"[\n]"));

or

line.replaceAll("  _\n" +
                        "         ,");

Any advice is appreciated.

Update:

Workflow:

  1. The text contains the following (a small portion of the whole txt file). I was not able to paste it as code; please see the screenshot.

    Jrn.Directive "WindowSize"  _
             , "[A.rvt]", "Floor Plan: Level 1" _
             , 1912, 849
    ' 0:< .Marshalling
    ' 0:< ...CompactCaching = 1 (Enabled)
    ' 0:< .ThreadPool
    ' 0:< ...ActivePoolSize = 51
    ' 0:< ...ConfiguredPoolSize = automatic
    ' 0:< ...ParallelCores = 8
    ' 0:< ...RequestedPoolSize = automatic
    ' 0:< .Tuning
    ' 0:< ...ElemTable = 1 (Serial except when multithreaded)
    ' 0:< BC: 0,0,0
    Jrn.Directive "WindowSize"  _
             , "[A.rvt]", "Floor Plan: Level 1" _
             , 1912, 84

Please see the screenshot: https://i.ibb.co/0cRrwcR/2019-02-03-1947.png

  2. Because I will be extracting strings that start with Jrn.D etc., I need to join these lines and get:

    Jrn.Directive "WindowSize" , "[A.rvt]", "Floor Plan: Level 1" , 1912, 849

I think it's necessary to first define which lines need to be joined; afterwards I can extract the lines that contain interesting information, for example those that start with Jrn.D.

The code I'm using to find specific strings:

import java.io.*;
import java.util.stream.Collectors;
public class ReadFromFile {
    public static void main(String [] args) {
        // The name of the file to open.
        String fileName = "test.txt";

        // This will reference one line at a time
        String line = null;

        try {
            // FileReader reads text files in the default encoding.
            FileReader fileReader =
                    new FileReader(fileName);

            // Always wrap FileReader in BufferedReader.
            BufferedReader bufferedReader =
                    new BufferedReader(fileReader);

            while((line = bufferedReader.readLine()) != null) {

            // I'm defining which lines are important to me, but first I
            // need to have them on one line, especially when looking for Jrn
                if (line.startsWith("Jrn")|| 
                line.contains("started recording journal file")|| 
                line.contains("' Build:")|| line.contains("Dim Jrn"))
                System.out.println(line);
            }
            // Always close files.
            bufferedReader.close();
        }
        catch(FileNotFoundException ex) {
            System.out.println(
                    "Unable to open file '" +
                            fileName + "'");
        }
        catch(IOException ex) {
            System.out.println(
                    "Error reading file '"
                            + fileName + "'");
            // Or we could just do this:
            // ex.printStackTrace();
        }

    }
}

The best (least intrusive to the file) way I can think of to approach your specific problem is to add a delimiter (*) at the end of the Jrn.Directive meta-information, if that's within the realm of possibility, e.g.:

Jrn.Directive "WindowSize" _ , "[A.rvt]", "Floor Plan: Level 1" _ , 1912, 849*

You can then use a loop to print each token in sequence as long as it does not match the delimiter, and break out of the loop when it does.

Something like this:

    import java.io.File;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.util.Iterator;

    public class JoinJrnLines {
        public static void main(String[] args) throws IOException {
            //File object instantiation
            File file = new File("test.txt");

            //Iterator which loops over every line in the file
            Iterator<String> iterator = Files.readAllLines(file.toPath()).iterator();

            //The end delimiter for your Jrn.Directive information
            String delimiter = "*";

            while (iterator.hasNext()) {
                //String to store current line
                String line = iterator.next();
                //Execute if line starts with Jrn.Directive
                if (line.startsWith("Jrn")) {
                    //JrnLoop to serialize Jrn.Directive information
                    JrnLoop: while (true) {
                        //Splitting and processing each character in the current line
                        for (String token : line.split("")) {
                            //Break the JrnLoop if the current character equals the end delimiter
                            //(equals, not matches: "*" on its own is not a valid regex)
                            if (token.equals(delimiter)) {
                                System.out.println();
                                break JrnLoop;
                            }
                            //Otherwise print the current character
                            System.out.print(token);
                        }
                        //Stop if the file ends before a delimiter is found
                        if (!iterator.hasNext()) {
                            System.out.println();
                            break;
                        }
                        //Go to the next line of the Jrn.Directive information
                        line = iterator.next();
                    }
                }
                //If the line does not start with Jrn.Directive
                else {
                    System.out.println(line);
                }
            }
        }
    }
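If the file cannot be edited to add a delimiter, the `replaceAll` attempt from the question can probably be made to work on the whole file content instead. The sketch below assumes that a physical line ending in whitespace followed by `_` continues on the next line, which is what the sample suggests. The class name `JoinContinuations` and the method `joinContinuations` are made up for illustration, and `String.lines()` needs Java 11 or later.

```java
import java.util.List;
import java.util.stream.Collectors;

public class JoinContinuations {
    // Assumption: whitespace + "_" at the end of a line means "continued on the next line".
    public static String joinContinuations(String text) {
        // Replace the marker, the line break, and the next line's indentation with one space.
        return text.replaceAll("[ \\t]+_[ \\t]*\\r?\\n[ \\t]*", " ");
    }

    public static void main(String[] args) {
        // Sample taken from the question; a real program would read the file content instead.
        String text = String.join("\n",
                " Jrn.Directive \"WindowSize\"  _",
                "         , \"[A.rvt]\", \"Floor Plan: Level 1\" _",
                "         , 1912, 849",
                "' 0:< .Marshalling");

        // Join first, then keep only the lines of interest.
        List<String> wanted = joinContinuations(text).lines()
                .map(String::trim)
                .filter(l -> l.startsWith("Jrn"))
                .collect(Collectors.toList());

        wanted.forEach(System.out::println);
    }
}
```

With the sample above this prints `Jrn.Directive "WindowSize" , "[A.rvt]", "Floor Plan: Level 1" , 1912, 849`. Unlike the two-argument-less `replaceAll` call in the question, the pattern here also tolerates `\r\n` line endings and varying indentation.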

As to why your Jrn.Directive information is stored on multiple lines in the file, I really don't know.
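An alternative that fits the `BufferedReader` loop from the question is to glue continuation lines together while reading, and only then apply the filter. This is a rough sketch under the same assumption as before (a line ending in a space plus `_` continues on the next line). `JournalLineJoiner`, `readLogicalLines`, and `isInteresting` are hypothetical names, and the sample is fed through a `StringReader` so the example runs on its own; in the real program the reader would wrap `new FileReader("test.txt")`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class JournalLineJoiner {
    // Assumption: a physical line ending in " _" is continued on the next line.
    public static List<String> readLogicalLines(BufferedReader reader) throws IOException {
        List<String> result = new ArrayList<>();
        StringBuilder pending = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            String trimmed = line.trim();
            if (trimmed.endsWith(" _")) {
                // Keep the text before the marker and wait for the next line.
                pending.append(trimmed.substring(0, trimmed.length() - 2).trim()).append(' ');
            } else {
                // Last piece of a logical line (or an ordinary line): emit it.
                pending.append(trimmed);
                result.add(pending.toString());
                pending.setLength(0);
            }
        }
        // File ended in the middle of a continuation: emit what was collected.
        if (pending.length() > 0) {
            result.add(pending.toString().trim());
        }
        return result;
    }

    // Same conditions as the filter in the question's code.
    public static boolean isInteresting(String line) {
        return line.startsWith("Jrn")
                || line.contains("started recording journal file")
                || line.contains("' Build:")
                || line.contains("Dim Jrn");
    }

    public static void main(String[] args) throws IOException {
        String sample = String.join("\n",
                " Jrn.Directive \"WindowSize\"  _",
                "         , \"[A.rvt]\", \"Floor Plan: Level 1\" _",
                "         , 1912, 849",
                "' 0:< .Marshalling");
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            for (String logical : readLogicalLines(reader)) {
                if (isInteresting(logical)) {
                    System.out.println(logical);
                }
            }
        }
    }
}
```

Reading line by line avoids loading the whole journal into one string, and the try-with-resources block closes the reader even if an exception is thrown.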
