
Read Data From Txt File And Split It Into Paragraphs In Java

If I read data from a txt file, how can I split it into paragraphs? Since a paragraph may contain more than one sentence, what delimiter should I use?

DATA FROM TXT:

Java is a computer programming language that is concurrent, class-based, object-oriented, and specifically designed to have as few implementation dependencies as possible. It is intended to let application developers "write once, run anywhere" (WORA), meaning that code that runs on one platform does not need to be recompiled to run on another.

Java was originally developed by James Gosling at Sun Microsystems (which has since merged into Oracle Corporation) and released in 1995 as a core component of Sun Microsystems' Java platform. The language derives much of its syntax from C and C++, but it has fewer low-level facilities than either of them.

EXPECTED OUTPUT:

Paragraph 1 = Java is a computer programming language that is concurrent, class-based, object-oriented, and specifically designed to have as few implementation dependencies as possible. It is intended to let application developers "write once, run anywhere" (WORA), meaning that code that runs on one platform does not need to be recompiled to run on another.

Paragraph 2 = Java was originally developed by James Gosling at Sun Microsystems (which has since merged into Oracle Corporation) and released in 1995 as a core component of Sun Microsystems' Java platform. The language derives much of its syntax from C and C++, but it has fewer low-level facilities than either of them.

I'm sorry for my poor English. Thanks for your response.

When you detect the end of a paragraph, you should append

System.getProperty("line.separator");

And at the beginning of each paragraph, append

("\t")

Using a StringBuilder, it might simply look like this:

StringBuilder builder = new StringBuilder();
builder.append("\t"); // beginning of the paragraph
builder.append(PARAGRAPH_CONTENT); // placeholder for the paragraph text
builder.append(System.getProperty("line.separator")); // end of the paragraph
System.out.println(builder.toString());
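
As a rough sketch of how the end of a paragraph might be detected, the following assumes that paragraphs in the text are separated by one or more blank lines (the question does not say so explicitly). The class name ParagraphFormatter and the methods splitParagraphs and format are made up for illustration; the tab and line separator follow the suggestion above.

```java
import java.util.ArrayList;
import java.util.List;

public class ParagraphFormatter {

    // Assumption: one or more blank lines separate two paragraphs.
    public static List<String> splitParagraphs(String text) {
        List<String> paragraphs = new ArrayList<>();
        // Two or more consecutive line breaks (Unix or Windows style),
        // allowing spaces or tabs on the otherwise empty lines.
        for (String part : text.split("(?:\\r?\\n[ \\t]*){2,}")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                paragraphs.add(trimmed);
            }
        }
        return paragraphs;
    }

    // Prefix each paragraph with a tab and end it with the line separator.
    public static String format(List<String> paragraphs) {
        StringBuilder builder = new StringBuilder();
        for (String paragraph : paragraphs) {
            builder.append("\t");
            builder.append(paragraph);
            builder.append(System.lineSeparator());
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample text with one blank line between paragraphs.
        String text = "First paragraph, sentence one. Sentence two.\n\nSecond paragraph.";
        System.out.print(format(splitParagraphs(text)));
    }
}
```

Splitting on blank lines rather than on periods keeps multi-sentence paragraphs intact, which is the concern raised in the question.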

Your question is not very clear, but maybe this is what you want:

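import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class ReadFile {
    public static void main(String[] args) {
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader bf = new BufferedReader(new FileReader("C:\\fileName.txt"))) {
            String line = bf.readLine();
            while (line != null) {
                System.out.println(line); // print each line as it is read
                line = bf.readLine();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}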
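
The loop above prints every line as it is read. One possible way to extend it so that lines are grouped into paragraphs is sketched below, assuming that a blank line marks the end of a paragraph and that wrapped lines within a paragraph should be joined with a single space. ParagraphReader and groupIntoParagraphs are hypothetical names, and a StringReader stands in for the file so the example runs without C:\\fileName.txt.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class ParagraphReader {

    // Assumption: a blank (or whitespace-only) line ends the current paragraph.
    public static List<String> groupIntoParagraphs(List<String> lines) {
        List<String> paragraphs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                // Blank line: the current paragraph, if any, is complete.
                if (current.length() > 0) {
                    paragraphs.add(current.toString());
                    current.setLength(0);
                }
            } else {
                // Join wrapped lines of the same paragraph with one space.
                if (current.length() > 0) {
                    current.append(' ');
                }
                current.append(line.trim());
            }
        }
        // The last paragraph may not be followed by a blank line.
        if (current.length() > 0) {
            paragraphs.add(current.toString());
        }
        return paragraphs;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample content; replace the StringReader with a
        // FileReader to read from an actual file.
        String sample = "Java is a language.\nIt runs anywhere.\n\n"
                + "Java was developed at Sun.\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            List<String> lines = reader.lines().collect(Collectors.toList());
            List<String> paragraphs = groupIntoParagraphs(lines);
            for (int i = 0; i < paragraphs.size(); i++) {
                System.out.println("Paragraph " + (i + 1) + " = " + paragraphs.get(i));
            }
        }
    }
}
```

If each paragraph in the file is already a single long line, the same code still works, since every non-blank line then becomes its own paragraph as long as blank lines sit between them.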
