Given the offset of a word in a text file, the Java program should retrieve the respective line number

I need to extract the whole line in a text that a given offset belongs to. For example:

"Therapist: Okay. {Pause} 
So, how do you feel about -- about this -- about what's going on with your health? 

Participant: I don't like it. 
There's nothing I can do about it.
{Pause}

Therapist: Yeah.\

15-30-28-0140.raw

Therapist: That doesn't sound so good. 
A little bit stressful."

If I ask for offsetNum=125, the output should be "Participant: I don't like it. " As can be seen, empty lines must be counted too.

I wrote the following code, which works on some text files but gives wrong results on others:

int offset = startingOffset;

try (LineNumberReader r = new LineNumberReader(new FileReader(Input))) {
    int count = 0;

    // Consume characters until the requested offset (or end of file) is reached.
    while (r.read() != -1 && count < offset) {
        count++;
    }
    if (count == offset) {
        lineNo = r.getLineNumber();
    }
}

However, I need a reliable way to get the actual line, not lineNo.
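One possible cause of the drift, assuming the failing files use Windows (CRLF) line endings: `LineNumberReader.read()` compresses every line terminator into a single `'\n'`, so a CRLF pair is counted as one character and `count` falls behind the real offset by one per line. The sketch below illustrates this on in-memory strings; the class and method names are made up for the demonstration.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class CrlfCountDemo {
    // Counts how many characters LineNumberReader.read() returns for the given text.
    static int countCharsRead(String text) {
        int count = 0;
        try (LineNumberReader r = new LineNumberReader(new StringReader(text))) {
            while (r.read() != -1) {
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        String unix = "ab\ncd\n";        // 6 chars, LF line endings
        String windows = "ab\r\ncd\r\n"; // 8 chars, CRLF line endings
        System.out.println(unix.length() + " chars, " + countCharsRead(unix) + " read");
        System.out.println(windows.length() + " chars, " + countCharsRead(windows) + " read");
    }
}
```

If the two numbers differ for a file that misbehaves, the offsets being passed in were probably computed on the raw text, terminators included, while the loop counts compressed ones.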

The following method will do what you want.

It counts every character, including CR and LF characters, while building up the current line of text in the line buffer. At the end of each line, it checks whether offsetNum fell within that line (anywhere from its first character through its newline character) and returns the line if so. Otherwise it clears the line buffer and continues with the next line.

Note that if offsetNum is on the LF of a CRLF pair, it will return an empty line, which isn't correct, but I'll let you figure that one out.

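import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

private static String readLineAtOffset(String fileName, int offsetNum) throws IOException {
    int count = 0;                           // offset of the character being examined
    StringBuilder line = new StringBuilder();
    try (BufferedReader reader = Files.newBufferedReader(Paths.get(fileName))) {
        for (int ch; (ch = reader.read()) != -1; count++) {
            if (ch != '\r' && ch != '\n')
                line.append((char)ch);       // ordinary character: extend the current line
            else if (count < offsetNum)
                line.setLength(0);           // line ended before the offset: discard it
            else
                break;                       // line ended at or after the offset: found it
        }
    }
    // null means the offset lies beyond the end of the file
    return (count >= offsetNum ? line.toString() : null);
}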
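For the CRLF case left open above, one possible approach is to keep the line buffer alive after a CR until the next character shows whether an LF follows. The sketch below is a variant under that assumption, not the answer's own code: the class name `LineAtOffset` and the helper `lineAtOffsetInText` are hypothetical, and it takes a `Reader` so it can be tried on in-memory strings. Unlike the method above, it returns null for an offset equal to the input length, a choice made for this sketch.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;

class LineAtOffset {
    // Returns the line holding the character at offsetNum (0-based, counted in chars,
    // terminators included), or null if the offset is negative or past the last character.
    static String readLineAtOffset(Reader in, int offsetNum) throws IOException {
        if (offsetNum < 0) return null;
        StringBuilder line = new StringBuilder();
        boolean afterCR = false; // previous char was CR; its line is still in the buffer
        int count = 0;           // offset of the character being examined
        for (int ch; (ch = in.read()) != -1; count++) {
            if (afterCR) {
                afterCR = false;
                if (ch == '\n') {
                    // LF of a CRLF pair: it belongs to the line still in the buffer.
                    if (count >= offsetNum) return line.toString();
                    line.setLength(0);
                    continue;
                }
                line.setLength(0); // a lone CR ended the previous line
            }
            if (ch == '\r') {
                if (count >= offsetNum) return line.toString();
                afterCR = true;    // keep the buffer until we know whether an LF follows
            } else if (ch == '\n') {
                if (count >= offsetNum) return line.toString();
                line.setLength(0);
            } else {
                line.append((char) ch);
            }
        }
        // End of input: only a final line without a terminator can still match.
        return count > offsetNum ? line.toString() : null;
    }

    // File-based wrapper; Files.newBufferedReader decodes the file as UTF-8.
    static String readLineAtOffset(String fileName, int offsetNum) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(Paths.get(fileName))) {
            return readLineAtOffset(reader, offsetNum);
        }
    }

    // Hypothetical convenience helper for trying the logic on an in-memory string.
    static String lineAtOffsetInText(String text, int offsetNum) {
        try {
            return readLineAtOffset(new StringReader(text), offsetNum);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "ab\r\ncd\r\n\r\nef";
        System.out.println(lineAtOffsetInText(text, 3));  // offset of the first LF: prints "ab"
        System.out.println(lineAtOffsetInText(text, 10)); // offset of 'e': prints "ef"
    }
}
```

Offsets here are positions in the decoded character stream, not byte positions, so a file containing multi-byte UTF-8 characters will need offsets computed the same way.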
