Remove word after line from txt file is read

I have this code, which reads lines from a file and inserts them into PostgreSQL:

try {
    BufferedReader reader;
    try {
        reader = new BufferedReader(new FileReader(
                "C:\\in_progress\\test.txt"));
        String line = reader.readLine();
        while (line != null) {
            System.out.println(line);

            Thread.sleep(100);
            Optional<ProcessedWords> isFound = processedWordsService.findByKeyword(line);

            if(!isFound.isPresent()){
                ProcessedWords obj = ProcessedWords.builder()
                        .keyword(line)
                        .createdAt(LocalDateTime.now())
                        .build();
                processedWordsService.save(obj);
            }

            // read next line
            line = reader.readLine();
        }
        reader.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
catch (Exception e) {
    e.printStackTrace();
}

How can I remove a line from the file after the line has been inserted into the SQL database?

For reference:

import java.io.*;

public class RemoveLinesFromAfterProcessed {
    public static void main(String[] args) throws Exception {
        String fileName = "TestFile.txt";
        String tempFileName = "tempFile";

        File mainFile = new File(fileName);
        File tempFile = new File(tempFileName);

        try (BufferedReader br = new BufferedReader(new FileReader(mainFile));
             PrintWriter pw = new PrintWriter(new FileWriter(tempFile))
        ) {
            String line;
            while ((line = br.readLine()) != null) {
                if (toProcess(line)) {  // #1
                    // process the code and add it to DB
                    // ignore the line (i.e, not add to temp file)
                } else {
                    // add to temp file.
                    pw.write(line + "\n");  // #2
                }
            }
        }
        // No catch block here: if reading or writing fails, the exception propagates
        // out of main and the original file is not deleted below.

        // delete the old file
        boolean hasDeleted = mainFile.delete();  // #3
        if (!hasDeleted) {
            throw new Exception("Can't delete file!");
        }
        boolean hasRenamed = tempFile.renameTo(mainFile);  // #4
        if (!hasRenamed) {
            throw new Exception("Can't rename file!");
        }

        System.out.println("Done!");
    }

    private static boolean toProcess(String line) {
        // any condition
        // sample condition for example
        return line.contains("aa");
    }
}

Read the file.
1: The condition that decides whether to delete the line or retain it.
2: Write the lines you don't want to delete into the temporary file.
3: Delete the original file.
4: Rename the temporary file to the original file name.
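Steps 3 and 4 can also be collapsed into a single call with the NIO API. The following is only a sketch under assumptions: the class name `ReplaceViaTempFile`, the method `removeMatchingLines` and the predicate parameter are hypothetical, and the temp file is created next to the original so that the move stays on one file system.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.function.Predicate;

class ReplaceViaTempFile {

    // Copies every line that does NOT match `processed` into a temp file,
    // then moves the temp file over the original.
    static void removeMatchingLines(Path file, Predicate<String> processed) throws IOException {
        // Hypothetical choice: keep the temp file in the same directory as the original.
        Path dir = file.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(dir, "lines", ".tmp");
        try {
            try (BufferedReader reader = Files.newBufferedReader(file);
                 BufferedWriter writer = Files.newBufferedWriter(temp)) {
                String line;
                while ((line = reader.readLine()) != null) {
                    if (!processed.test(line)) {
                        writer.write(line);
                        writer.newLine();
                    }
                }
            }
            // Reached only if the copy succeeded, so the original is never lost on failure.
            Files.move(temp, file, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // Cleans up the temp file if something failed; does nothing after a successful move.
            Files.deleteIfExists(temp);
        }
    }
}
```

With `line -> line.contains("aa")` as the predicate, this behaves like the sample condition in `toProcess()`.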

The basic idea is the same as what @Shiva Rahul said in his answer.


However, another approach is to store all the line numbers you want to delete in a list. Once you have all the line numbers that you want to delete, you can use LineNumberReader to check each line while copying your main file.

I have mostly used this technique for batch inserts, where I didn't know how many lines a particular file might have and a lot of processing had to happen before the lines could be removed. It may not be suitable for your case; I'm just posting the suggestion here in case anyone bumps into this thread.

// requires: import java.io.*; import java.util.List;
private void deleteLines(String inputFilePath, String outputDirectory, List<Integer> lineNumbers) throws IOException {
    File tempFile = new File("temp.txt");
    File inputFile = new File(inputFilePath);

    // LineNumberReader reports the (1-based) number of the line that was just read;
    // try-with-resources closes both the reader and the writer, even on failure
    try (LineNumberReader lineReader = new LineNumberReader(new FileReader(inputFile));
         BufferedWriter bufferedWriter = new BufferedWriter(new FileWriter(tempFile))) {
        String currentLine;
        while ((currentLine = lineReader.readLine()) != null) {

            // if the current line number is in the removal list, write an empty line instead
            if (lineNumbers.contains(lineReader.getLineNumber())) {
                currentLine = "";
            }
            bufferedWriter.write(currentLine);
            bufferedWriter.newLine();
        }
    }

    // delete the main file and move the temp file to the output directory under the original name
    if (!inputFile.delete()) {
        throw new IOException("Can't delete " + inputFile);
    }
    // use tempFile.renameTo(inputFile) instead to keep the result in the same directory
    if (!tempFile.renameTo(new File(outputDirectory, inputFile.getName()))) {
        throw new IOException("Can't rename " + tempFile);
    }
}

To use this function, all you have to do is gather the required line numbers. inputFilePath is the path of the source file, and outputDirectory is where I want to store the file after processing.
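Note that deleteLines() writes an empty line in place of each removed one. If the lines should disappear entirely, a variant could skip them instead. This is a minimal sketch, not the answer's original method: the class name `DropLinesByNumber`, the method `dropLines` and the use of a `Set` for the lookup are my assumptions.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.LineNumberReader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;

class DropLinesByNumber {

    // Copies `input` to `output`, leaving out every line whose 1-based number is in `lineNumbers`.
    static void dropLines(Path input, Path output, Set<Integer> lineNumbers) throws IOException {
        try (LineNumberReader reader = new LineNumberReader(Files.newBufferedReader(input));
             BufferedWriter writer = Files.newBufferedWriter(output)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // getLineNumber() returns the number of the line that was just read
                if (lineNumbers.contains(reader.getLineNumber())) {
                    continue; // skip the line completely, no empty placeholder
                }
                writer.write(line);
                writer.newLine();
            }
        }
    }
}
```

A `Set` keeps the membership check cheap when many line numbers are collected; a `List` would work too, just with a linear scan per line.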

The issues with the current code:

  • Adhere to the Single responsibility principle. Your code is doing too many things: it reads from a file, performs the findByKeyword() call, prepares the data and hands it over to be stored in the database. It can hardly be tested thoroughly, and it's very difficult to maintain.
  • Always use try-with-resources so that your resources get closed under any circumstances.
  • Don't catch the general Exception type - your code should only catch those exceptions which are more or less expected and for which there's a clear scenario of how to handle them. But don't catch all the exceptions.
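A small illustration of the last two points, as a sketch only: the class `KeywordFileReader` and both method names are hypothetical. The reader is managed by try-with-resources, the method does nothing but read, and the only exception caught is one with an obvious way to handle it (a missing input file).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class KeywordFileReader {

    // Only reads the file; looking up and saving keywords belongs elsewhere.
    static List<String> readKeywords(Path path) throws IOException {
        List<String> keywords = new ArrayList<>();
        // The reader is closed automatically, even if readLine() throws.
        try (BufferedReader reader = Files.newBufferedReader(path)) {
            String line;
            while ((line = reader.readLine()) != null) {
                keywords.add(line);
            }
        }
        return keywords;
    }

    // Catches only the failure we know how to handle: the input file does not exist.
    // Any other IOException is left to the caller.
    static List<String> readKeywordsOrEmpty(Path path) throws IOException {
        try {
            return readKeywords(path);
        } catch (NoSuchFileException e) {
            return List.of(); // nothing to process
        }
    }
}
```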

How I can remove a line from the file after the line is inserted into SQL database?

It is not possible to remove a line from a file in the literal sense. You can overwrite the contents of the file or replace it with another file.

My advice would be to load the file data into memory, process it, and then write the lines which should be retained back into the same file (i.e. overwrite the file contents).
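Stripped of the database part, the idea can be sketched as follows. The class `InMemoryRewrite`, the method `processAndRewrite` and the predicate deciding which lines are handled are hypothetical, and the sketch assumes the file fits comfortably in memory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class InMemoryRewrite {

    // Loads all lines, splits them into "handled" and "retained",
    // overwrites the file with the retained ones and returns the handled ones.
    static List<String> processAndRewrite(Path file, Predicate<String> shouldProcess) throws IOException {
        List<String> all = Files.readAllLines(file);

        // true -> lines to process and drop, false -> lines to keep in the file
        Map<Boolean, List<String>> parts = all.stream()
            .collect(Collectors.partitioningBy(shouldProcess));

        // Process parts.get(true) here (for example, store the lines in a database)
        // before the file is overwritten.
        Files.write(file, parts.get(false));
        return parts.get(true);
    }
}
```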

You could argue that the file is huge and loading it into memory would result in an OutOfMemoryError, and that you want to read a line from the file, process it somehow, store the processed data in the database and then write the line into a file, so that everything is done line by line, all actions in one go for a single line, and as a consequence all the code is crammed into one method. I hope that's not the case, because otherwise it's a clear XY problem.

Firstly, the file system isn't a reliable means of storing data, and it's not very fast. If the file is massive, then reading and writing it will take a considerable amount of time, and if that's done just to use a tiny bit of information, then the approach is wrong - this information should be stored and structured differently (i.e. consider placing it into a DB) so that it's possible to retrieve only the required data, and there would be no problem with removing entries that are no longer needed.

But if the file is lean and doesn't contain critical data, then it's totally fine. I will proceed assuming that's the case.

The overall approach is to generate a map Map<String, Optional<ProcessedWords>> based on the file contents, save the words for which the optional is empty, and prepare a list of lines to overwrite the previous file content.

The code below is based on the NIO2 file system API.

public void readProcessAndRemove(ProcessedWordsService service, Path path) {

    Map<String, Optional<ProcessedWords>> result;

    try (var lines = Files.lines(path)) {
        result = processLines(service, lines);
    } catch (IOException | UncheckedIOException e) {
        // reading failed: leave the file untouched instead of overwriting it with nothing
        e.printStackTrace(); // replace with proper logging
        return;
    }

    List<String> linesToRetain = prepareAndSave(service, result);
    writeToFile(linesToRetain, path);
}

Processing the stream of lines returned by Files.lines():

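private static Map<String, Optional<ProcessedWords>> processLines(ProcessedWordsService service,
                                                                  Stream<String> lines) {
    return lines.collect(Collectors.toMap(
        Function.identity(),
        service::findByKeyword,
        (first, second) -> first, // duplicate lines: keep the first lookup instead of throwing
        LinkedHashMap::new        // preserve the order of the lines in the file
    ));
}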

Saving the words for which findByKeyword() returned an empty optional:

private static List<String> prepareAndSave(ProcessedWordsService service,
                                           Map<String, Optional<ProcessedWords>> wordByLine) {
    wordByLine.forEach((k, v) -> {
        if (v.isEmpty()) saveWord(service, k);
    });
    
    return getLinesToRetain(wordByLine);
}

private static void saveWord(ProcessedWordsService service, String line) {
    
    ProcessedWords obj = ProcessedWords.builder()
        .keyword(line)
        .createdAt(LocalDateTime.now())
        .build();
    service.save(obj);
}

Generating a list of lines to retain:

private static List<String> getLinesToRetain(Map<String, Optional<ProcessedWords>> wordByLine) {
    
    return wordByLine.entrySet().stream()
        .filter(entry -> entry.getValue().isPresent())
        .map(Map.Entry::getKey)
        .collect(Collectors.toList());
}

Overwriting the file contents using Files.write(). Note: since the varargs OpenOption parameter isn't given any arguments, this call is treated as if the CREATE, TRUNCATE_EXISTING, and WRITE options were present.

private static void writeToFile(List<String> lines, Path path) {
    try {
        Files.write(path, lines);
    } catch (IOException e) {
        e.printStackTrace(); // replace with proper logging
    }
}
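To make the note about the default options concrete, here is the same call with the options spelled out. This is just an illustrative sketch: the class name `OverwriteOptions` and the method `overwrite` are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

class OverwriteOptions {

    // Behaves like Files.write(path, lines): the file is created if it is missing
    // and truncated before writing if it already exists.
    static void overwrite(Path path, List<String> lines) throws IOException {
        Files.write(path, lines,
            StandardOpenOption.CREATE,
            StandardOpenOption.TRUNCATE_EXISTING,
            StandardOpenOption.WRITE);
    }
}
```

Because of TRUNCATE_EXISTING, writing a shorter list over a longer file leaves no remnants of the old content.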
