
How can I check for repeating lines in a text file with node.js?

I am creating a node.js program which scans through a log file and outputs information from it to the console.

Sometimes, the log file can contain errors which can repeat basically forever (I'm talking like 20000 times).

I need a way to check if any portion of text is repeated multiple times in the file.

Since I don't know what text I'm looking for, I can't use native JS functions, regex, or stuff like that.

Does anyone know how I could achieve this without using machine learning?

I have not tried anything yet because I have absolutely no clue how this could be achieved.

Break the problem up into multiple steps. Deal with one step at a time. So, for step one, your task is to figure out how to read a file from disk into a variable. Next step: turn that variable into an array. etc.
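As a rough illustration of those first two steps, here is a minimal sketch. It assumes the log is small enough to read in one go and that entries are separated by newlines; `readLog` and `toLines` are hypothetical helper names, not anything from the question.

```javascript
const fs = require("fs");

// Step 1: read the whole file into a string.
// `path` is whatever log file you want to scan (placeholder).
function readLog(path) {
  return fs.readFileSync(path, "utf8");
}

// Step 2: turn the string into an array of lines.
// Handles both \n and \r\n line endings and drops empty lines.
function toLines(text) {
  return text.split(/\r?\n/).filter((line) => line.length > 0);
}
```

Usage would look something like `const lines = toLines(readLog("app.log"));`, where `app.log` is just an example file name.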

You can use an algorithm something like this:

  1. Read the log file into memory. (If the log file is too large to fit, or the array from step 2 would be, look into processing the file in smaller parts, for example line by line.)
  2. Turn the log file into an array of discrete pieces of text (so you need to know what separates one piece from the next, e.g. a newline).
  3. Now you need an (empty) output array.
  4. Loop through your input array and, for each array element, check if it is already in the output array. If not, add it. If yes, do nothing.

At the end, you will have an output array consisting only of unique log entries. Write it out to a file.
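The steps above can be sketched roughly as follows. This is one possible version, not the only way to do it: it swaps the "check if it is already in the output array" lookup for a `Map`, which avoids rescanning the output for every entry and also gives you a count per entry, so you can see which ones repeat. It assumes repeated entries are exactly identical strings (lines that differ only by a timestamp would need to be normalized first). The function names and the `minCount` threshold are made up for illustration.

```javascript
// Count how many times each entry occurs.
// A Map preserves insertion order, so keys stay in first-seen order.
function countLines(lines) {
  const counts = new Map();
  for (const line of lines) {
    counts.set(line, (counts.get(line) || 0) + 1);
  }
  return counts;
}

// Unique entries in order of first appearance (the "output array").
function uniqueLines(lines) {
  return [...countLines(lines).keys()];
}

// Entries that occur at least `minCount` times, with their counts.
function repeatedLines(lines, minCount = 2) {
  return [...countLines(lines)]
    .filter(([, count]) => count >= minCount)
    .map(([line, count]) => ({ line, count }));
}
```

For a very large log you could feed `countLines`-style logic one line at a time (for example from Node's `readline` module over a read stream) instead of building the whole input array first; only the distinct entries and their counts then need to stay in memory.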

