
Read data as arrays from a Text File in Java

I have a text file with a bunch of arrays, and in each array I have to find a specific number. The text file looks like this:

(8) {1, 4, 6, 8, 12, 22}
(50) {2, 5, 6, 7, 10, 11, 24, 50, 65}
(1) {1}
(33) {1, 2, 5, 6, 11, 12, 13, 21, 25, 26, 30, 33, 60, 88, 99}
(1) {1, 2, 3, 4, 8, 9, 100}
(1) {2, 3, 5, 6, 11, 12, 13, 21, 25, 26, 30, 33, 60, 88, 99}

where the number inside the parentheses is the number I have to find using binary search, and the rest is the actual array. I do not know how to get this array from the text file and read it as an actual array. [This is a question from a previous coding competition I took; I am going over the problems.]

I already have a method to do the binary search, and I have used a Scanner to read the file like this:

Scanner sc = new Scanner(new File("search_race.dat"));

and used a while loop to loop through the file and read it.

But I am stuck on how to make Java know that the stuff in the curly braces is an array and the stuff in the parentheses is the number it must find in said array using binary search.

You could read the file line by line and then use a regex on each line to separate the string into groups.

The regex below should match each line:

\((\d+)\) \{([\d, ]+)\}

Then group(1) will give the number inside the parentheses (as a String) and group(2) will give the String inside the curly braces, which you can split on a comma plus space (assuming every comma is followed by a space) to get an array of numbers (again as Strings).
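The regex approach above could be sketched roughly as follows. The class and method names (RegexLineParser, parseTarget, parseValues) are made up for illustration, and the sketch assumes every line follows the "(n) {a, b, c}" layout from the question exactly, with one space after each comma.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexLineParser {
    // Same expression as in the answer: group 1 = number to find, group 2 = "a, b, c".
    private static final Pattern LINE = Pattern.compile("\\((\\d+)\\) \\{([\\d, ]+)\\}");

    /** Returns the number inside the parentheses. */
    static int parseTarget(String line) {
        return Integer.parseInt(match(line).group(1));
    }

    /** Returns the numbers inside the curly braces as an int[]. */
    static int[] parseValues(String line) {
        // Assumption: every comma is followed by exactly one space.
        return Arrays.stream(match(line).group(2).split(", "))
                .mapToInt(Integer::parseInt)
                .toArray();
    }

    private static Matcher match(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected line format: " + line);
        }
        return m;
    }

    public static void main(String[] args) {
        String line = "(8) {1, 4, 6, 8, 12, 22}";
        System.out.println(parseTarget(line));                  // prints 8
        System.out.println(Arrays.toString(parseValues(line))); // prints [1, 4, 6, 8, 12, 22]
    }
}
```

If the spacing in the file is less regular, splitting on "\\s*,\\s*" after trimming group(2) would be more forgiving.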

Hope this helps!

TL;DR: You have to check character by character and see whether each one is a curly brace, a parenthesis, or a digit.

Long Answer:
First, create a POJO (let's call this AlgoContainer, but use whatever name you like) with the fields int numberToFind and ArrayList<Integer> listOfNumbers.
Then, read the file as @ManojBanik mentioned in the comments.

Now create an ArrayList<AlgoContainer> (its size should be the same as that of the ArrayList<String> obtained while reading the file line by line).

Then loop through the ArrayList<String> from the above step and perform the following operations:

  1. Create and instantiate an AlgoContainer object (let's call this tempAlgoContainer).

  2. Check if the first character is an open parenthesis -> yes? Create an empty temp String -> check if the next character is a digit -> yes? -> append it to the temp String and repeat this until you find the closing parenthesis.

  3. Found the closing parenthesis? Parse the temp String to int and set the numberToFind field of tempAlgoContainer to that number.

  4. Next up is the curly brace stuff: found an opening curly brace? Create a new empty temp String -> check if the next character is a digit -> yes? Then append it to the temp String, just like in step 2, until you find a comma or a closing curly brace.

  5. Found a comma? Parse the temp String to int and add it to the listOfNumbers field of tempAlgoContainer -> make the temp String empty again.

  6. Found a closing curly brace? Repeat the above step and break out of the loop. You are now ready to process whatever you want to do. Your data is ready.

Also, it's a good idea to have an instance method of AlgoContainer (call it whatever you want) to perform the binary search, so that you can simply loop through the ArrayList<AlgoContainer> and call that BS function on each element (no pun intended).
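A minimal sketch of the character-by-character idea might look like the following. It keeps the AlgoContainer name and its two fields from the answer, but the parse and search methods are assumptions made for this example, and the search delegates to the standard Collections.binarySearch instead of a hand-written one. A StringBuilder plays the role of the temp String.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class AlgoContainer {
    final int numberToFind;
    final List<Integer> listOfNumbers = new ArrayList<>();

    private AlgoContainer(int numberToFind) {
        this.numberToFind = numberToFind;
    }

    /** Walks the line one character at a time, roughly following the steps above. */
    static AlgoContainer parse(String line) {
        AlgoContainer container = null;
        StringBuilder digits = new StringBuilder(); // the "temp String"
        for (char c : line.toCharArray()) {
            if (Character.isDigit(c)) {
                digits.append(c);
            } else if (c == ')') {
                // Closing parenthesis: the digits collected so far are the number to find.
                container = new AlgoContainer(Integer.parseInt(digits.toString()));
                digits.setLength(0);
            } else if (c == ',' || c == '}') {
                if (container == null) {
                    throw new IllegalArgumentException("Array before target in: " + line);
                }
                // Comma or closing brace: the digits collected so far are one element.
                if (digits.length() > 0) {
                    container.listOfNumbers.add(Integer.parseInt(digits.toString()));
                    digits.setLength(0);
                }
                if (c == '}') {
                    break;
                }
            }
            // '(', '{' and spaces carry no data and are simply skipped.
        }
        if (container == null) {
            throw new IllegalArgumentException("No target found in: " + line);
        }
        return container;
    }

    /** Index of numberToFind in the (sorted) list, or a negative value if it is absent. */
    int search() {
        return Collections.binarySearch(listOfNumbers, numberToFind);
    }

    public static void main(String[] args) {
        AlgoContainer c = parse("(33) {1, 2, 5, 6, 11, 12, 13, 21, 25, 26, 30, 33, 60, 88, 99}");
        System.out.println(c.numberToFind + " found at index " + c.search()); // index 11
    }
}
```

Parsing into a ready-made container keeps the file handling separate from the search, so the loop over all lines only has to call parse and search.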

To read the file, you can use Files.readAllLines().

To actually parse each line, you can use something like this.

First, to make things easier, remove any whitespace from the line.
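For reference, reading the file with Files.readAllLines could look like this. The file name search_race.dat comes from the question; the class and method names are placeholders, and the sketch assumes the file is small enough to load into memory at once.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

final class ReadLinesExample {
    /** Reads every line of the file into memory (decoded as UTF-8). */
    static List<String> readLines(String fileName) throws IOException {
        return Files.readAllLines(Paths.get(fileName));
    }

    public static void main(String[] args) throws IOException {
        // "search_race.dat" is the file name used in the question.
        for (String line : readLines("search_race.dat")) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            System.out.println(line);
        }
    }
}
```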

// Strings are immutable, so keep the returned copy
String trimmed = line.replaceAll("\\s+", "");

This will essentially transform (8) {1, 4, 6, 8, 12, 22} into (8){1,4,6,8,12,22}.

Next, use a regular expression to validate the line. If the line does not match, no further action is required.

Expression: \([0-9]+\)\{[0-9]+(,[0-9]+)*}

  • \([0-9]+\) relates to (8) (above example)
  • \{[0-9]+(,[0-9]+)*} relates to {1,4,6,8,12,22}

If you don't understand the expression, head over here.

Finally, we can parse the string into its two components: the number to search for and the int[] with the actual values.
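The whitespace removal and validation steps could be combined as in the sketch below. The names LineValidator, stripWhitespace and isValid are invented for this example, and the pattern uses + quantifiers so that an empty () or {} is rejected before Integer.parseInt ever sees it.

```java
final class LineValidator {
    // '+' requires at least one digit in the parentheses and in every array slot.
    private static final String LINE_REGEX = "\\([0-9]+\\)\\{[0-9]+(,[0-9]+)*\\}";

    /** Removes all whitespace, e.g. "(8) {1, 4}" becomes "(8){1,4}". */
    static String stripWhitespace(String line) {
        return line.replaceAll("\\s+", "");
    }

    /** True if the line, once stripped of whitespace, has the expected shape. */
    static boolean isValid(String line) {
        return stripWhitespace(line).matches(LINE_REGEX);
    }

    public static void main(String[] args) {
        System.out.println(isValid("(8) {1, 4, 6, 8, 12, 22}")); // prints true
        System.out.println(isValid("() {}"));                    // prints false
    }
}
```

One caveat of stripping all whitespace first: a malformed entry such as "1 2" silently becomes "12" and passes validation.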

// start from index one to skip the opening parenthesis
int targetEnd = trimmed.indexOf(')', 1);
String searchString = trimmed.substring(1, targetEnd);
// the regex check ensures this contains only digits; parseInt can still throw
// if the value does not fit in an int
int numberToFind = Integer.parseInt(searchString);

// skip ')' and '{', align to the first value, skip the last '}'
String valuesString = trimmed.substring(targetEnd + 2, trimmed.length() - 1);
// split the array at ',' to get each value as string
int[] values = Arrays.stream(valuesString.split(","))
    .mapToInt(Integer::parseInt).toArray();

With both of these components parsed, you can do the binary search yourself.

Example code as Gist on GitHub

You could simply parse each line (the number to find and the array) as follows:
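For completeness, an iterative binary search over the parsed int[] might look like the sketch below. It is not the asker's own method, only an illustration; it assumes the array is sorted in ascending order, as in the sample file, and returns -1 when the number is absent.

```java
final class BinarySearchExample {
    /** Returns the index of target in the sorted array, or -1 if it is absent. */
    static int binarySearch(int[] values, int target) {
        int low = 0;
        int high = values.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2; // avoids int overflow of (low + high)
            if (values[mid] == target) {
                return mid;
            } else if (values[mid] < target) {
                low = mid + 1;  // target can only be in the upper half
            } else {
                high = mid - 1; // target can only be in the lower half
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] values = {2, 5, 6, 7, 10, 11, 24, 50, 65};
        System.out.println(binarySearch(values, 50)); // prints 7
        System.out.println(binarySearch(values, 1));  // prints -1
    }
}
```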

while (sc.hasNext()) {
    int numberToFind = Integer.parseInt(sc.next("\\(\\d+\\)").replaceAll("[()]", ""));

    int[] arrayToFindIn  = Arrays.stream(sc.nextLine().split("[ ,{}]"))
                                        .filter(x->!x.isEmpty())
                                        .mapToInt(Integer::parseInt)
                                        .toArray();

    // Apply your binary search ! Craft it by yourself or use a std one like below :
    // int positionInArray = Arrays.binarySearch(arrayToFindIn, numberToFind);
}

If you don't like the replaceAll, you could replace the first line in the loop with the two below:

    String toFindGroup = sc.next("\\(\\d+\\)");
    int numberToFind = Integer.parseInt(toFindGroup.substring(1, toFindGroup.length()-1));
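To try the loop without a file, it can be wrapped in a small program that feeds the Scanner from a String. The class name, the searchAll helper and the sample input are made up for this sketch. Note that Arrays.binarySearch returns a negative value, -(insertion point) - 1, when the number is not in the array.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;

final class ScannerSearchDemo {
    /** Runs the parsing loop over the scanner and returns one search result per line. */
    static int[] searchAll(Scanner sc) {
        List<Integer> results = new ArrayList<>();
        while (sc.hasNext()) {
            // First token of the line, e.g. "(8)"; strip the parentheses.
            String toFindGroup = sc.next("\\(\\d+\\)");
            int numberToFind = Integer.parseInt(toFindGroup.substring(1, toFindGroup.length() - 1));

            // Rest of the line, e.g. " {1, 4, 6, 8, 12, 22}".
            int[] arrayToFindIn = Arrays.stream(sc.nextLine().split("[ ,{}]"))
                    .filter(x -> !x.isEmpty())
                    .mapToInt(Integer::parseInt)
                    .toArray();

            results.add(Arrays.binarySearch(arrayToFindIn, numberToFind));
        }
        return results.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        String input = "(8) {1, 4, 6, 8, 12, 22}\n(1) {2, 3, 5, 6}\n";
        try (Scanner sc = new Scanner(input)) {
            System.out.println(Arrays.toString(searchAll(sc))); // prints [3, -1]
        }
    }
}
```

Swapping new Scanner(input) for new Scanner(new File("search_race.dat")) gives the file-based version from the question.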

Cheers!
