Read two files (text), compare them for common values, and output the string?

Question: I have two files. One contains a list of serial numbers, items, prices, and locations; the other contains items. I would like to compare the two files and print the number of times each item is repeated in file 1, along with its serial number.

Text1 file will have:

[image not preserved]

Text2 file will have:

[image not preserved]

Output should be:

[image not preserved]

File 1 is not formatted in a consistent order, while file 2 is in order (one item per line).

Since you have shown no code or effort so far, I'll only point you to some tools you can use.

For parsing strings: http://docs.oracle.com/javase/6/docs/api/java/lang/String.html

For reading in from a file: http://www.roseindia.net/java/beginners/java-read-file-line-by-line.shtml

I would also recommend reading file #2 first and saving those values to an ArrayList, perhaps, so you can iterate through them later when you do your searching.
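As a rough illustration of that last suggestion, here is a minimal sketch that reads one item per line into an ArrayList. The class name `ItemListReader`, the method `readItems`, and the sample contents are made up for this example; the one-item-per-line layout of file #2 is taken from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class ItemListReader {
    // Reads one item name per line, trimming whitespace and skipping blank lines.
    static List<String> readItems(BufferedReader reader) {
        return reader.lines()
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .collect(Collectors.toCollection(ArrayList::new));
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample standing in for file #2. For a real file, a reader
        // from java.nio.file.Files.newBufferedReader(...) could be passed instead.
        String sample = "pen\nbook\n\nbag\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            List<String> items = readItems(reader);
            for (String item : items) {
                System.out.println(item);
            }
        }
    }
}
```

The resulting list can then be iterated once per search against the contents of file #1.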

Okay, my approach to this would be:

  1. Read file1 and file2 into strings.
  2. Split the strings from file 1 and file 2 on "," if that is the delimiter being used.
  3. Check for the item in every 3rd element, so the loop index advances by 3 each time. (You might need to sort both if they are not in order.)
  4. If found, store it in an array, ArrayList, etc. Go back to step 3 if more items are present; otherwise stop.
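The steps above might be sketched roughly as follows. This assumes that the item name sits at a fixed position within each record of file 1, which the question suggests may not actually hold; the `offset` and `stride` parameters, the class name `StrideMatcher`, and the sample data are all hypothetical. Tokens are split on commas or line breaks so that a line-by-line file 2 also works.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class StrideMatcher {
    // Splits text on commas or line breaks, trimming tokens and dropping empty ones.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.split("[,\\r\\n]+")) {
            String token = raw.trim();
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Visits every `stride`-th token of file 1, starting at `offset`, and keeps
    // the tokens that also appear in file 2.
    static List<String> findMatches(String file1Text, String file2Text, int offset, int stride) {
        Set<String> wanted = new HashSet<>(tokenize(file2Text));
        List<String> tokens = tokenize(file1Text);
        List<String> found = new ArrayList<>();
        for (int i = offset; i < tokens.size(); i += stride) {
            if (wanted.contains(tokens.get(i))) {
                found.add(tokens.get(i));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Hypothetical data: four fields per record, item name in the second position.
        String file1 = "1,pen,$1.50,NY,2,book,$3.00,CA,3,pen,$1.25,TX";
        String file2 = "pen\nbag";
        System.out.println(findMatches(file1, file2, 1, 4));
    }
}
```

If the fields of file 1 are not in a fixed order, a fixed stride will miss items, and each token would need to be identified by its shape instead.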

Even though your file1 is not well formatted, its content follows patterns that you can use to read it successfully.

Each item has all of its information (i.e. serial number, name, price, location), but not in a fixed order. So you have to pay attention to the following patterns as you read each item from file1:

  • The serial number is always a plain integer.
  • The price contains the $ and . characters.
  • The location is 2 characters long, all capitals.
  • The name is a string that matches none of the above.
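One way to turn these four rules into code is a small classifier that tests each token against a regular expression. This is only a sketch: the class name `FieldClassifier`, the `Kind` enum, and the exact price format (a `$` followed by digits, a dot, and more digits) are assumptions made for this example, since the sample files are not available.

```java
import java.util.regex.Pattern;

class FieldClassifier {
    enum Kind { SERIAL, PRICE, LOCATION, NAME }

    // Serial number: digits only.
    private static final Pattern SERIAL = Pattern.compile("\\d+");
    // Price: assumed to look like $12.50 (the exact format is a guess).
    private static final Pattern PRICE = Pattern.compile("\\$\\d+\\.\\d+");
    // Location: exactly two capital letters.
    private static final Pattern LOCATION = Pattern.compile("[A-Z]{2}");

    // Classifies a single token; anything matching no pattern is treated as a name.
    static Kind classify(String token) {
        String t = token.trim();
        if (SERIAL.matcher(t).matches()) {
            return Kind.SERIAL;
        }
        if (PRICE.matcher(t).matches()) {
            return Kind.PRICE;
        }
        if (LOCATION.matcher(t).matches()) {
            return Kind.LOCATION;
        }
        return Kind.NAME;
    }

    public static void main(String[] args) {
        // Hypothetical tokens in an arbitrary order.
        for (String token : new String[] {"NY", "pen", "$1.50", "17"}) {
            System.out.println(token + " -> " + classify(token));
        }
    }
}
```

Note that an item whose name happens to be two capital letters would be classified as a location under these rules, so real data may need an extra check.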

Such problems are not best solved by monolithic Java code. If you have no tool constraints, the recommended approach is to import the data from file 1 into a database table and then run queries from your program to fetch whatever information you like. You can easily select serial numbers by item and group them by location to get counts.

This approach lets you keep up with changing requirements, and if your files are huge you will get good performance.

I hope you are well versed in SQL and DB tools, so I have not posted any details on them.

Use regex.

Step one: scan the text, splitting at [\d,], and store the results in a map.
Step two: read in the word from the second file; say it's "pen".
Step three: run a regex search for "pen" on each string in the map.
Step four: if the above returns true, apply something like ([A-Z][A-Z],) to each string in the map.
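Steps two through four could look something like the sketch below. It assumes the records of file 1 have already been separated into one string each, and it swaps the suggested `([A-Z][A-Z],)` pattern for `\b[A-Z]{2}\b` so that a location at the end of a record (with no trailing comma) is still found. The class name `RegexItemSearch`, the method `countByLocation`, and the sample records are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexItemSearch {
    // Two capital letters standing alone, taken to be the location code.
    private static final Pattern LOCATION = Pattern.compile("\\b[A-Z]{2}\\b");

    // For every record that contains `word` as a whole word, extracts the first
    // location code and counts how often each code occurs.
    static Map<String, Integer> countByLocation(List<String> records, String word) {
        Pattern wordPattern = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String record : records) {
            if (!wordPattern.matcher(record).find()) {
                continue; // this record does not mention the item
            }
            Matcher location = LOCATION.matcher(record);
            if (location.find()) {
                counts.merge(location.group(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical records with fields in varying order.
        List<String> records = Arrays.asList(
                "1,pen,$1.50,NY",
                "2,book,$3.00,CA",
                "NY,$2.00,pen,3");
        System.out.println(countByLocation(records, "pen"));
    }
}
```

`Pattern.quote` keeps any regex metacharacters in the search word from being interpreted, and the word boundaries prevent "pen" from matching inside a longer word such as "pencil".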
