
Efficient and scalable way to sort a large number of strings in Java

I am looking for ideas on sorting a large number of strings from an input file and writing the sorted results to a new file in Java. The input file could be extremely large, so the solution needs to take performance into account. Any ideas?

External sorting is the technique generally used to sort huge amounts of data. Maybe this is what you need.

externalsortinginjava is a Java library for this.

Is an SQL database available? If you inserted all the data into a table with the sortable column or section indexed, you may (or may not) be able to output the sorted result more efficiently. This solution may also be helpful if the amount of data exceeds the amount of RAM available.

It would be interesting to know how large the file is and what the purpose of the sort is.

Break the file into chunks that fit in memory. Sort each chunk and write it to its own file. (If everything fits into memory, you are done after the first sort.) Then merge the sorted files into a single sorted file.
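The split, sort, and merge steps described above can be sketched roughly as follows. This is only a minimal illustration, not a tuned implementation: the class name `ExternalSortSketch`, the `maxLinesPerRun` parameter (a line count standing in for "what fits in memory"), and the temp-file naming are all assumptions made for the example. It also assumes one string per line, UTF-8 input, and ordering by `String.compareTo`.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

public class ExternalSortSketch {

    /** One open sorted run plus the line currently at its head. */
    private static final class Run implements Comparable<Run> {
        final BufferedReader reader;
        String head;

        Run(BufferedReader reader) throws IOException {
            this.reader = reader;
            this.head = reader.readLine();
        }

        /** Moves to the next line; returns false once the run is exhausted. */
        boolean advance() throws IOException {
            head = reader.readLine();
            return head != null;
        }

        @Override
        public int compareTo(Run other) {
            return head.compareTo(other.head);
        }
    }

    /** Sorts the lines of input into output using temporary run files. */
    public static void sort(Path input, Path output, int maxLinesPerRun) throws IOException {
        List<Path> runs = splitIntoSortedRuns(input, maxLinesPerRun);
        try {
            mergeRuns(runs, output);
        } finally {
            for (Path run : runs) {
                Files.deleteIfExists(run);
            }
        }
    }

    /** Phase 1: read chunks of at most maxLinesPerRun lines, sort each, write each to a temp file. */
    static List<Path> splitIntoSortedRuns(Path input, int maxLinesPerRun) throws IOException {
        List<Path> runs = new ArrayList<>();
        List<String> buffer = new ArrayList<>();
        try (BufferedReader in = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                buffer.add(line);
                if (buffer.size() >= maxLinesPerRun) {
                    runs.add(writeRun(buffer));
                    buffer.clear();
                }
            }
        }
        if (!buffer.isEmpty()) {
            runs.add(writeRun(buffer));
        }
        return runs;
    }

    private static Path writeRun(List<String> lines) throws IOException {
        Collections.sort(lines);
        Path run = Files.createTempFile("sort-run-", ".txt");
        Files.write(run, lines, StandardCharsets.UTF_8);
        return run;
    }

    /** Phase 2: k-way merge; the heap always yields the run with the smallest head line. */
    static void mergeRuns(List<Path> runs, Path output) throws IOException {
        PriorityQueue<Run> heap = new PriorityQueue<>();
        List<BufferedReader> readers = new ArrayList<>();
        try (BufferedWriter out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            for (Path path : runs) {
                BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8);
                readers.add(reader);
                Run run = new Run(reader);
                if (run.head != null) {
                    heap.add(run);
                }
            }
            while (!heap.isEmpty()) {
                Run smallest = heap.poll();
                out.write(smallest.head);
                out.newLine();
                if (smallest.advance()) {
                    heap.add(smallest);
                }
            }
        } finally {
            for (BufferedReader reader : readers) {
                reader.close();
            }
        }
    }
}
```

A call such as `ExternalSortSketch.sort(Paths.get("input.txt"), Paths.get("sorted.txt"), 1_000_000)` would run both phases (the file names here are placeholders). The sketch opens every run at once during the merge, so a very large input with small runs could hit the operating system's open-file limit; merging in several passes would be one way around that.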

You can also use a form of radix sort to improve CPU efficiency, but the main bottleneck is disk I/O: all the rewriting and rereading of the data that you have to do.
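The answer does not say which form of radix sort it has in mind. As one possibility, the in-memory sort of each chunk could use three-way radix quicksort (also called multikey quicksort), which compares one character position at a time instead of whole strings. The sketch below is an illustration under that assumption; the class name `RadixSortSketch` is made up for the example, and the resulting order matches `String.compareTo` (UTF-16 code units, with shorter strings first among those sharing a prefix).

```java
public class RadixSortSketch {

    /** Sorts the array in place, in the same order as String.compareTo. */
    public static void sort(String[] a) {
        sort(a, 0, a.length - 1, 0);
    }

    /** Character at position d, or -1 past the end so shorter strings sort first. */
    private static int charAt(String s, int d) {
        return d < s.length() ? s.charAt(d) : -1;
    }

    private static void swap(String[] a, int i, int j) {
        String tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    /** Sorts a[lo..hi], assuming all of those strings already agree on their first d characters. */
    private static void sort(String[] a, int lo, int hi, int d) {
        if (hi <= lo) {
            return;
        }
        int lt = lo;
        int gt = hi;
        int i = lo + 1;
        int pivot = charAt(a[lo], d);
        // Three-way partition on the d-th character: less than, equal to, greater than the pivot.
        while (i <= gt) {
            int c = charAt(a[i], d);
            if (c < pivot) {
                swap(a, lt++, i++);
            } else if (c > pivot) {
                swap(a, i, gt--);
            } else {
                i++;
            }
        }
        sort(a, lo, lt - 1, d);
        if (pivot >= 0) {
            // Strings in the middle part share the d-th character, so move to the next position.
            sort(a, lt, gt, d + 1);
        }
        sort(a, gt + 1, hi, d);
    }
}
```

Because the method is recursive, inputs with very long shared prefixes could lead to deep recursion. Whether this actually beats `Arrays.sort` would need measuring on real data, and as the answer notes, file I/O is likely to dominate either way.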


 