Unix sort command takes much longer depending on where it is executed?! (fastest from ProcessBuilder in program run from IDE, slowest from terminal)
I have a Java program that uses ProcessBuilder to call the Unix sort command. When I run this code within my IDE (IntelliJ) it takes only about a second to sort 500,000 lines. When I package it into an executable jar and run that from the terminal, it takes about 10 seconds. When I run the sort command myself from the terminal, it takes 20 seconds!
Why is there such a vast difference in performance, and is there any way I can get the jar to run as fast as the code does from the IDE? The environment is OS X 10.6.8 and Java 1.6.0_26. The bottom of the sort man page says "sort 5.93 November 2004".
The command it is executing is:
sort -t' ' -k5,5f -k4,4f -k1,1n /path/to/input/file -o /path/to/output/file
Note that when I run sort from the terminal I need to manually escape the tab delimiter and use the argument -t$'\t' instead of the actual tab (which I can pass to ProcessBuilder).
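For reference, the ProcessBuilder invocation described above might look roughly like the following. This is only a sketch, not the asker's actual code: the class name SortLauncher and the helper names buildCommand and runAndTime are made up for illustration. Because no shell sits between Java and sort, the tab can be passed as a literal character with no $'\t' quoting.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not taken from the question.
class SortLauncher {
    // Builds the argument list. Each element reaches sort verbatim, so the
    // field separator is a real tab character attached to -t.
    static List<String> buildCommand(String inputPath, String outputPath) {
        return Arrays.asList(
                "sort",
                "-t\t",                        // -t followed by one literal tab
                "-k5,5f", "-k4,4f", "-k1,1n",  // same keys as the question
                inputPath,
                "-o", outputPath);
    }

    // Runs sort and returns the elapsed wall-clock time in milliseconds.
    static long runAndTime(String inputPath, String outputPath)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(buildCommand(inputPath, outputPath));
        pb.inheritIO(); // Java 7+; on Java 6 the child's streams must be drained by hand
        long start = System.nanoTime();
        int exitCode = pb.start().waitFor();
        long elapsedMs = (System.nanoTime() - start) / 1000000L;
        if (exitCode != 0) {
            throw new IOException("sort exited with status " + exitCode);
        }
        return elapsedMs;
    }
}
```

Calling SortLauncher.runAndTime with the input and output paths from both the IDE and the packaged jar would give directly comparable timings for the same argument list.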
Looking at ps, everything seems the same, except that when run from the IDE the sort command has a TTY of ?? instead of ttys000. From this question, though, I don't think that should make a difference. Perhaps Bash is slowing me down? I am running out of ideas and want to close this 20x performance gap!
I'm going to venture two guesses:
perhaps you are invoking different versions of sort (run which sort in each environment and use the full absolute path to compare again?)
perhaps you are using more complicated locale settings (leading to more complicated character set handling etc.)? Try
export LANG=C
sort -t' ' -k5,5f -k4,4f -k1,1n /input/file -o /output/file
to compare.
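Both guesses can also be tested from the Java side by pinning the binary and the locale for the child process. The sketch below is an assumption-laden illustration: the class name LocaleControlledSort is invented, and /usr/bin/sort is only a placeholder for whatever which sort prints on your machine. It sets LC_ALL rather than LANG because LC_ALL takes precedence over LANG and the other LC_* variables when it is set.

```java
import java.util.Arrays;
import java.util.Map;

// Hypothetical helper, not taken from the answer.
class LocaleControlledSort {
    // Placeholder path: substitute the output of `which sort`.
    static final String SORT_BINARY = "/usr/bin/sort";

    // Returns a ProcessBuilder that runs a fixed sort binary in the C locale.
    static ProcessBuilder newSortProcess(String inputPath, String outputPath) {
        ProcessBuilder pb = new ProcessBuilder(Arrays.asList(
                SORT_BINARY,
                "-t\t",                        // -t followed by one literal tab
                "-k5,5f", "-k4,4f", "-k1,1n",
                inputPath,
                "-o", outputPath));
        Map<String, String> env = pb.environment();
        // Affects only the child process; requests plain byte-wise collation.
        env.put("LC_ALL", "C");
        return pb;
    }
}
```

Calling start() and then waitFor() on the returned builder runs the sort. Keep in mind that the C locale compares bytes, so lines containing non-ASCII text may come out in a different order than under a language-specific locale.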
Have a look at this project: http://code.google.com/p/externalsortinginjava/
It avoids the need to call the external sort entirely.
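To give a feel for what sorting inside the JVM involves, here is a minimal in-memory sketch using only the standard library. It does not use the linked project's API, and it is not an external (disk-backed) sort, so it assumes the whole file fits in memory. The class name InMemoryTsvSort is invented, and the comparator only approximates -k5,5f -k4,4f -k1,1n: case folding and numeric parsing differ from sort in edge cases, and there is no last-resort whole-line comparison.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper; approximates the sort keys used in the question.
class InMemoryTsvSort {
    // Returns the 1-based field of a tab-separated line, or "" if it is missing.
    static String field(String line, int oneBasedIndex) {
        String[] parts = line.split("\t", -1);
        return oneBasedIndex <= parts.length ? parts[oneBasedIndex - 1] : "";
    }

    // Parses the whole field as an integer; anything unparsable counts as 0
    // (a simplification compared with sort -n).
    static long numeric(String s) {
        try {
            return Long.parseLong(s.trim());
        } catch (NumberFormatException e) {
            return 0L;
        }
    }

    // Field 5 ignoring case, then field 4 ignoring case, then field 1 numerically.
    static final Comparator<String> BY_KEYS = new Comparator<String>() {
        public int compare(String a, String b) {
            int c = field(a, 5).compareToIgnoreCase(field(b, 5));
            if (c != 0) return c;
            c = field(a, 4).compareToIgnoreCase(field(b, 4));
            if (c != 0) return c;
            long x = numeric(field(a, 1));
            long y = numeric(field(b, 1));
            return x < y ? -1 : (x > y ? 1 : 0);
        }
    };

    // Returns a sorted copy and leaves the input list untouched.
    static List<String> sortLines(List<String> lines) {
        List<String> copy = new ArrayList<String>(lines);
        Collections.sort(copy, BY_KEYS);
        return copy;
    }
}
```

Splitting each line on every comparison keeps the sketch short but is wasteful; for 500,000 lines it would be worth extracting the three keys once per line before sorting.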