
Optimizing IO in C++

I'm having trouble optimizing a C++ program for the fastest runtime possible.

The program must output the absolute value of the difference of two long integers, which are fed into the program through a file, i.e.:

./myprogram < unknownfilenamefullofdata

The file name is unknown, and the file has 2 numbers per line, separated by a space. There is an unknown amount of test data. I created 2 files of test data. One has the extreme cases and is 5 runs long. For the other, I used a Java program to generate 2,000,000 random numbers and wrote them to a timedrun file: 18 MB worth of tests.

The massive file runs in 3.4 seconds. I need to get that down to 1.1 seconds.

This is my code:

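#include <cstdio>

int main() {
    long int a, b;
    // scanf returns the number of fields converted; stop unless both were read
    while (scanf("%li %li", &a, &b) == 2) {
        if (b >= a)
            printf("%li\n", b - a);
        else
            printf("%li\n", a - b);
    } // end while
    return 0;
} // end main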

I ran Valgrind on my program, and it showed that most of the time was spent in the read and write portions. How would I rewrite the print/scan calls in the rawest form C++ allows if I know that I'm only going to be receiving numbers? Is there a way to scan a number in as a binary number and manipulate the data with logical operations to compute the difference? I was also told to consider writing a buffer, but after ~6 hours of searching the web and attempting the code, I was unsuccessful.

Any help would be greatly appreciated.

What you need to do is load the whole string into memory, and then extract the numbers from there, rather than making repeated I/O calls. However, what you may well find is that it simply takes a lot of time to load 18MB off the hard drive.
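A minimal sketch of that idea, assuming the values fit in a long and their difference does not overflow: read the input stream in large chunks, parse the pairs from memory with strtol, and build the whole output before writing it once. The names slurp and abs_diffs are illustrative and do not come from the original post.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Read everything available on `in` into one string using large fread calls.
std::string slurp(std::FILE* in) {
    std::string data;
    char chunk[1 << 16];
    std::size_t n;
    while ((n = std::fread(chunk, 1, sizeof chunk, in)) > 0)
        data.append(chunk, n);
    return data;
}

// Parse "a b" pairs from `text` and return one absolute difference per line.
std::string abs_diffs(const std::string& text) {
    std::string out;
    const char* p = text.c_str();  // NUL-terminated, so strtol stops safely
    char* end = nullptr;
    for (;;) {
        long a = std::strtol(p, &end, 10);
        if (end == p) break;       // no more numbers
        p = end;
        long b = std::strtol(p, &end, 10);
        if (end == p) break;       // a first number without a partner: ignore it
        p = end;
        out += std::to_string(a >= b ? a - b : b - a);
        out += '\n';
    }
    return out;
}
```

A main function would compute abs_diffs(slurp(stdin)) and pass the result to a single std::fwrite on stdout, so the program issues a handful of large reads and one write instead of millions of scanf and printf calls. Whether that alone reaches the 1.1 second target would have to be measured.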

You can improve greatly on scanf because you can guarantee the format of your file. Since you know exactly what the format is, you don't need as many error checks. Also, printf does a conversion on the new line to the appropriate line break for your platform.

I have used code similar to that found in this SPOJ forum post (see nosy's post half-way down the page) to obtain quite large speed-ups when reading integers. You will need to modify it to deal with negative numbers. Hopefully it will give you some ideas about how to write a faster printf function as well, but I would start by replacing scanf and see how far that gets you.
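As an illustration of what such a replacement might look like (this is not the code from the SPOJ post), the sketch below parses signed decimal integers straight from a character buffer and formats the results by hand. It assumes well-formed input whose values stay clear of LONG_MIN; read_long, write_ulong, and solve are hypothetical names.

```cpp
#include <string>

// Parse one signed decimal integer from [p, end), skipping leading separators,
// and advance `p` past it. Returns false when no number remains.
bool read_long(const char*& p, const char* end, long& value) {
    while (p < end && *p != '-' && (*p < '0' || *p > '9')) ++p;
    if (p == end) return false;
    bool negative = false;
    if (*p == '-') { negative = true; ++p; }
    if (p == end || *p < '0' || *p > '9') return false;  // lone '-' is not a number
    long v = 0;
    while (p < end && *p >= '0' && *p <= '9') v = v * 10 + (*p++ - '0');
    value = negative ? -v : v;
    return true;
}

// Append the decimal form of `v` followed by a newline.
void write_ulong(std::string& out, unsigned long v) {
    char tmp[24];
    int n = 0;
    do { tmp[n++] = static_cast<char>('0' + v % 10); v /= 10; } while (v != 0);
    while (n > 0) out += tmp[--n];  // digits were produced in reverse order
    out += '\n';
}

// Compute |a - b| for every pair in `input`, one result per line.
std::string solve(const std::string& input) {
    const char* p = input.data();
    const char* end = p + input.size();
    std::string out;
    long a = 0, b = 0;
    while (read_long(p, end, a) && read_long(p, end, b)) {
        // Unsigned subtraction avoids signed overflow for distant values.
        unsigned long ua = static_cast<unsigned long>(a);
        unsigned long ub = static_cast<unsigned long>(b);
        write_ulong(out, a >= b ? ua - ub : ub - ua);
    }
    return out;
}
```

The parser skips the error handling that scanf performs, which is only reasonable because the input format is guaranteed.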

As you suggest, the problem is reading all these numbers in and converting them from text to binary.

The best improvement would be to have whatever program generates the numbers write them out as binary. This will significantly reduce the amount of data that has to be read from the disk, and slightly reduce the time needed to convert from text to binary.

You say that 2,000,000 numbers occupy 18 MB, which is 9 bytes per number. This includes the spaces and the end-of-line markers, so that sounds reasonable.

Storing the numbers as 4-byte integers will halve the amount of data that must be read from the disk. Along with the saving on format conversion, it would be reasonable to expect a doubling of performance.
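A sketch of the binary approach, assuming every value fits in a 32-bit signed integer and that the file is written and read on machines with the same byte order. The function names are illustrative.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Write the values as raw 4-byte integers in the machine's native byte order.
bool write_binary(std::FILE* out, const std::vector<std::int32_t>& values) {
    return std::fwrite(values.data(), sizeof(std::int32_t), values.size(), out)
           == values.size();
}

// Read a whole binary stream back; no text-to-integer conversion is needed.
std::vector<std::int32_t> read_binary(std::FILE* in) {
    std::vector<std::int32_t> values;
    std::int32_t chunk[4096];
    std::size_t n;
    while ((n = std::fread(chunk, sizeof chunk[0], 4096, in)) > 0)
        values.insert(values.end(), chunk, chunk + n);
    return values;
}

// Absolute difference of each consecutive pair (v[0], v[1]), (v[2], v[3]), ...
std::vector<std::uint32_t> pair_diffs(const std::vector<std::int32_t>& v) {
    std::vector<std::uint32_t> out;
    for (std::size_t i = 0; i + 1 < v.size(); i += 2) {
        // Widen to 64 bits so the subtraction cannot overflow.
        std::int64_t d = static_cast<std::int64_t>(v[i]) - v[i + 1];
        out.push_back(static_cast<std::uint32_t>(d < 0 ? -d : d));
    }
    return out;
}
```

This only helps if the generator can be changed to emit binary; the original task feeds text through standard input.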

Since you need even more, something more radical is required. You should consider splitting the data file into separate files, each on its own disk, and then processing each file in its own process. If you have 4 cores, split the processing into 4 separate processes, and can connect 4 high-performance disks, then you might hope for another doubling of the performance. The bottleneck is now the OS disk management, and it is impossible to guess how well the OS will manage the four disks in parallel.

I assume that this is a grossly simplified model of the processing you need to do. If your description is all there is to it, the real solution would be to do the subtraction in the program that writes the test files!

Even better than opening the file in your program and reading it all at once would be memory-mapping it. ~18MB is no problem for the ~2GB address space available to your program.

Then use strtol (strtod is meant for floating-point values) to read a number and advance the pointer.

I'd expect a 5-10x speedup compared to input redirection and scanf.
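A POSIX-only sketch of this approach, assuming the descriptor refers to a regular file that ends with a newline, so that the parser always meets a non-digit byte inside the mapping. It parses with strtol, the integer counterpart of strtod; diffs_in_range and diffs_from_fd are illustrative names.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <sys/mman.h>
#include <sys/stat.h>

// Parse "a b" pairs from [p, end) and return one absolute difference per line.
// Assumes a non-digit byte (such as the final newline) follows the last number.
std::string diffs_in_range(const char* p, const char* end) {
    std::string out;
    // Drop trailing whitespace so the loop ends exactly at the last number.
    while (end > p && std::isspace(static_cast<unsigned char>(end[-1]))) --end;
    while (p < end) {
        char* next = nullptr;
        long a = std::strtol(p, &next, 10);
        if (next == p || next >= end) break;  // no number, or no partner for it
        p = next;
        long b = std::strtol(p, &next, 10);
        if (next == p) break;
        p = next;
        out += std::to_string(a >= b ? a - b : b - a);
        out += '\n';
    }
    return out;
}

// Map the file behind `fd` (for example 0 for redirected stdin) and process it.
// Returns an empty string if the descriptor is not a mappable, non-empty file.
std::string diffs_from_fd(int fd) {
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) return std::string();
    std::size_t size = static_cast<std::size_t>(st.st_size);
    void* base = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (base == MAP_FAILED) return std::string();
    const char* text = static_cast<const char*>(base);
    std::string out = diffs_in_range(text, text + size);
    munmap(base, size);
    return out;
}
```

Mapping works with ./myprogram < file because the shell hands the program a descriptor for the file itself; it would not work if the input arrived through a pipe, in which case a fallback to ordinary reads would be needed.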
