
The Efficiency of Hard-Coding vs. File Input

I'm working on a machine learning project in Java that involves a very large model (the output of a Support Vector Machine, for those of you familiar with that), which will need to be retrieved fairly frequently for use by the end user. The bulk of the model consists of a large two-dimensional array of fairly small objects.

Unfortunately, I do not know exactly how large the model is going to be (I've been working with benchmark data so far, and the data I'm actually going to be using isn't ready yet), nor do I know the specifications of the machine it will run on, as that is also up in the air.

I already have a method to write the model to a file as a string, but the write process takes a great deal of time and the read process takes the better part of a minute. I'd like to cut down on that time, so I had the either bright or insanely convoluted idea of writing the model to a .java file in such a way that it could be compiled and then run to produce a fully formed model.

My questions to you are: will storing and compiling the model in Java be significantly faster than reading it from the file, under the assumption that the model is about 1 MB in size? And is there some reason I haven't seen yet why this could be a fantastically stupid idea that I should not pursue under any circumstances?

Thank you for any ideas you can give me.

EDIT: Apparently, trying to automatically write several thousand values into code produces a method that is roughly two orders of magnitude larger than the compiler can handle. Ah well, live and learn.

You might consider creating a compact binary format for your data instead of writing strings or Java files.
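A minimal sketch of what such a binary format could look like, using `DataOutputStream` and `DataInputStream` from the standard library. The question does not say what the "fairly small objects" are, so the element type (`double`), the class name `BinaryModelIO`, and the file layout are all assumptions made for illustration.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper: element type and layout are assumptions, not the asker's model.
class BinaryModelIO {

    // Layout: row count, then for each row its length followed by its values.
    static void write(Path file, double[][] model) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(file)))) {
            out.writeInt(model.length);
            for (double[] row : model) {
                out.writeInt(row.length);
                for (double v : row) {
                    out.writeDouble(v);
                }
            }
        }
    }

    // Reads the layout produced by write().
    static double[][] read(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            double[][] model = new double[in.readInt()][];
            for (int i = 0; i < model.length; i++) {
                model[i] = new double[in.readInt()];
                for (int j = 0; j < model[i].length; j++) {
                    model[i][j] = in.readDouble();
                }
            }
            return model;
        }
    }

    public static void main(String[] args) throws IOException {
        double[][] model = {{1.0, 2.5}, {3.0, -4.0, 0.125}};
        Path file = Files.createTempFile("model", ".bin");
        write(file, model);
        // Round-trip check: the model read back should equal the one written.
        System.out.println(Arrays.deepEquals(model, read(file)));
        Files.delete(file);
    }
}
```

Each value is stored as 8 raw bytes, so nothing has to be parsed from text on the way back in. If the real elements have several fields, the same idea would apply with one write call per field.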

The question, IMHO, is whether reading the file takes so long because of I/O or because of computing time (i.e. CPU). If the latter is the case, then tough luck. If your I/O (e.g. the hard disk) is the cause, then you can compress the file and extract it after or while reading. There is (of course) ZIP support in Java (even for streams).
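As a rough sketch of the compression suggestion, the stream classes in `java.util.zip` can wrap an ordinary file stream so that data is compressed while writing and decompressed while reading. The class name `GzipModelIO` and the choice of GZIP over a ZIP archive are assumptions for illustration; whether this helps at all depends on how compressible the model is and on whether I/O really is the bottleneck.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: stores an already serialized model as a GZIP file.
class GzipModelIO {

    // Compresses the payload while it is being written.
    static void writeCompressed(Path file, byte[] payload) throws IOException {
        try (OutputStream out = new GZIPOutputStream(
                new BufferedOutputStream(Files.newOutputStream(file)))) {
            out.write(payload);
        }
    }

    // Decompresses while reading; readAllBytes() needs Java 9 or later.
    static byte[] readCompressed(Path file) throws IOException {
        try (InputStream in = new GZIPInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            return in.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[10_000]; // stand-in for serialized model bytes
        Path file = Files.createTempFile("model", ".gz");
        writeCompressed(file, payload);
        System.out.println("bytes on disk: " + Files.size(file));
        System.out.println("round trip ok: " + Arrays.equals(payload, readCompressed(file)));
        Files.delete(file);
    }
}
```

Compression trades disk reads for CPU work, so if the load is CPU-bound it is likely to make things slower rather than faster.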

Will storing and compiling the model in Java be significantly faster than reading it from the file?

That depends on how you design the custom data structure that holds your model.

I agree with the answer given above to use a binary input format. Let's try optimising that first. Can you provide some information? ...or have you googled working with binary data? ...buffering it? etc.?
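One possible reading of "buffering" is to pull the whole file into memory in a single call and decode it in bulk through a `ByteBuffer`, rather than reading value by value. The sketch below assumes a rectangular array of `double` values and a made-up layout (row count, column count, then the values); the class name `BulkModelIO` is hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.DoubleBuffer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper: assumes every row has the same length.
class BulkModelIO {

    // Layout: rows (int), cols (int), then rows * cols doubles in big-endian order.
    static void write(Path file, double[][] model) throws IOException {
        int rows = model.length;
        int cols = rows == 0 ? 0 : model[0].length;
        ByteBuffer buf = ByteBuffer.allocate(2 * Integer.BYTES + rows * cols * Double.BYTES);
        buf.putInt(rows).putInt(cols);
        for (double[] row : model) {
            for (double v : row) {
                buf.putDouble(v);
            }
        }
        Files.write(file, buf.array()); // one write call for the whole model
    }

    static double[][] read(Path file) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(Files.readAllBytes(file)); // one read call
        int rows = buf.getInt();
        int cols = buf.getInt();
        DoubleBuffer values = buf.asDoubleBuffer(); // view starting after the header
        double[][] model = new double[rows][cols];
        for (double[] row : model) {
            values.get(row); // bulk copy of one row
        }
        return model;
    }

    public static void main(String[] args) throws IOException {
        double[][] model = {{1.0, 2.5}, {3.0, -4.0}};
        Path file = Files.createTempFile("model", ".bin");
        write(file, model);
        System.out.println(Arrays.deepEquals(model, read(file)));
        Files.delete(file);
    }
}
```

For a model of about 1 MB, holding the whole file in memory at once is unlikely to be a problem; for much larger models a streaming or memory-mapped approach may be a better fit.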

Writing a .java file and compiling it will be quite interesting... but it is bound to give you issues at some point. In any case, I think you will find that it is slightly slower than an optimised binary format, but faster than text-based input.

Also, be very wary of premature optimisation. Usually, "highly configurable" and "blindingly fast" are mutually exclusive. Rather, get everything to work first and then use a profiler to optimise the really slow sections of the application.
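Short of a full profiler, a crude timing split can already hint at whether the load time goes into disk I/O or into parsing. The sketch below is not a substitute for a profiler, and the text format it parses (one row per line, values separated by single spaces) is an assumption; the asker's actual string format is not shown.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical timing harness: separates reading bytes from parsing them.
class LoadTiming {

    // Assumed text format: one row per line, values separated by single spaces.
    static double[][] parse(String text) {
        String[] lines = text.split("\n");
        double[][] model = new double[lines.length][];
        for (int i = 0; i < lines.length; i++) {
            String[] parts = lines[i].trim().split(" ");
            model[i] = new double[parts.length];
            for (int j = 0; j < parts.length; j++) {
                model[i][j] = Double.parseDouble(parts[j]);
            }
        }
        return model;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("model", ".txt");
        Files.write(file, "1.0 2.5\n3.0 -4.0\n".getBytes(StandardCharsets.UTF_8));

        long t0 = System.nanoTime();
        byte[] raw = Files.readAllBytes(file);                               // I/O only
        long t1 = System.nanoTime();
        double[][] model = parse(new String(raw, StandardCharsets.UTF_8));   // CPU only
        long t2 = System.nanoTime();

        System.out.printf("read:  %d us%n", (t1 - t0) / 1_000);
        System.out.printf("parse: %d us (%d rows)%n", (t2 - t1) / 1_000, model.length);
        Files.delete(file);
    }
}
```

Single measurements like this are noisy (JIT warm-up, file system caching), so it is worth repeating the run a few times on a realistically sized file before drawing conclusions.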

