
DeepLearning4j k-means very slow

I'm trying to use DL4J's K-Means implementation. I set it up as follows:

int CLUSTERS = 5;
int MAX_ITERATIONS = 300;
String DISTANCE_METRIC = "cosinesimilarity";
KMeansClustering KMEANS = KMeansClustering.setup(CLUSTERS, MAX_ITERATIONS, DISTANCE_METRIC);

My data points are vectors of size 300 (doubles), and my test set comprises roughly 100 data points each time (give or take). I'm running it on my CPU (4 cores) in a single-threaded fashion.
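For context, the clustering is invoked roughly like this (a minimal sketch with random stand-in data; it assumes the Point/ClusterSet classes and the applyTo method from DL4J's clustering module, and the class and variable names are only illustrative, so details may differ between versions):

import org.deeplearning4j.clustering.cluster.ClusterSet;
import org.deeplearning4j.clustering.cluster.Point;
import org.deeplearning4j.clustering.kmeans.KMeansClustering;
import org.nd4j.linalg.factory.Nd4j;

import java.util.ArrayList;
import java.util.List;

public class KMeansTimingSketch {
    public static void main(String[] args) {
        // Same setup as above: 5 clusters, 300 max iterations, cosine similarity
        KMeansClustering kmeans = KMeansClustering.setup(5, 300, "cosinesimilarity");

        // ~100 random 300-dimensional vectors standing in for the real data
        List<Point> points = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            points.add(new Point(Nd4j.rand(1, 300)));
        }

        long start = System.currentTimeMillis();
        ClusterSet clusterSet = kmeans.applyTo(points);
        System.out.println("Clusters: " + clusterSet.getClusters().size()
                + ", took " + (System.currentTimeMillis() - start) + " ms");
    }
}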

Evaluation takes a very long time (a few seconds per example).

I took a peek inside the algorithm's implementation, and it looks like its concurrency level is very high: a lot of threads are being created (one per data point, to be exact) and executed in parallel. Perhaps this is overkill? Is there any way I can control it through configuration? Are there other ways to speed it up? If not, is there any other fast Java-based solution for executing k-means?

"DL4J supports GPUs and is compatible with distributed computing software such as Apache Spark and Hadoop." “DL4J 支持 GPU,并兼容分布式计算软件,如 Apache Spark 和 Hadoop。” from https://deeplearning4j.org来自https://deeplearning4j.org

An extra Spark or Hadoop instance might help with scaling performance; a rough Spark sketch follows.
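For example, a local Spark MLlib k-means run might look something like this (a rough sketch using org.apache.spark.ml.clustering.KMeans with toy random data; the class name, data generation, and seed are placeholders, not taken from the question):

import org.apache.spark.ml.clustering.KMeans;
import org.apache.spark.ml.clustering.KMeansModel;
import org.apache.spark.ml.linalg.VectorUDT;
import org.apache.spark.ml.linalg.Vectors;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.Metadata;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class SparkKMeansSketch {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("kmeans-sketch")
                .master("local[4]")          // use the 4 local cores
                .getOrCreate();

        // ~100 random 300-dimensional vectors standing in for the real data
        Random rng = new Random(42);
        List<Row> rows = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            double[] v = new double[300];
            for (int j = 0; j < v.length; j++) v[j] = rng.nextDouble();
            rows.add(RowFactory.create(Vectors.dense(v)));
        }
        StructType schema = new StructType(new StructField[]{
                new StructField("features", new VectorUDT(), false, Metadata.empty())});
        Dataset<Row> data = spark.createDataFrame(rows, schema);

        KMeans kmeans = new KMeans().setK(5).setMaxIter(300).setSeed(1L);
        KMeansModel model = kmeans.fit(data);
        System.out.println("Found " + model.clusterCenters().length + " cluster centers");

        spark.stop();
    }
}

Note that MLlib's k-means uses Euclidean distance by default; cosine distance is only supported in newer Spark versions, so check the version before relying on it for a cosine-similarity setup like the one in the question.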
