What's the difference between using LIBSVM in scikit-learn, or e1071 in R, for training and using support vector machines?

Recently I was contemplating whether to use R or Python to train support vector machines.

Aside from the particular strengths and weaknesses intrinsic to each programming language, I'm wondering whether there are any heuristic guidelines for deciding which way to go, based on the packages themselves.

I'm thinking in terms of model training speed, scalability, availability of different kernels, and other such performance-related aspects.

Given some data sets of different sizes, how could one decide which path to take?

I apologize in advance for such a possibly vague question.

I do not have experience with e1071; however, from googling it, it seems that it either uses or is based on LIBSVM (I don't know enough R to determine which from the CRAN entry). scikit-learn also uses LIBSVM.

In both cases the model is going to be trained by LIBSVM. Speed, scalability, and the variety of options available are going to be exactly the same, and in using SVMs with these libraries the main limitations you will face are the limitations of LIBSVM.
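For the Python side, a minimal sketch of what this looks like in practice: scikit-learn's `SVC` exposes the standard LIBSVM kernels by name, so comparing them on your own data is a short loop. The synthetic data set, its size, and the default hyperparameters here are assumptions chosen only for illustration; timings on real data will differ.

```python
import time

from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Small synthetic problem; the sizes are arbitrary illustration values.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Kernels that scikit-learn's SVC accepts by name.
kernels = ["linear", "poly", "rbf", "sigmoid"]

results = {}
for kernel in kernels:
    model = SVC(kernel=kernel)
    start = time.perf_counter()
    model.fit(X, y)
    elapsed = time.perf_counter() - start
    # Accuracy on the training data, only to confirm the model was fitted.
    results[kernel] = (elapsed, model.score(X, y))

for kernel, (elapsed, accuracy) in results.items():
    print(f"{kernel:8s} fit: {elapsed:.4f}s  training accuracy: {accuracy:.3f}")
```

Rerunning the loop with larger `n_samples` gives a rough feel for how training time grows with data set size, which is probably the most direct way to answer the scalability part of the question for your own data.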

I think that giving further advice is going to be difficult unless you clarify a couple of things in your question: what is your objective? Do you already know LIBSVM? Is this a learning project? Who is paying for your time? Do you feel more comfortable in Python or in R?

Some time back I had the same question. Yes, both e1071 and scikit-learn use LIBSVM. I have experience with e1071 only.

But there are some areas where R is better. I have read in the past that Python does not handle categorical features properly (at least not right out of the box). This could be a big deal for some.
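To make the categorical-feature point concrete: an SVM needs numeric input, so in Python a categorical column generally has to be encoded explicitly before training, whereas R's formula interface usually expands factors into indicator columns for you. Below is a minimal pure-Python sketch of one-hot encoding. The helper `one_hot_encode` and the sample rows are hypothetical, written only for illustration; in practice scikit-learn's `OneHotEncoder` or `pandas.get_dummies` would be the usual tools.

```python
def one_hot_encode(rows, column):
    """Replace one categorical column with 0/1 indicator columns.

    Hypothetical helper for illustration; not part of any library.
    """
    # Sort the distinct values so the column order is deterministic.
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every field except the categorical one.
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}={category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


rows = [
    {"size": 1.0, "color": "red"},
    {"size": 2.5, "color": "blue"},
    {"size": 0.7, "color": "red"},
]

encoded = one_hot_encode(rows, "color")
print(encoded[0])  # {'size': 1.0, 'color=blue': 0, 'color=red': 1}
```

After this step every feature is numeric, so the rows can be turned into a matrix and passed to an SVM implementation.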

I also prefer R's formula interface, and some of its nice data manipulation packages.

Python is definitely better for general-purpose programming, and scikit-learn helps in using a single programming language for all tasks.

