
How to use scikit-learn's SVM with histograms as features?

I wish to use scikit-learn's SVM with a chi-squared kernel, as shown here. In this scenario the kernel operates on histograms, which is how my data is represented. However, I can't find an example of it being used with histograms. What is the proper way to do this?

Is the correct approach to just treat the histogram as a vector, where each element in the vector corresponds to a bin of the histogram?

Thank you in advance

There is an example of using an approximate feature map here. It is for the RBF kernel, but the chi-squared case works the same way.

The example above uses "pipeline", but you can also just apply the transform to your data before handing it to a linear classifier, as AdditiveChi2Sampler doesn't actually fit to the data in any way.
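As a rough sketch of that second route: each histogram is one row of X (one column per bin), the rows are passed through AdditiveChi2Sampler, and the result goes to a linear classifier. The toy Dirichlet histograms, the choice of LinearSVC, and sample_steps=2 are my own assumptions for illustration, not something taken from the linked example.

```python
import numpy as np
from sklearn.kernel_approximation import AdditiveChi2Sampler
from sklearn.svm import LinearSVC

# Hypothetical toy data: 40 normalized histograms with 8 bins each.
# Class 0 puts most mass in the first bins, class 1 in the last bins.
rng = np.random.RandomState(0)
X_a = rng.dirichlet([5, 5, 1, 1, 1, 1, 1, 1], size=20)
X_b = rng.dirichlet([1, 1, 1, 1, 1, 1, 5, 5], size=20)
X = np.vstack([X_a, X_b])          # shape (n_samples, n_bins)
y = np.array([0] * 20 + [1] * 20)

# Approximate feature map for the additive chi-squared kernel.
# Input values must be non-negative, which histograms are.
sampler = AdditiveChi2Sampler(sample_steps=2)
X_mapped = sampler.fit_transform(X)

# Any linear classifier can be used on the mapped features.
clf = LinearSVC().fit(X_mapped, y)
train_accuracy = clf.score(X_mapped, y)
```

With sample_steps=2 each bin is expanded into 3 features, so the 8-bin histograms become 24-dimensional vectors. New data has to go through the same transform before calling predict.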

Keep in mind that this is just an approximation of the kernel map (one that I found to work quite well). If you want to use the exact kernel, you should go with ogrisel's answer.

sklearn.svm.SVC accepts custom kernels in two ways:

  • an arbitrary Python function passed as the kernel argument to the constructor
  • a precomputed kernel matrix passed as the first argument to fit, with kernel="precomputed" in the constructor

The former can be much slower, but it does not require allocating the whole kernel matrix in advance (which can be prohibitive for large n_samples).
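Here is a minimal sketch of both options for the exact chi-squared kernel, treating each histogram as a plain vector of bin values. It assumes sklearn.metrics.pairwise.chi2_kernel as the kernel function with its default gamma; the toy histograms are made up for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import chi2_kernel
from sklearn.svm import SVC

# Hypothetical toy data: normalized 4-bin histograms for two classes.
rng = np.random.RandomState(0)
X_train = np.vstack([rng.dirichlet([5, 5, 1, 1], size=15),
                     rng.dirichlet([1, 1, 5, 5], size=15)])
y_train = np.array([0] * 15 + [1] * 15)
X_test = np.vstack([rng.dirichlet([5, 5, 1, 1], size=5),
                    rng.dirichlet([1, 1, 5, 5], size=5)])

# Option 1: pass the kernel function itself to the constructor.
clf_callable = SVC(kernel=chi2_kernel).fit(X_train, y_train)
pred_callable = clf_callable.predict(X_test)

# Option 2: compute the kernel matrix yourself and pass it to fit.
K_train = chi2_kernel(X_train)            # shape (n_train, n_train)
clf_pre = SVC(kernel="precomputed").fit(K_train, y_train)

# At prediction time the matrix is test rows against training rows.
K_test = chi2_kernel(X_test, X_train)     # shape (n_test, n_train)
pred_precomputed = clf_pre.predict(K_test)
```

Both options use the same kernel values here, so they should give the same predictions. Note that chi2_kernel expects non-negative inputs, which is naturally the case for histograms.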

There are more details and links to examples in the documentation on custom kernels.

