
How can I set the cost and gamma parameters in libsvm (MATLAB) to improve accuracy?

I use libsvm to classify a dataset that contains 1000 labels. I am new to libsvm, and I am having trouble choosing the parameters c and g to improve performance. First, here is the program that I use to select the parameters:

    bestcv = 0;
    for log2c = -1:3
        for log2g = -4:1
            % 5-fold cross-validation at C = 2^log2c, gamma = 2^log2g
            cmd = ['-v 5 -c ', num2str(2^log2c), ' -g ', num2str(2^log2g)];
            cv = svmtrain(yapp, xapp, cmd);   % with -v, svmtrain returns CV accuracy
            if (cv >= bestcv)
                bestcv = cv; bestc = 2^log2c; bestg = 2^log2g;
            end
            fprintf('%g %g %g (best c=%g, g=%g, rate=%g)\n', ...
                    log2c, log2g, cv, bestc, bestg, bestcv);
        end
    end

As a result, this program gives c = 8 and g = 2, and when I use these values of c and g, I get an accuracy rate of 55%. For classification, I use SVM one-against-all:

    numLabels = max(yapp);
    numTest = size(ytest, 1);

    %# train one-against-all models
    model = cell(numLabels, 1);
    for k = 1:numLabels
        model{k} = svmtrain(double(yapp==k), xapp, '-c 1000 -g 10 -b 1');
    end

    %# get probability estimates of test instances using each model
    prob_black = zeros(numTest, numLabels);
    for k = 1:numLabels
        [~, ~, p] = svmpredict(double(ytest==k), xtest, model{k}, '-b 1');
        prob_black(:, k) = p(:, model{k}.Label==1);    %# probability of class==k
    end

    %# predict the class with the highest probability
    [~, pred_black] = max(prob_black, [], 2);
    acc = sum(pred_black == ytest) ./ numel(ytest)     %# accuracy

The problem is that I need to change these parameters to increase performance. For example, when I randomly set c = 10000 and g = 100, I got a better accuracy rate: 70%.

Please, I need help: how can I set these parameters (c and g) so as to find the optimum accuracy rate? Thank you in advance.

Hyperparameter tuning is a nontrivial problem in machine learning. The simplest approach is what you've already implemented: define a grid of values, and compute the model on the grid until you find some optimal combination. A key assumption is that the grid itself is a good approximation of the surface: that it's fine enough not to miss anything important, but not so fine that you waste time computing values that are essentially the same as neighboring values. I'm not aware of any method to know, in general, ahead of time how fine a grid is necessary. As an illustration: imagine that the global optimum is at $(5,5)$ and the function is basically flat elsewhere. If your grid is $(0,0),(0,10),(10,10),(10,0)$, you'll miss the optimum completely. Likewise, if the grid is $(0,0),(-10,-10),(-10,0),(0,-10)$, you'll never be anywhere near the optimum. In both cases, you have no hope of finding the optimum itself.

Some rules of thumb exist for SVMs with RBF kernels, though: a grid of $\gamma \in \{2^{-15}, 2^{-14}, \ldots, 2^{5}\}$ and $C \in \{2^{-5}, 2^{-4}, \ldots, 2^{15}\}$ is one such recommendation.
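As a minimal sketch, here is your grid-search loop widened to those rule-of-thumb ranges, reusing yapp and xapp from your question (the step of 2 in the exponents is my assumption, just to keep the run count manageable; -q silences libsvm's per-iteration output in recent versions):

    bestcv = 0;
    for log2c = -5:2:15                   % C = 2^-5 ... 2^15
        for log2g = -15:2:5               % gamma = 2^-15 ... 2^5
            cmd = ['-q -v 5 -c ', num2str(2^log2c), ' -g ', num2str(2^log2g)];
            cv = svmtrain(yapp, xapp, cmd);
            if (cv >= bestcv)
                bestcv = cv; bestc = 2^log2c; bestg = 2^log2g;
            end
        end
    end
    fprintf('best c=%g, g=%g, CV rate=%g\n', bestc, bestg, bestcv);

At a step of 2, this is an 11 x 11 grid, i.e. 121 cross-validation runs; a common follow-up is a finer grid centered on the best pair found.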

If you found a better solution outside of the range of grid values that you tested, this suggests you should define a larger grid. But larger grids take more time to evaluate, so you'll either have to commit to waiting a while for your results, or move to a more efficient method of exploring the hyperparameter space.

Another alternative is random search: define a "budget" of the number of SVMs that you want to try out, and generate that many random tuples to test. This approach is mostly just useful for benchmarking purposes, since it's entirely unintelligent.
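A minimal sketch of random search under the same assumptions as above (uniform sampling of the exponents over the rule-of-thumb ranges; the budget of 50 is arbitrary):

    budget = 50;                          % number of SVMs we are willing to train
    bestcv = 0;
    for i = 1:budget
        log2c = -5 + 20*rand();           % uniform over [-5, 15]
        log2g = -15 + 20*rand();          % uniform over [-15, 5]
        cmd = ['-q -v 5 -c ', num2str(2^log2c), ' -g ', num2str(2^log2g)];
        cv = svmtrain(yapp, xapp, cmd);
        if (cv >= bestcv)
            bestcv = cv; bestc = 2^log2c; bestg = 2^log2g;
        end
    end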

Both grid search and random search have the advantage of being embarrassingly parallel: every grid point or random tuple can be evaluated independently of the others.
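For instance, a sketch of a parallel grid search (this assumes MATLAB's Parallel Computing Toolbox for parfor; each (C, gamma) pair is an independent cross-validation run):

    log2c_range = -5:2:15;
    log2g_range = -15:2:5;
    [CC, GG] = meshgrid(log2c_range, log2g_range);
    params = [CC(:), GG(:)];              % one row per (log2 C, log2 gamma) pair
    n = size(params, 1);
    cv = zeros(n, 1);
    parfor i = 1:n
        cmd = ['-q -v 5 -c ', num2str(2^params(i,1)), ' -g ', num2str(2^params(i,2))];
        cv(i) = svmtrain(yapp, xapp, cmd);
    end
    [bestcv, idx] = max(cv);
    bestc = 2^params(idx,1); bestg = 2^params(idx,2);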

Better options fall in the domain of global optimization. Marc Claesen et al. have devised the Optunity package, which uses particle swarm optimization. My research focuses on refinements of the Efficient Global Optimization (EGO) algorithm, which builds up a Gaussian process as an approximation of the hyperparameter response surface, and uses that to make educated predictions about which hyperparameter tuples are most likely to improve upon the current best estimate.

Imagine that you've evaluated the SVM at some hyperparameter tuple $(\gamma, C)$ and it has some out-of-sample performance metric $y$. An advantage of EGO-inspired methods is that they assume the values $y^*$ at tuples near $(\gamma, C)$ will be "close" to $y$, so we don't necessarily need to spend time exploring those nearby tuples, especially if $y - y_{\min}$ is very large (where $y_{\min}$ is the smallest $y$ value we've discovered). EGO will identify and evaluate the SVM at points where it estimates there is a high probability of improvement, so it will intelligently move through the hyperparameter space: in the ideal case, it will skip over regions of low performance in favor of focusing on regions of high performance.
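To make "probability of improvement" concrete: in the standard EGO formulation (Jones, Schonlau & Welch, 1998), if the Gaussian process posterior at a candidate tuple $x$ has mean $\mu(x)$ and standard deviation $\sigma(x)$, the expected improvement over the current best $y_{\min}$ (when minimizing an error metric) is

$$\mathrm{EI}(x) = \left(y_{\min} - \mu(x)\right)\Phi(z) + \sigma(x)\,\phi(z), \qquad z = \frac{y_{\min} - \mu(x)}{\sigma(x)},$$

where $\Phi$ and $\phi$ are the standard normal CDF and PDF. The next tuple to evaluate is the one maximizing $\mathrm{EI}$, which trades off exploiting regions predicted to be good ($\mu(x)$ small) against exploring regions that are still uncertain ($\sigma(x)$ large).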
