I am currently trying to run LibSVM, located here: https://www.csie.ntu.edu.tw/~cjlin/libsvm. I only have access to MATLAB 2011b.

When I run the example data file (heartscale) included with the LibSVM package with different C and gamma values, I get the same accuracy results. This happens for other data sets as well. I built a for loop that iterates over different C and gamma values, and the accuracy percentages do not change. I am doing this to find the best C and gamma for the data set (cross-validation), as recommended in "A Practical Guide to Support Vector Classification", the documentation located on the above website.

When I look at the accuracy_mat that I build below, the values are all the same. Even the outputs from svmpredict are the same. I have read through the documentation multiple times and looked at the FAQ on the website, and I would appreciate input on this from SVM practitioners.
[heart_scale_label, heart_scale_inst] = libsvmread( 'heartscale' );
C = { '2^-5', '2^-3', '2^-1' };
g = { '2^-15', '2^-3', '2^-1' };
accuracy_mat = zeros( length( g ), length( C ) );
data_num = length( heart_scale_inst(:,1) );
t = zeros( data_num, 1 );
for i = 1:length( g )
    for j = 1:length( C )
        c_train_inputs = ['-c ', C{j}];
        g_train_inputs = ['-g ', g{i}];
        c_and_g_inputs = [c_train_inputs, ' ', g_train_inputs];  % e.g. '-c 2^-5 -g 2^-15'
        model = svmtrain( heart_scale_label, ...
                          heart_scale_inst, ...
                          [c_and_g_inputs, ' -b 1'] ...
                        );
        [predict_label, ...
         accuracy, ...
         prob_estimates] = svmpredict( heart_scale_label, ...
                                       heart_scale_inst, ...
                                       model, ...
                                       '-b 1' ...
                                     );
        accuracy_mat(i,j) = max( accuracy );
    end
end
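For reference, "A Practical Guide to Support Vector Classification" suggests a coarse grid search over exponentially growing values of C and gamma; the LibSVM MATLAB interface can run this directly via svmtrain()'s '-v' flag, which performs n-fold cross-validation and returns the cross-validation accuracy as a scalar instead of a model struct. A minimal sketch, assuming heart_scale_label / heart_scale_inst from above and the guide's suggested exponent ranges:

log2C = -5:2:15;    %% coarse exponent grid for C, per the guide
log2g = -15:2:3;    %% coarse exponent grid for gamma
cv_acc = zeros( length( log2g ), length( log2C ) );
for i = 1:length( log2g )
    for j = 1:length( log2C )
        opts = sprintf( '-c %g -g %g -v 5', 2^log2C(j), 2^log2g(i) );
        cv_acc(i,j) = svmtrain( heart_scale_label, heart_scale_inst, opts );
    end
end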
The [C, gamma] hyper-parameters have locked you in a corner

Support Vector methods are very powerful engines. Still, one may destroy their cool predictive powers, either by poor data sanitisation (regularisation, NaN removal, etc.) or by ordering them to use a corner-case value of the hyper-parameter(s) C or gamma.
Before an SVM/SVC engine is put to hard work, and the more so if a brute-force hyper-parameter space search is planned (GridSearchCV et al.), where CPUs/GPUs may easily spend more than hundreds of hours, a simple rule of thumb ought to be used to pre-validate the search space. Andreas Mueller has put that nicely: first scan an SVM rbf model (the pre-scan idea is valid in general, not only for an rbf kernel) over a "rule-of-thumb" range of values:
{'C': np.logspace(-3, 2, 6), 'gamma': np.logspace(-3, 2, 6)}
That is, unless you are pretty sure (or forbidden by some untold restriction) that you must use just the ultra-low learning-parametrisation values preset in your [C, gamma] search space, you might relax the range so as to allow the SVM learner to progress to other results, farther from the observed corner it has been locked in so far:
C = [ 0.001, 0.01, 0.1, 1.0, 10.0, 100.0 ];
g = [ 0.001, 0.01, 0.1, 1.0, 10.0, 100.0 ];
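A minimal sketch of sweeping these relaxed ranges with the MATLAB interface, assuming heart_scale_label / heart_scale_inst from the question; since the values are now numeric, num2str() builds the option string (note the space separating the -c and -g options):

for i = 1:length( g )
    for j = 1:length( C )
        opts  = ['-c ', num2str( C(j) ), ' -g ', num2str( g(i) ), ' -b 1'];
        model = svmtrain( heart_scale_label, heart_scale_inst, opts );
        [predict_label, accuracy, prob_est] = svmpredict( heart_scale_label, ...
                                                          heart_scale_inst, ...
                                                          model, '-b 1' );
        accuracy_mat(i,j) = accuracy(1);   %% the classification accuracy [%]
    end
end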
If you do not see any evolution of the SVM learner's results over this sandbox pre-test landscape of its hyper-parameters, then the root cause would be hidden in the dataset (which does not seem to be the case here, as you posted the observation that the same trouble appeared independently of any one particular dataset under review).
Nota bene: one might also rather record descriptive statistics about the trained model's predictions. The accuracy output of svmpredict() is a 3x1 vector [ classification accuracy (%); mean squared error; squared correlation coefficient ], so store its components separately:

model_predictions_accuracy(i,j) = accuracy(1); %% classification accuracy [%]
model_predictions_mse(     i,j) = accuracy(2); %% mean squared error
model_predictions_scc(     i,j) = accuracy(3); %% squared correlation coeff.
%% max( accuracy ) masks which of the three
%% metrics is being read & may thus "look"
%% the same for the whole range
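As a quick sanity check over the filled grid (a minimal sketch, assuming accuracy_mat holds the accuracy(1) values), one can inspect the spread of the results; a (near-)zero spread means the learner is not reacting to [C, gamma] at all:

accuracy_spread = max( accuracy_mat(:) ) - min( accuracy_mat(:) );
fprintf( 'accuracy spread over the [C, gamma] grid: %g %%\n', accuracy_spread );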