
Convergence Warning Linear SVC — increase the number of iterations?

Here's the error I've been getting:

    ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.
      warnings.warn("Liblinear failed to converge, increase "

I've been working with the romance and news categories in the brown dataset from nltk.corpus, and so far haven't had any issues up until this point. Here's the code that I'm putting in:

    import nltk
    from nltk.corpus import brown
    from nltk import pos_tag_sents
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    %matplotlib inline
    import sklearn

    for cat in brown.categories():
        print(cat)

    news_sent = brown.sents(categories=["news"])
    romance_sent = brown.sents(categories=["romance"])
    ndf = pd.DataFrame({'label': 'news', 'sentence': news_sent})
    rdf = pd.DataFrame({'label': 'romance', 'sentence': romance_sent})
    df = pd.concat([ndf, rdf])
    df.head()
    df['label'].value_counts()

    fig, ax = plt.subplots()
    _ = df['label'].value_counts().plot.bar(ax=ax, rot=0)
    fig.savefig("categories_counts.png", bbox_inches='tight', pad_inches=0)

    pos_all = pos_tag_sents(df['sentence'])

    def countPOS(pos_tag_sent, POS):
        pos_count = 0
        all_pos_counts = []
        for sentence in pos_tag_sent:
            for word in sentence:
                tag = word[1]
                if tag[:2] == POS:
                    pos_count = pos_count + 1
            all_pos_counts.append(pos_count)
            pos_count = 0
        return all_pos_counts

    df['NN'] = countPOS(pos_all, 'NN')
    df['JJ'] = countPOS(pos_all, 'JJ')
    df.groupby('label').sum()
    df.to_csv("df_news_romance.csv", index=False)

    df = pd.read_csv("df_news_romance.csv")
    fv = df[["NN", "JJ"]]
    df['label'].value_counts()

    from sklearn.model_selection import train_test_split
    X_train, X_test, y_train, y_test = train_test_split(
        fv, df['label'], stratify=df['label'], test_size=0.25, random_state=42)
    print(X_train.shape)
    print(X_test.shape)

    from sklearn.svm import LinearSVC
    classifier = LinearSVC()
    classifier.fit(X_train, y_train)

At this point, I get the warning above. To add more information to the original post: I've already tried increasing max_iter and setting LinearSVC(dual=False), but neither made a difference. Any help would be appreciated!

You may need to set LinearSVC(dual=False) in case the number of samples in your data is larger than the number of features. LinearSVC defaults to dual=True because it solves the dual optimization problem, which is only preferable when n_samples < n_features. You could also try increasing the maximum number of iterations (e.g. max_iter=10000).
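Another common cause of this warning is unscaled features: liblinear converges much more reliably when the inputs are standardized. As a minimal sketch (using synthetic two-column count data standing in for the NN/JJ features from the question, since the Brown-corpus preprocessing is not reproduced here), combining StandardScaler with the settings above looks like this:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic integer count features in place of the question's NN/JJ columns
rng = np.random.default_rng(42)
X = rng.integers(0, 20, size=(200, 2)).astype(float)
y = np.where(X[:, 0] > X[:, 1], "news", "romance")

# Standardizing the counts keeps liblinear well-conditioned;
# dual=False is preferred here since n_samples > n_features,
# and max_iter is raised as a safety margin.
clf = make_pipeline(StandardScaler(), LinearSVC(dual=False, max_iter=10000))
clf.fit(X, y)
print(clf.score(X, y))
```

If the warning persists even after scaling, it usually means the two classes are barely separable with only these two features, and adding more informative features will help more than further solver tuning.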

