IndexError when fitting SSVM model in PyStruct

I'm using the pystruct Python module for a structured learning problem, classifying posts in discussion threads, and I've run into an issue when trying to train the OneSlackSSVM for use with the LinearChainCRF. I'm following the OCR example from the docs, but can't seem to call the .fit() method on the SSVM. Here is the error I'm getting:

Traceback (most recent call last):

  File "<ipython-input-47-da804d135818>", line 1, in <module>
    ssvm.fit(X_train, y_train)

  File "/Users/kylefth/anaconda/lib/python2.7/site-packages/pystruct/learners/one_slack_ssvm.py", line 429, in fit
    joint_feature_gt = self.model.batch_joint_feature(X, Y)

  File "/Users/kylefth/anaconda/lib/python2.7/site-packages/pystruct/models/base.py", line 40, in batch_joint_feature
    joint_feature_ += self.joint_feature(x, y)

  File "/Users/kylefth/anaconda/lib/python2.7/site-packages/pystruct/models/graph_crf.py", line 197, in joint_feature
    unary_marginals[gx, y] = 1

IndexError: index 7 is out of bounds for axis 1 with size 7

Below is the code I've written. I've tried to structure the data as in the docs example, where the overall data structure is a dict with keys for data, labels, and folds.

from pystruct.models import LinearChainCRF
from pystruct.learners import OneSlackSSVM

# Printing out keys of overall data structure
print threads.keys()
# prints: ['folds', 'labels', 'data']

# Creating instances of models
crf = LinearChainCRF()
ssvm = OneSlackSSVM(model=crf)

# Splitting up data into training and test sets as in example
X, y, folds = threads['data'], threads['labels'], threads['folds']
X_train, X_test = X[folds == 1], X[folds != 1]
y_train, y_test = y[folds == 1], y[folds != 1]

# Print out dimensions of first element in data and labels
print X[0].shape, y[0].shape
# prints: (8, 211), (8,)

# Fitting the ssvm model
ssvm.fit(X_train, y_train)
# raises the IndexError shown above

Directly after trying to fit the model, I get the above error. All instances of X_train, X_test, y_train, and y_test have 211 columns, and all the label dimensions appear to match up with their corresponding training and testing data. Any help would be greatly appreciated.

I think everything you are doing is right; this is https://github.com/pystruct/pystruct/issues/114. Your labels y need to run from 0 to n_labels - 1. I think yours start at 1.
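
If that is the cause, one way to check and fix it is sketched here. This is only an illustration under assumptions: it supposes y is an object array holding one integer label array per thread, as in the question, and the helper name to_zero_based and the demo values are made up for the example, not part of pystruct.

```python
import numpy as np


def to_zero_based(y):
    """Remap integer labels to the contiguous range 0 .. n_labels - 1.

    `y` is assumed to be a sequence of 1-D integer arrays, one per thread.
    Returns the remapped sequences and the sorted original label values.
    (Hypothetical helper, not part of pystruct.)
    """
    # Sorted array of every distinct label value that occurs anywhere in y.
    classes = np.unique(np.concatenate(list(y)))
    y_fixed = np.empty(len(y), dtype=object)
    for i, seq in enumerate(y):
        # The position of each label in the sorted `classes` array
        # becomes its new 0-based label.
        y_fixed[i] = np.searchsorted(classes, seq)
    return y_fixed, classes


# Made-up demo data: labels run from 1 to 7, which would trigger
# "index 7 is out of bounds for axis 1 with size 7".
y_demo = np.empty(2, dtype=object)
y_demo[0] = np.array([1, 2, 7, 3, 3, 4, 5, 6])
y_demo[1] = np.array([2, 1, 7])

y_fixed, classes = to_zero_based(y_demo)

# Largest label after remapping; it should equal n_labels - 1.
max_label = max(seq.max() for seq in y_fixed)
```

Do the remapping on the full y before splitting into y_train and y_test, so that both halves share the same label mapping; classes can be used to translate predictions back to the original label values via classes[prediction].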
