None of [Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype='int64', name='index')] are in the [index]

I got this error while trying to run my Sequential Keras model.

Here is my code:

import pandas as pd
from sklearn.model_selection import train_test_split

# `data` is the original DataFrame, loaded earlier (not shown)
df = pd.DataFrame()
df['category'] = data['category'].rank(method='dense', ascending=False).astype(int)
df['title'] = data['title'].rank(method='dense', ascending=False).astype(int)
df['description'] = data['description']

x = df.description
y = df.category

SEED = 2000
x_train, x_validation_and_test, y_train, y_validation_and_test = train_test_split(x, y, test_size=.02, random_state=SEED)
x_validation, x_test, y_validation, y_test = train_test_split(x_validation_and_test, y_validation_and_test, test_size=.5, random_state=SEED)

And my model:

model.fit_generator(generator=batch_generator(x_train_tfidf, y_train, 32),
                        epochs=5, validation_data=(x_validation_tfidf, y_validation),
                        steps_per_epoch=x_train_tfidf.shape[0]/32)

I got this error at:

steps_per_epoch=x_train_tfidf.shape[0]/32

df.info

<class 'pandas.core.frame.DataFrame'>
Int64Index: 994 entries, 0 to 1092
Data columns (total 3 columns):
category       994 non-null int32
title          994 non-null int32
description    994 non-null object
dtypes: int32(2), object(1)
memory usage: 23.3+ KB

df.index

Int64Index([ 0, 1, 2, 3, 4, 6, 7, 8, 10, 11, ... 1083, 1084, 1085, 1086, 1087, 1088, 1089, 1090, 1091, 1092], dtype='int64', name='index', length=994)

EDIT: I can't tell whether the error comes from slicing the data incorrectly or from the indexing being wrong.

Added more code:

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

tvec1 = TfidfVectorizer(max_features=100000, ngram_range=(1, 3))
tvec1.fit(x_train)

x_train_tfidf = tvec1.transform(x_train)
x_validation_tfidf = tvec1.transform(x_validation).toarray()

clf = LogisticRegression()
clf.fit(x_train_tfidf, y_train)

clf.score(x_validation_tfidf, y_validation)
clf.score(x_train_tfidf, y_train)

seed = 7
np.random.seed(seed)

Here is my batch_generator:

def batch_generator(X_data, y_data, batch_size):
    samples_per_epoch = X_data.shape[0]
    number_of_batches = samples_per_epoch/batch_size
    counter=0
    index = np.arange(np.shape(y_data)[0])
    while 1:
        index_batch = index[batch_size*counter:batch_size*(counter+1)]
        X_batch = X_data[index_batch,:].toarray()
        y_batch = y_data[y_data.index[index_batch]]
        counter += 1
        yield X_batch,y_batch
        if (counter > number_of_batches):
            counter=0

I think the problem may be that your DataFrame's index has gaps. Its length is 994, but the labels run up to 1092, so some labels are missing (rows were presumably dropped earlier). This is probably what makes batch_generator fail when it indexes the data: a lookup by integer label raises a KeyError when none of the requested labels exist.
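To make that failure mode concrete, here is a minimal sketch on toy data (the values and labels are made up for illustration, not taken from the question). It assumes pandas' usual behaviour for a Series with an integer index, where a list of integers is looked up by label rather than by position:

```python
import pandas as pd

# Hypothetical toy labels: four rows survive, but their labels keep the gaps
# left by dropped rows (similar in spirit to 994 rows labelled up to 1092).
y = pd.Series([3, 1, 2, 1], index=[4, 6, 10, 11], name="category")

# A list of integers is resolved against the labels, not the positions,
# so asking for "rows 0 and 1" finds no matching label and raises KeyError.
try:
    y[[0, 1]]
    error_message = None
except KeyError as exc:
    error_message = str(exc)

print(error_message)
```

The printed message should have the same "None of [...] are in the [index]" shape as the error in the question, although the exact wording depends on the pandas version.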

So, before all of the code you provided, try resetting the index. Note that reset_index returns a new DataFrame, so assign the result back:

data = data.reset_index(drop=True)
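A small sketch of what the reset does, again on made-up toy data. The second half shows an alternative that is my own assumption rather than part of the suggestion above: selecting batches by position with .iloc, which ignores the labels entirely. The names y_reset, positions and batches are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy labels with gaps, standing in for the real data.
y = pd.Series([3, 1, 2, 1, 2], index=[4, 6, 10, 11, 15], name="category")

# reset_index returns a new object; `y` itself keeps its old labels
# unless the result is assigned back.
y_reset = y.reset_index(drop=True)
print(list(y_reset.index))

# Alternative (an assumption, not the suggestion above): slice by position
# with .iloc, so gaps or shuffled labels no longer matter.
positions = np.arange(len(y))
batch_size = 2
batches = [
    y.iloc[positions[start:start + batch_size]].to_numpy()
    for start in range(0, len(y), batch_size)
]
print([b.tolist() for b in batches])
```

Keep in mind that splits created afterwards, for example by train_test_split, carry over whatever labels their input had, in shuffled order, so positional access (or converting the labels to a NumPy array) may be the more robust choice inside a generator.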


