
KeyError: "None of [Int64Index dtype='int64', length=9313)] are in the [columns]"

I have a DataFrame with 323 columns and 10348 rows. I want to split it with stratified k-fold using the following code:

import pandas as pd
from sklearn.model_selection import StratifiedKFold

df = pd.read_csv("path")
x = df.loc[:, ~df.columns.isin(['flag'])]
y = df['flag']
skf = StratifiedKFold(n_splits=5, random_state=None, shuffle=False)
for train_index, test_index in skf.split(x, y):
    print("TRAIN:", train_index, "TEST:", test_index)
    x_train, x_test = x[train_index], x[test_index]  # this line raises the KeyError
    y_train, y_test = y[train_index], y[test_index]

But I get the following error:

KeyError: "None of [Int64Index([    0,     1,     2,     3,     4,     5,     6,     7,     8,\n               10,\n            ...\n            10338, 10339, 10340, 10341, 10342, 10343, 10344, 10345, 10346,\n            10347],\n           dtype='int64', length=9313)] are in the [columns]"

Can anyone tell me why I get this error and how to fix it?

It seems you have a DataFrame slicing issue rather than something wrong with StratifiedKFold itself. Indexing a DataFrame as x[train_index] looks the integers up as column labels, which is why the KeyError says they are not in the [columns]. I built a sample df for this and solved it by using iloc to select rows by an array of positions:

from sklearn import model_selection

# Names of the columns to exclude from the features (the target columns in my sample df)
flag = ["raw_sentence", "score"]
x = df.loc[:, ~df.columns.isin(flag)].copy()
y = df[flag].copy()
skf = model_selection.StratifiedKFold(n_splits=2, random_state=None, shuffle=False)
for train_index, test_index in skf.split(x, y):
    print("TRAIN:", train_index, "TEST:", test_index)
    x_train, x_test = x.iloc[list(train_index)], x.iloc[list(test_index)]

train_index and test_index are NumPy ndarrays, so I convert them to lists here, although iloc also accepts integer arrays directly.
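For a version that runs end to end, here is a minimal sketch on made-up data. The column names a, b, c and the 20-row frame are my own assumptions standing in for the 323-column CSV from the question; only the 'flag' column name comes from the original post.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for the question's data (assumed, not the real CSV):
# three feature columns plus a binary 'flag' target.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(20, 3)), columns=["a", "b", "c"])
df["flag"] = [0, 1] * 10

x = df.drop(columns="flag")
y = df["flag"]

skf = StratifiedKFold(n_splits=5)
fold_sizes = []
for train_index, test_index in skf.split(x, y):
    # x[train_index] would treat the integers as column labels and raise KeyError.
    # iloc selects rows by position and accepts the ndarrays as they are.
    x_train, x_test = x.iloc[train_index], x.iloc[test_index]
    y_train, y_test = y.iloc[train_index], y.iloc[test_index]
    fold_sizes.append((len(x_train), len(x_test)))

print(fold_sizes)
```

Using iloc for y as well keeps the split correct even if the DataFrame does not have the default 0..n-1 index, for example after filtering rows.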

You may refer to: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html

You can also use df.take(indices_list, axis=0):

x_train, x_test = x.take(list(train_index),axis=0), x.take(list(test_index),axis=0)

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.take.html
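As a quick check that take is positional in the same way as iloc, here is a small sketch. The frame, its string index and the positions array are hypothetical, chosen so that positions and labels cannot be confused.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a non-integer index, so labels differ from positions.
x = pd.DataFrame(
    {"a": [10, 20, 30, 40], "b": [1, 2, 3, 4]},
    index=["w", "x", "y", "z"],
)
# Stand-in for the positional indices a splitter would return.
positions = np.array([0, 2])

by_take = x.take(positions, axis=0)
by_iloc = x.iloc[positions]

# Both select the rows at positions 0 and 2, whatever their labels are.
print(by_take.equals(by_iloc))
```

Either call should work inside the fold loop; which one to use is mostly a matter of taste.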

Try converting the pandas DataFrame to a NumPy array, as follows:

pd.DataFrame({"A": [1, 2], "B": [3, 4]}).to_numpy()

array([[1, 3],
       [2, 4]])
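Applied to the fold loop, that idea might look like the sketch below. The small frame is invented for illustration, with 'flag' as the target as in the question. On NumPy arrays, x[train_index] selects rows by position, so the original indexing style works unchanged.

```python
import pandas as pd
from sklearn.model_selection import StratifiedKFold

# Invented example data; 'flag' is the target column as in the question.
df = pd.DataFrame({"a": range(8), "b": range(8, 16), "flag": [0, 1] * 4})

# Convert features and target to NumPy arrays before splitting.
x = df.drop(columns="flag").to_numpy()
y = df["flag"].to_numpy()

skf = StratifiedKFold(n_splits=2)
for train_index, test_index in skf.split(x, y):
    # Integer-array indexing on an ndarray picks rows by position.
    x_train, x_test = x[train_index], x[test_index]
    y_train, y_test = y[train_index], y[test_index]
    print(x_train.shape, x_test.shape)
```

The trade-off is that column names and the index are lost after to_numpy(), so this suits cases where the model only needs the raw values.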


