Python Index Error - Out of Bounds for axis 0

I have a dataset like the following in a txt file. (The first column is userid, the second column is locationid.) My real dataset is large, but I created a dummy dataset to better explain my problem.

I'm trying to create a matrix as in the code below. Rows are user IDs and columns are location IDs. Since this dataset lists the location IDs visited by each user, the code assigns the value 1 to the matrix entries for the locations a user visited.

I am getting an IndexError: index 801 is out of bounds for axis 0 with size 50

I tried different values for user_num and poi_num, but it still doesn't work.

datausers.txt

801 32332
801 14470
801 33847
501 10259
501 34041
501 10201
301 15810
301 34827
301 19264
401 34834
401 35407
401 36115

Code

import numpy as np
from collections import defaultdict
from itertools import islice
import pandas as pd 

train_file = "datausers.txt"
user_num = 20
poi_num = 20

training_matrix = np.zeros((user_num, poi_num))
train_data = list(islice(open(train_file, 'r'), 10))

for eachline in train_data:
    uid, lid= eachline.strip().split()
    uid, lid = int(uid), int(lid)
    training_matrix[uid, lid] = 1.0

Expected Output

A 4x12 matrix, because we have 4 unique users and 12 unique locations.

[1 0 1 0 1 0 0 0 0 0 0 0
 0 1 0 1 0 1 0 0 0 0 0 0
...
]

For example, the first row is 1 0 1 0 1 0 0 0 0 0 0 0.

User 801 visited 3 locations, and those entries are 1. (The positions of the 1s can vary; the ones shown are only an example.)

As you have tagged the question with pandas, here is one way of approaching the problem with the str.get_dummies method of a pandas Series:

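import pandas as pd
df = pd.read_csv('datausers.txt', sep=r'\s+', names=['userid', 'locationid'], index_col=0)
# sum(level=0) was removed in pandas 2.0; groupby(level=0, sort=False) keeps users in order of first appearance
out = df['locationid'].astype(str).str.get_dummies().groupby(level=0, sort=False).sum()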

Result

For the sample data:

>>> out
        10201  10259  14470  15810  19264  32332  33847  34041  34827  34834  35407  36115
userid                                                                                    
801         0      0      1      0      0      1      1      0      0      0      0      0
501         1      1      0      0      0      0      0      1      0      0      0      0
301         0      0      0      1      1      0      0      0      1      0      0      0
401         0      0      0      0      0      0      0      0      0      1      1      1

If you need a numpy array instead:

>>> out.to_numpy()

array([[0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0],
       [1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
       [0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1]])
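
The IndexError in the question comes from using the raw IDs (801, 32332, ...) directly as row and column indices of a matrix that has only user_num rows and poi_num columns. If you would rather stay in plain numpy, one possible sketch is to map each ID to a compact 0-based index first with np.unique(..., return_inverse=True). The names pairs, row_idx and col_idx are illustrative and not from the question, and the sample data is inlined here rather than read from datausers.txt:

```python
import numpy as np

# Sample (userid, locationid) pairs copied from datausers.txt above.
# Assumption: for the real file, np.loadtxt("datausers.txt", dtype=int)
# would produce an array of the same shape.
pairs = np.array([
    [801, 32332], [801, 14470], [801, 33847],
    [501, 10259], [501, 34041], [501, 10201],
    [301, 15810], [301, 34827], [301, 19264],
    [401, 34834], [401, 35407], [401, 36115],
])

# np.unique returns the sorted unique IDs and, with return_inverse=True,
# the position of every original ID within that sorted array.
users, row_idx = np.unique(pairs[:, 0], return_inverse=True)
pois, col_idx = np.unique(pairs[:, 1], return_inverse=True)

# The matrix is sized by the number of distinct IDs, not by the ID values.
training_matrix = np.zeros((len(users), len(pois)))
training_matrix[row_idx, col_idx] = 1.0

print(training_matrix.shape)  # (4, 12)
```

Note that np.unique sorts the IDs, so rows here follow users (301, 401, 501, 801) rather than the order of first appearance used in the pandas result above. The users and pois arrays let you translate a row or column index back to the original ID.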
