In Python 2.7.11, why can't I remove the file open code?
The .txt file that holds the data looks like this (source: "datingTestSet2.txt" in Ch. 2 here):
40920 8.326976 0.953952 largeDoses
14488 7.153469 1.673904 smallDoses
26052 1.441871 0.805124 didntLike
75136 13.147394 0.428964 didntLike
38344 1.669788 0.134296 didntLike
...
Code:
from numpy import *
import operator
from os import listdir

def file2matrix(filename):
    fr = open(filename)
    # arr = fr.readlines() # Code1!!!!!!!!!!!!!!!!!!!
    numberOfLines = len(fr.readlines()) #get the number of lines in the file
    returnMat = zeros((numberOfLines,3)) #prepare matrix to return
    classLabelVector = [] #prepare labels return
    fr = open(filename) # Code2!!!!!!!!!!!!!!!!!!!!!
    index = 0
    for line in fr.readlines():
        line = line.strip()
        listFromLine = line.split('\t')
        returnMat[index,:] = listFromLine[0:3]
        classLabelVector.append(int(listFromLine[-1]))
        index += 1
    return returnMat,classLabelVector

datingDataMat, datingLabels = file2matrix('datingTestSet2.txt')
The result of the function is:
datingDataMat datingLabels
40920 8.326976 0.953952 3
14488 7.153469 1.673904 2
26052 1.441871 0.805124 1
75136 13.147394 0.428964 1
38344 1.669788 0.134296 1
72993 10.141740 1.032955 1
35948 6.830792 1.213192 3
42666 13.276369 0.543880 3
67497 8.631577 0.749278 1
35483 12.273169 1.508053 3
50242 3.723498 0.831917 1
... ... ... ...
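My questions are:
When I remove only Code2 (the fr = open(filename) just above index = 0), the function returns an all-zero matrix and an all-zero vector. Why can't I remove Code2? Doesn't the first line (fr = open(filename)) work?
When I add only Code1 (arr = fr.readlines()), I get an error. Why?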
returnMat[index,:] = listFromLine[0:3] IndexError: index 0 is out of bounds for axis 0 with size 0
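1) You can't remove the Code2 line because of this line:
numberOfLines = len(fr.readlines()) #get the number of lines in the file
On that line you read all the way to the end of the file. Opening the file again puts you back at the beginning.
2) Same cause as above: calling readlines() reads every line and leaves the file cursor at the end of the file. With Code1 in place, the second readlines() call on the same file object has nothing left to read, so numberOfLines is 0 and returnMat has shape (0, 3). The loop then reads from the reopened file and the first assignment to returnMat[0,:] raises the IndexError.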
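A minimal sketch of the behavior described above, using only the standard library. The three-line sample file is hypothetical and stands in for datingTestSet2.txt.

```python
import os
import tempfile

# Hypothetical three-line sample written to a temporary file for the demo.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("a\nb\nc\n")
    path = tmp.name

fr = open(path)
first = fr.readlines()   # reads every line; the cursor is now at the end
second = fr.readlines()  # nothing left to read, so this is an empty list
fr.close()
os.remove(path)

print("first: %d, second: %d" % (len(first), len(second)))
# prints "first: 3, second: 0"
```

The second call does not fail; it just returns an empty list, which is why the problem shows up later as zeros or an IndexError rather than at the read itself.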
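You are at the end of the file, so your second attempt to read its contents yields nothing. You need to go back to the beginning of the file. Use:
fr.seek(0)
in place of your:
fr = open(filename) # Code2!!!!!!!!!!!!!!!!!!!!!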
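As a rough illustration of the seek(0) suggestion, the sketch below counts the lines, rewinds the same file object, and reads the rows again. The two sample rows are taken from the result table in the question; the temporary file is an assumption made for the demo.

```python
import os
import tempfile

# Two tab-separated rows in the same layout as the question's data file.
sample = "40920\t8.326976\t0.953952\t3\n14488\t7.153469\t1.673904\t2\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

fr = open(path)
count = len(fr.readlines())  # first pass: cursor ends up at end of file
fr.seek(0)                   # rewind to the beginning instead of reopening
rows = [line.strip().split("\t") for line in fr.readlines()]
fr.close()
os.remove(path)

print("count: %d, rows read after seek: %d" % (count, len(rows)))
# prints "count: 2, rows read after seek: 2"
```

Rewinding also avoids leaving the first file object open, which is what happens when fr is rebound to a second open() call.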
You only need to call readlines once.
from numpy import zeros

def file2matrix(filename):
    fr = open(filename)
    lines = fr.readlines()
    fr.close()
    numberOfLines = len(lines) #get the number of lines in the file
    returnMat = zeros((numberOfLines,3)) #prepare matrix to return
    classLabelVector = [] #prepare labels return
    index = 0
    for line in lines:
        line = line.strip()
        listFromLine = line.split('\t')
        returnMat[index,:] = listFromLine[0:3]
        # careful here: returnMat is initialized as floats,
        # while listFromLine is a list of strings
        classLabelVector.append(int(listFromLine[-1]))
        index += 1
    return returnMat,classLabelVector
I can offer a few other suggestions:
import numpy as np

def file2matrix(filename):
    # the with block closes the file automatically
    with open(filename) as f:
        lines = f.readlines()
    returnList = []
    classLabelList = []
    for line in lines:
        listFromLine = line.strip().split('\t')
        returnList.append(listFromLine[0:3])
        classLabelList.append(int(listFromLine[-1]))
    # convert the collected string fields to floats in one step
    returnMat = np.array(returnList, dtype=float)
    return returnMat, classLabelList
Or even:
import numpy as np

def file2matrix(filename):
    with open(filename) as f:
        lines = f.readlines()
    # split every line into its tab-separated fields
    ll = [line.strip().split('\t') for line in lines]
    returnMat = np.array([l[0:3] for l in ll], dtype=float)
    classLabelList = [int(l[-1]) for l in ll]
    # classLabelVec = np.array([l[-1] for l in ll], dtype=int)
    return returnMat, classLabelList
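For a quick check of the single-read approach, here is a hedged, self-contained sketch that runs a variant of the last function on a small temporary file. Skipping blank lines and iterating over the file object directly are assumptions added here and are not part of the answer above. The two sample rows come from the result table in the question.

```python
import os
import tempfile

import numpy as np


def file2matrix(filename):
    # Read the file once; blank lines are skipped (an added assumption).
    with open(filename) as f:
        rows = [line.strip().split('\t') for line in f if line.strip()]
    features = np.array([r[0:3] for r in rows], dtype=float)
    labels = [int(r[-1]) for r in rows]
    return features, labels


# Two tab-separated rows in the same layout as the question's data file.
sample = "40920\t8.326976\t0.953952\t3\n14488\t7.153469\t1.673904\t2\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

mat, labels = file2matrix(path)
os.remove(path)

print(mat.shape)  # (2, 3)
print(labels)     # [3, 2]
```

Because the file is read only once, there is no cursor position to reset and no second open() to forget.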