How to read a file and store the data into a list
If I have a file in this format:
1 2 3 4 5
6 7 8 9 10
What is the correct way to read the file and store each number into a list in Python?
x_table = []
for eachLine in filename_1:
    #Set up temp variable
    x_table.append([])
    tmpStr = ''
    #Loop through each character in the line
    for char in eachLine:
        #Check whether the char is a number
        if char.isdigit():
            tmpStr += char
        elif char == ' ' and tmpStr != '':
            x_table[eachLine].append(int(char))
I got this error:
TypeError: list indices must be integers, not str
Read each line and use split() to separate it into individual numbers:
mat = []
with open('file.txt') as f:
    for line in f:
        # each row is a list of strings such as ['1', '2', '3', '4', '5']
        mat.append(line.split())
After that, if you want, you can validate that all the lines have the same number of elements.
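The validation step mentioned above could look roughly like this. It is a minimal sketch: the helper name rows_have_equal_length is hypothetical, and the sample mat is hard-coded in place of data read from a file.

```python
def rows_have_equal_length(rows):
    """Return True if every row has the same number of elements."""
    lengths = {len(row) for row in rows}
    # An empty table or a single distinct length counts as consistent
    return len(lengths) <= 1

# Sample data shaped like the result of line.split() on each line
mat = [['1', '2', '3', '4', '5'], ['6', '7', '8', '9', '10']]
ragged = [['1', '2', '3'], ['4', '5']]

print(rows_have_equal_length(mat))
print(rows_have_equal_length(ragged))
```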
eachLine is a string (more specifically, a line in your file), so you can't use it as an index into your x_table list of lists.
You could just keep a running count:
x_table = []
idx = 0
for eachLine in filename_1:
    # ...
    x_table[idx].append(int(char))
    idx += 1
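For reference, here is one way the character-by-character loop from the question might be completed with a running count. This is only a sketch under assumptions: io.StringIO stands in for the open file, and it appends int(tmpStr), the accumulated digits, rather than int(char), which would be the separator.

```python
import io

# Stand-in for an open file object; real code would use open(path)
filename_1 = io.StringIO("1 2 3 4 5\n6 7 8 9 10\n")

x_table = []
idx = 0
for eachLine in filename_1:
    x_table.append([])
    tmpStr = ''
    for char in eachLine:
        if char.isdigit():
            # keep collecting the digits of the current number
            tmpStr += char
        elif tmpStr != '':
            # any non-digit (space or newline) ends the current number
            x_table[idx].append(int(tmpStr))
            tmpStr = ''
    if tmpStr != '':
        # flush a number that ends the line without a trailing separator
        x_table[idx].append(int(tmpStr))
    idx += 1

print(x_table)
```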
EDIT: or, if you want to go with the approach elias suggested (see below), you can use a list comprehension to prune the elements that are not numbers:
raw_mat = []
f = open('file.txt')
for line in f.readlines():
    raw_mat.append(line.split())
f.close()
mat = []
for row in raw_mat:
    mat.append([i for i in row if i.isdigit()])
If you need numeric processing, numpy.loadtxt will work: http://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html (unless you need to alter the array-of-arrays afterwards). The data is loaded into a numpy.ndarray, which is efficient for numeric and mathematical processing.
Another solution:
If you need an array-of-arrays and you don't want to change your core code, note that each element yielded by for ... in ... is not the index position, but the actual element.
To get the index position, do: for i, v in enumerate(filename_1):
And even then, if filename_1 is a string, that won't work. You should pass a file object there (it is iterable, line by line).
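To illustrate the point about enumerate and file objects, here is a small sketch. The temporary file and the names path and file_obj are assumptions made only so the example is self-contained.

```python
import os
import tempfile

# Create a small sample file so the sketch can run on its own
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1 2 3 4 5\n6 7 8 9 10\n")
    path = tmp.name

x_table = []
with open(path) as file_obj:
    # A file object yields one line per iteration; enumerate adds the index
    for i, line in enumerate(file_obj):
        x_table.append([])
        x_table[i].extend(int(s) for s in line.split())

os.remove(path)
print(x_table)
```

Iterating over the path string itself would yield single characters instead of lines, which is why the file has to be opened first.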
For each line (eachLine), you could append to x_table with the following code:
x_table.append([int(s) for s in eachLine.split()])
# eachLine.split() splits eachLine on runs of whitespace.
Remember to capture exceptions here.
Full code:
x_table = []
for eachLine in open(filename_1, "r"):
    x_table.append([int(k) for k in eachLine.split()])
Full code for the numpy version:
import numpy
# loadtxt accepts a file name or an open file object, not the file's contents
x_table = numpy.loadtxt(filename_1)
Remember to capture exceptions in both versions.
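As a rough sketch of what capturing exceptions might look like for the pure-Python version: the function name read_int_table and the missing file name are hypothetical. int() raises ValueError on non-numeric text, and open() raises OSError (for example FileNotFoundError) when the file cannot be read.

```python
def read_int_table(path):
    """Return a list of rows of ints, or raise ValueError naming the bad line."""
    rows = []
    with open(path) as handle:
        for line_number, line in enumerate(handle, start=1):
            try:
                rows.append([int(token) for token in line.split()])
            except ValueError:
                raise ValueError(
                    "line %d is not all integers: %r" % (line_number, line)
                )
    return rows

try:
    # Hypothetical file name that is assumed not to exist
    table = read_int_table("no_such_file_for_demo.txt")
except (OSError, ValueError) as error:
    table = []
    print("could not load table:", error)
```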
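x_table = []
for line in filename_1:  # filename_1 must be an open file object
    # split() with no argument also handles repeated spaces and the newline
    numbers = [int(s) for s in line.split()]
    x_table.append(numbers)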
Taken care of:
You can use a regular expression. This example will parse decimals and negatives, and exclude text.
"""Contents of test.txt:
1 2 3.14 4 5 0 text
6 7 -8 9 10 -99.99
some other text
1.0 0.5
"""
import re
with open("test.txt", 'r') as f:
    # optional minus sign, digits, optional fractional part
    values = re.findall(r"-?\d+(?:\.\d+)?", f.read())
print(values)
Note that this returns a single flat list of strings, not one list per line. You can then convert the values to int or float.
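One possible way to do that conversion is sketched below. The helper to_number is hypothetical, the sample text is inlined rather than read from test.txt, and the pattern used here is one reasonable choice for signed integers and decimals.

```python
import re

# Inlined sample text matching the example file contents
text = "1 2 3.14 4 5 0 text\n6 7 -8 9 10 -99.99\nsome other text\n1.0 0.5\n"

tokens = re.findall(r"-?\d+(?:\.\d+)?", text)

def to_number(token):
    """Convert a matched token to float if it has a decimal point, else int."""
    return float(token) if "." in token else int(token)

numbers = [to_number(token) for token in tokens]
print(numbers)
```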
Simple:
matrix = []
with open("myfile.dat", "r") as f:
    for line in f:
        # split() already returns a new list, so no extra copy is needed
        matrix.append(line.split())