Importing data from csv where each list is split into many rows
Hi, I'm a bit stuck with this problem. I've got a csv file that looks something like this:
[12 34 45 22 3 5
34 33 2 67 5 55
2 90 88 12 34]
[245 4 13]
[33 90 50 22 90 1
23 44 876 10 7] ...
And so on. In other words, the csv file is split into lists of numbers separated by either a single space or a double space, and if a list exceeds a certain number of values (14 in my case), it continues on the next line until the list ends. The lists are not separated by commas, but each list begins and ends with square brackets.
I want to import the csv file into a list of lists, which would look like this:
[[12, 34, 45, 22, 3, 5, 34, 33, 2, 67, 5, 55, 2, 90, 88, 12, 34],
[245, 4, 13],
[33, 90, 50, 22, 90, 1, 23, 44, 876, 10, 7],
[...]]
How could I achieve this? I've tried np.loadtxt and pandas, but both treat every line as its own observation.
Thanks in advance!
Edit: The numbers are actually separated by either a single space or a double space.
The following should work:
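with open('myfile.csv') as f:
    t = f.read()
# Treat newlines as spaces so numbers on adjacent lines are not fused together.
t = t.replace('\n', ' ')
l = t.split(']')
l.pop()  # drop the leftover text after the final ']'
l = [i.replace('[', '') for i in l]
# split() with no argument handles single, double, and leading spaces alike.
result = [[int(s) for s in k.split()] for k in l]
print(result)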
Output:
[[12, 34, 45, 22, 3, 5, 34, 33, 2, 67, 5, 55, 2, 90, 88, 12, 34], [245, 4, 13], [33, 90, 50, 22, 90, 1, 23, 44, 876, 10, 7]]
You can use the built-in csv library and then just split the values per row:
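import csv
with open('test.csv', 'r', newline='') as testCsvFile:
    # A space delimiter splits each line into tokens; double spaces give empty tokens.
    testCsv = csv.reader(testCsvFile, delimiter=' ')
    listOfLists = []
    current = []
    for row in testCsv:
        for val in row:
            digits = val.strip('[]')
            if digits:
                current.append(int(digits))
            if val.endswith(']'):
                # A closing bracket ends the current list, even one spanning several lines.
                listOfLists.append(current)
                current = []
print(listOfLists)
# Output
# [[12, 34, 45, 22, 3, 5, 34, 33, 2, 67, 5, 55, 2, 90, 88, 12, 34], [245, 4, 13], [33, 90, 50, 22, 90, 1, 23, 44, 876, 10, 7]]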
Edit: Updated parsing to convert the values to ints
Is this what you are looking for?
>>> with open("file.txt", "r") as f:
...     content = " ".join(f.read().split())
...
>>> content = [[int(i) for i in c.split()] for c in content[1:-1].split("] [")]
>>> content
[[12, 34, 45, 22, 3, 5, 34, 33, 2, 67, 5, 55, 2, 90, 88, 12, 34], [245, 4, 13], [33, 90, 50, 22, 90, 1, 23, 44, 876, 10, 7]]
First read in the entire file as one string, collapsing every run of whitespace (single spaces, double spaces, and the newline characters, \n) into a single space, so that numbers on adjacent lines stay separate. Then strip the first and last characters ( [ and ] ) and split into chunks divided by "] [". Finally split each chunk on whitespace and turn the pieces into integers.
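For what it's worth, the same steps (grab each bracketed chunk, split it on whitespace, convert to int) can also be sketched with a regular expression. This is only a minimal sketch under the assumption that every list is wrapped in a matching [ ... ] pair and contains nothing but integers; the function name parse_bracketed_lists and the inline sample string are made up for illustration and are not from the original post.

```python
import re


def parse_bracketed_lists(text):
    """Return one list of ints per [...] group found in text (hypothetical helper)."""
    # [^\]]* matches everything up to the next ']', newlines included,
    # so a list that continues over several lines is captured as one group.
    groups = re.findall(r"\[([^\]]*)\]", text)
    # str.split() with no argument splits on any run of whitespace,
    # which covers single spaces, double spaces, and line breaks.
    return [[int(tok) for tok in group.split()] for group in groups]


# Inline sample standing in for the file contents (assumption: same layout as in the question).
sample = """[12 34 45 22 3 5
34 33 2 67 5 55
2 90 88 12 34]
[245 4  13]
[33 90 50 22 90 1
23 44 876 10 7]
"""

result = parse_bracketed_lists(sample)
print(result)
```

To use it on a real file, read the file into a string first (for example with open(...) as f: text = f.read()) and pass that string in. Because the pattern only looks at what sits between brackets, it does not care how the lists are separated from one another.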