
Importing data from csv where each list is split into many rows

Hi, I'm a bit stuck with this problem. I've got a csv file that looks something like this:

[12  34 45 22 3 5
 34 33 2 67 5 55
 2 90 88 12 34]
[245  4 13]
[33 90 50 22 90 1
 23 44 876  10 7] ...

And so on. In other words, the csv file consists of lists of numbers separated by either a single space or a double space. If a list exceeds a certain number of values (14 in my case), it continues on the next line until the list ends. The lists are not separated by commas, but each list begins and ends with square brackets.

I want to import the csv file into a list of lists, which would look like this:

[[12, 34, 45, 22, 3, 5, 34, 33, 2, 67, 5, 55, 2, 90, 88, 12, 34], 
[245, 4, 13], 
[33, 90, 50, 22, 90, 1, 23, 44, 876, 10, 7], 
[...]]

How could I achieve this? I've tried np.loadtxt and pandas, but both treat every line as its own observation.

Thanks in advance!

Edit: The numbers are actually separated either by a single space or double spaces.
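
To illustrate why the mixed spacing matters when parsing, here is a small sketch using only built-in string methods (the sample string is the first line of the file shown above). Splitting on a single space leaves an empty string wherever two spaces occur, while `split()` with no argument treats any run of whitespace as one separator:

```python
line = "12  34 45 22 3 5"

# Splitting on exactly one space keeps an empty string between the two spaces
by_space = line.split(" ")

# split() with no argument treats any run of whitespace as a single separator
by_whitespace = line.split()

print(by_space)
print(by_whitespace)
```

Calling `int('')` on the empty string raises a `ValueError`, so the second form is the safer one for this file.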

The following should work:

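with open('myfile.csv') as f:
    t = f.read()
# Join continuation lines, collapse double spaces, then turn spaces into commas
t = t.replace('\n', '').replace('  ', ' ').replace(' ', ',')
# Every list ends with ']'; the piece after the last ']' is empty, so drop it
l = t.split(']')
l.pop()
l = [i.replace('[', '') for i in l]
result = [[int(s) for s in k.split(',')] for k in l]
print(result)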

Output:

[[12, 34, 45, 22, 3, 5, 34, 33, 2, 67, 5, 55, 2, 90, 88, 12, 34], [245, 4, 13], [33, 90, 50, 22, 90, 1, 23, 44, 876, 10, 7]]
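
A possible variation on the same idea, sketched with a regular expression so that it does not depend on there being exactly one or two spaces between numbers. This is an alternative technique rather than the method above; the function name `parse_bracketed` and the inline `sample` string are made up for illustration:

```python
import re

def parse_bracketed(text):
    """Return one list of ints per [...] group found in text."""
    # [^\]]* matches everything up to the closing bracket, newlines included
    groups = re.findall(r"\[([^\]]*)\]", text)
    # split() with no argument handles single spaces, double spaces and newlines
    return [[int(tok) for tok in group.split()] for group in groups]

# Shortened sample in the same layout as the file in the question
sample = "[12  34 45 22 3 5\n 34 33 2 67 5 55\n 2 90 88 12 34]\n[245  4 13]\n"
print(parse_bracketed(sample))
```

For a real file, the text would come from `f.read()` as in the snippet above.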

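You can use the built-in csv library and split the values in each row, collecting them until the closing bracket of the list is reached:

import csv

with open('test.csv', 'r', newline='') as testCsvFile:
    testCsv = csv.reader(testCsvFile)
    listOfLists = []
    current = []  # values of the list currently being read
    for row in testCsv:
        if not row:
            continue  # skip blank lines
        # split() with no argument handles both single and double spaces
        current.extend(int(val) for val in row[0].strip('[] ').split())
        if row[0].rstrip().endswith(']'):  # closing bracket ends the list
            listOfLists.append(current)
            current = []
    print(listOfLists)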


# Output
# [[12, 34, 45, 22, 3, 5, 34, 33, 2, 67, 5, 55, 2, 90, 88, 12, 34], [245, 4, 13], [33, 90, 50, 22, 90, 1, 23, 44, 876, 10, 7]]

Edit: Updated parsing to convert the values to ints.

Is this what you are looking for?

>>> with open("file.txt", "r") as f:
...     content = f.read().replace("\n", "")
... 
>>> content = [[int(i) for i in c.split()] for c in content[1:-1].split("][")]
>>> content
[[12, 34, 45, 22, 3, 5, 34, 33, 2, 67, 5, 55, 2, 90, 88, 12, 34], [245, 4, 13], [33, 90, 50, 22, 90, 1, 23, 44, 876, 10, 7]]

First, read the entire file in as one string and remove the newline characters (\n). Then strip the first and last characters ([ and ]) and split the rest into chunks at each ][. Finally, split each chunk into its space-separated numbers and convert them to integers.
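
The steps above can be sketched as a small function that works on a string, which makes them easy to try without a file. The name `parse_chunks` and the `sample` string are made up for illustration, and the sketch assumes, as in the question's sample, that continuation lines start with a space, so removing the newlines does not glue two numbers together:

```python
def parse_chunks(text):
    # Drop surrounding whitespace, then the newlines inside the lists
    flat = text.strip().replace("\n", "")
    # Remove the outer "[" and "]", then split where one list ends and the next begins
    chunks = flat[1:-1].split("][")
    # split() with no argument copes with both single and double spaces
    return [[int(i) for i in c.split()] for c in chunks]

# Shortened sample in the same layout as the file in the question
sample = "[12  34 45 22 3 5\n 34 33 2 67 5 55]\n[245  4 13]\n"
print(parse_chunks(sample))
```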
