I am reading data from a file, like so:
f = open('some/file/path')
data = f.read().split('\n')
Which gives me something like data = ['1 a #', '3 e &']
if the original file was
1 a #
3 e &
I need it in a form like
[['1','a','#'],['3','e','&']]
so that I can then call np.swapaxes() on it and turn it into
[['1','3'],['a','e'],['#','&']]
But whenever I do that, the swapaxes call fails, because I am not ending up with a numpy array of the right shape. To turn the strings into lists of strings, I do:
for n in range(len(data)): data[n] = data[n].split()
data = np.array(data)
But when I check the shape:
np.shape(data)
>>>(2,)
So I cannot swap axes. I've tried building the numpy array in a few different ways, but everything seems to create a numpy array that doesn't know there is another dimension inside the arrays within the array.
To turn data = ['1 a #', '3 e &'] into [['1','a','#'],['3','e','&']], you can do:
>>> data2 = []
>>> for line in data:
...     data2.append(line.split())
...
>>> data2
[['1', 'a', '#'], ['3', 'e', '&']]
Split the strings first, then build the array and transpose it:
import numpy as np
data = ['1 a #', '3 e &']
np.array([x.split() for x in data]).T
Your line split looks fine:
In [110]: data = ['1 a #', '3 e &']
In [111]: for n in range(len(data)): data[n] = data[n].split()
In [112]: data
Out[112]: [['1', 'a', '#'], ['3', 'e', '&']]
In [113]: A=np.array(data)
In [114]: A
Out[114]:
array([['1', 'a', '#'],
['3', 'e', '&']],
dtype='<U1')
In [115]: A.shape
Out[115]: (2, 3)
In [116]: A.T
Out[116]:
array([['1', '3'],
['a', 'e'],
['#', '&']],
dtype='<U1')
In [117]: A.T.tolist()
Out[117]: [['1', '3'], ['a', 'e'], ['#', '&']]
I can 'transpose' a list of lists with zip as well:
In [119]: list(zip(*data))
Out[119]: [('1', '3'), ('a', 'e'), ('#', '&')]
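Note that zip yields tuples, not lists. If the nested-list form from the question is needed, a small sketch like the following converts each column back to a list (the names rows and columns are only illustrative):

```python
rows = [['1', 'a', '#'], ['3', 'e', '&']]

# zip(*rows) groups the i-th item of every row, i.e. it yields the columns
# as tuples; list(col) turns each tuple back into a list.
columns = [list(col) for col in zip(*rows)]
print(columns)
```

Keep in mind that zip stops at the shortest row, so rows of unequal length are silently truncated rather than reported.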
The original list splitting can also be done with a list comprehension:
In [120]: [i.split() for i in ['1 a #', '3 e &']]
Out[120]: [['1', 'a', '#'], ['3', 'e', '&']]
You could have combined the file read and splits with something like
[i.strip().split() for i in f.readlines()]
readlines returns a list of lines, but they still include the \n, which strip removes. The other thing to watch out for is blank lines between the data lines.
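A blank line (or a trailing newline when using f.read().split('\n')) splits into an empty list, which makes the rows ragged; NumPy then cannot build a 2-D array and, depending on the version, either returns a 1-D object array or raises an error. Here is a minimal sketch of one way to guard against that, using io.StringIO as a stand-in for the real file so the example is self-contained:

```python
import io
import numpy as np

# Stand-in for open('some/file/path'); note the blank line in the middle
# and the trailing newline at the end.
f = io.StringIO("1 a #\n\n3 e &\n")

# Iterating over a file yields lines; skip any that are empty after stripping.
rows = [line.split() for line in f if line.strip()]

arr = np.array(rows)
print(arr.shape)
print(arr.T.tolist())
```

This assumes every non-blank line has the same number of whitespace-separated fields; if not, the rows are still ragged and need separate handling.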
===================
In case it wasn't clear,
In [122]: data = ['1 a #', '3 e &']
In [123]: np.array(data)
Out[123]:
array(['1 a #', '3 e &'],
dtype='<U5')
produces a 2-element array, where each element is a 5-character string. No amount of reshaping or transposing will convert this into an array of single-character strings. You can only reshape it into other 2-element arrays:
In [124]: _.reshape(2,1)
Out[124]:
array([['1 a #'],
['3 e &']],
dtype='<U5')
In [125]: __.reshape(1,2,1)
Out[125]:
array([[['1 a #'],
['3 e &']]],
dtype='<U5')
I could view it as a single-character array (here A is the 2-element string array from np.array(data) above):
In [128]: A.view('<U1')
Out[128]:
array(['1', ' ', 'a', ' ', '#', '3', ' ', 'e', ' ', '&'],
dtype='<U1')
In [129]: A.view('<U1').reshape(5,2)
Out[129]:
array([['1', ' '],
['a', ' '],
['#', '3'],
[' ', 'e'],
[' ', '&']],
dtype='<U1')
but those blank characters get in the way.
There is also a module, np.char, that applies string functions to arrays:
np.concatenate(np.char.split(A)).reshape(2,3)
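To make that one-liner easier to follow, here is a small self-contained sketch of what np.char.split returns and one alternative way to turn the result into a 2-D array. The variable names parts and B are only illustrative, and it assumes every string splits into the same number of fields:

```python
import numpy as np

A = np.array(['1 a #', '3 e &'])   # 1-D array of 5-character strings

# np.char.split applies str.split to each element; the result is an
# object array whose elements are Python lists.
parts = np.char.split(A)
print(parts.dtype, parts.shape)

# Converting to a nested list and back lets NumPy see the second dimension.
B = np.array(parts.tolist())
print(B.shape)
print(B.T)
```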
Read the file (strip() removes the '\n'):
filename = "some/file/path"
data = [i.strip().split(' ') for i in open(filename)]
print(data)
Convert the list to a numpy array and swap the axes:
import numpy as np
print(np.asarray(data))
print(np.asarray(data).T)