NumPy genfromtxt: iterate over columns
I am using NumPy's genfromtxt to get columns from a CSV file.
Each column needs to be split and assigned to a separate SQLAlchemy SystemRecord, combined with some other columns and attributes, and added to the DB.
What's the best practice for iterating over the columns f1 to f9 and adding them to the session object?
So far, I have used the following code, but I don't want to repeat the same thing for each f column:
import numpy as np

t = np.genfromtxt(FILE_NAME,
                  dtype=[(np.str_, 20)] * 11,  # eleven 20-character string fields, f0..f10
                  delimiter=',', filling_values="None",
                  skip_header=0,  # replaces skiprows, which newer NumPy no longer accepts
                  usecols=(0,1,2,3,4,5,6,7,8,9,10))
for r in t:  # iterate over the rows directly; enumerate(t) would yield (index, row) tuples
    _acol = r['f1'].split('-')
    _bcol = r['f2'].split('-')
    ....
    arec = t_SystemRecords(first=_acol[0], second=_acol[1], third=_acol[2], ... )
    db.session.add(arec)
    db.session.commit()
Look at t.dtype, or at r.dtype.
Make a sample structured array (which is what genfromtxt returns):
t = np.ones((5,), dtype='i4,i4,f8,S3')
which looks like:
array([(1, 1, 1.0, b'1'), (1, 1, 1.0, b'1'), (1, 1, 1.0, b'1'),
(1, 1, 1.0, b'1'), (1, 1, 1.0, b'1')],
dtype=[('f0', '<i4'), ('f1', '<i4'), ('f2', '<f8'), ('f3', 'S3')])
The dtype and dtype.names are:
In [135]: t.dtype
Out[135]: dtype([('f0', '<i4'), ('f1', '<i4'), ('f2', '<f8'), ('f3', 'S3')])
In [138]: t.dtype.names
Out[138]: ('f0', 'f1', 'f2', 'f3')
Iterate over the names to see the individual columns:
In [139]: for n in t.dtype.names:
   .....:     print(t[n])
   .....:
[1 1 1 1 1]
[1 1 1 1 1]
[ 1. 1. 1. 1. 1.]
[b'1' b'1' b'1' b'1' b'1']
Or, in your case, iterate over the 'rows', and then iterate over the names:
In [140]: for i,r in enumerate(t):
   .....:     print(r)
   .....:     for n in r.dtype.names:
   .....:         print(r[n])
   .....:
(1, 1, 1.0, b'1')
1
1
1.0
b'1'
(1, 1, 1.0, b'1')
...
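Applied to the question, the same row-then-field loop might look like the sketch below. It is only an illustration under assumptions: the SystemRecord dataclass is a hypothetical stand-in for the SQLAlchemy model, the sample lines are made up, and appending to a list stands in for db.session.add.

```python
import numpy as np
from dataclasses import dataclass


@dataclass
class SystemRecord:  # hypothetical stand-in for the SQLAlchemy model
    first: str
    second: str
    third: str


# Made-up sample data: an id column followed by two '-'-separated columns.
lines = ["id1,a-b-c,d-e-f", "id2,g-h-i,j-k-l"]
t = np.genfromtxt(lines, dtype="U20,U20,U20", delimiter=",")

records = []
for r in t:                              # each r is one row (record)
    for name in t.dtype.names[1:]:       # skip f0, loop over the remaining fields
        parts = r[name].split("-")       # the field value is a string, so split works
        records.append(SystemRecord(*parts[:3]))  # stands in for db.session.add(...)

print(len(records))
```

Slicing t.dtype.names (for example names[1:10] for f1 to f9) avoids writing one line per column; how the extra columns and attributes are combined into each record depends on the model and is not shown here.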
For r, which is 0d (check r.shape), you can select items by number or iterate:
r[1] # == r[r.dtype.names[1]]
for i in r: print(i)
For t, which is 1d, this does not work; t[1] references an item.
1d structured arrays behave a bit like 2d arrays, but not quite. The usual talk of row and column has to be replaced with row (or item) and field.
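A small sketch of the row/field distinction, reusing the sample array from earlier in this answer (the variable names r and col are just for illustration):

```python
import numpy as np

t = np.ones((5,), dtype="i4,i4,f8,S3")

r = t[1]         # integer index: one item (row), a 0d record
col = t["f1"]    # field name: that field across all rows, an ordinary 1d array

print(t.shape, r.shape, col.shape)

# Within a single record, position and field name select the same value.
same = r[1] == r[r.dtype.names[1]]
print(same)
```

So an integer index on t picks a row, while a field name picks what would be a column in a 2d array.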
To make a t that might be closer to your case:
In [175]: txt=[b'one-1, two-23, three-12',b'four-ab, five-ss, six-ss']
In [176]: t=np.genfromtxt(txt,dtype=[(np.str_,20),(np.str_,20),(np.str_,20)])
In [177]: t
Out[177]:
array([('one-1,', 'two-23,', 'three-12'),
('four-ab,', 'five-ss,', 'six-ss')],
dtype=[('f0', '<U20'), ('f1', '<U20'), ('f2', '<U20')])
np.char has string functions that can be applied to an array:
In [178]: np.char.split(t['f0'],'-')
Out[178]: array([['one', '1,'], ['four', 'ab,']], dtype=object)
It doesn't work on the structured array, but does work on the individual fields. That output could be indexed as a list of lists (it's not 2d).
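Combining the two ideas, one possible sketch is to loop over the field names and apply np.char.split to each field in turn. The sample lines and the delimiter="," argument are assumptions made for this illustration (the example above used the default whitespace delimiter, which is why the commas stayed in the strings).

```python
import numpy as np

txt = ["one-1,two-23,three-12", "four-ab,five-ss,six-ss"]
t = np.genfromtxt(txt, dtype="U20,U20,U20", delimiter=",")

# One object array per field; each element is a Python list of the split parts.
split_cols = {name: np.char.split(t[name], "-") for name in t.dtype.names}

# Index like a list of lists: row first, then part.
first_parts = [parts[0] for parts in split_cols["f0"]]
print(first_parts)
print(split_cols["f1"][1][1])
```

Each entry of split_cols is a 1d object array, so split_cols["f0"][0][1] works but split_cols["f0"][0, 1] does not.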