python numpy text file splitting

I have a text file with a large number of lines, formatted as below:

[ABC] [text text] [1234]
[DEF] [text text: text] [2345]
....

I want to split the columns into arrays:

names = [ ABC , DEF]
text = [text text, text text: text]
values = [1234, 2345]

I am trying numpy.genfromtxt, but I am not sure how to set the delimiter because there are spaces within the text content. Is it possible to define the delimiter as '[]' in some way?

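Here is an example that splits each line on the bracket delimiters and then transposes the resulting rows, so that each column ends up in its own row of the array.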

>>> import numpy as np
>>> s = "[ABC] [text text] [1234]\n[DEF] [text text: text] [2345]"
>>> lines = s.split('\n')
# this is where the delimiters are split apart --vvv
>>> rows = [line.lstrip('[').rstrip(']').split('] [') for line in lines] # list comprehension
>>> rows
[['ABC', 'text text', '1234'], ['DEF', 'text text: text', '2345']]
>>> np_rows = np.array(rows)
>>> np_rows.T
array([['ABC', 'DEF'],
       ['text text', 'text text: text'],
       ['1234', '2345']],
      dtype='|S15')
>>> np_rows.transpose()
array([['ABC', 'DEF'],
       ['text text', 'text text: text'],
       ['1234', '2345']],
      dtype='|S15')
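
The transposed array above still holds everything as strings. If you want three separate arrays with the last column as numbers, something like the following might work. It is only a sketch: it swaps the strip/split step for a regular expression that captures the text inside each pair of brackets, it assumes every line has exactly three bracketed fields with no nested brackets, and the sample string stands in for the real file contents.

```python
import re
import numpy as np

# Sample data copied from the question; a real file would be read from disk instead.
s = "[ABC] [text text] [1234]\n[DEF] [text text: text] [2345]"

# Capture whatever sits between each '[' and the next ']', line by line.
# Assumption: fields never contain a ']' character themselves.
rows = [re.findall(r"\[([^\]]*)\]", line) for line in s.splitlines() if line.strip()]

# zip(*rows) regroups the per-line fields into per-column tuples.
# Assumption: every line yields exactly three fields.
names_col, text_col, values_col = zip(*rows)

names = np.array(names_col)
text = np.array(text_col)
values = np.array([int(v) for v in values_col])  # numeric strings -> integers

print(names)
print(text)
print(values)
```

Using zip(*rows) does the same job as np_rows.T, but it lets each column get its own dtype instead of forcing all three into one string array.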

