Numpy genfromtxt - column names
I am trying to import a simple tab-separated text file using genfromtxt. I need access to each column header name, along with the data in the column associated with that name. Currently I am accomplishing this in a way that seems kind of odd. All values in the txt file, including the header, are decimal numbers.
sample input file:
1 2 3 4 # header row
1.2 5.3 2.8 9.5
3.1 4.5 1.1 6.7
1.2 5.3 2.8 9.5
3.1 4.5 1.1 6.7
1.2 5.3 2.8 9.5
3.1 4.5 1.1 6.7
import numpy as np

table_data = np.genfromtxt(file_path)        # import file as a NumPy array
header_values = table_data[0, :]             # grab the first row
table_values = np.delete(table_data, 0, 0)   # grab everything else
I know there must be a more proper way to import a text file of data. I need to make it easy to access each column's header and the data belonging to that header value. I appreciate any help you can provide.
Clarification:
I want to be able to access a column of data by using something along the lines of table_values[header_of_first_column]. How would I accomplish this?
Set the names parameter to True to read the column names from the first valid line:
data = np.genfromtxt(
    fname,
    names=True,       # if `names` is True, the field names are read from the first valid line
    comments='#',     # skip characters after #
    delimiter='\t',   # tab-separated values
    dtype=None)       # guess the dtype of each column
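To see what the call above returns, here is a minimal sketch that reads from an in-memory buffer instead of a file. The `sample` buffer is a hypothetical stand-in for the real file, and its header row deliberately omits the trailing comment. With `names=True` the result is a structured array, so the column names are available as `data.dtype.names` and each column is selected by its name as a string.

```python
import io

import numpy as np

# Hypothetical in-memory stand-in for the tab-separated file (assumption:
# the header row carries no trailing comment).
sample = io.StringIO(
    "1\t2\t3\t4\n"
    "1.2\t5.3\t2.8\t9.5\n"
    "3.1\t4.5\t1.1\t6.7\n"
)

data = np.genfromtxt(sample, names=True, delimiter="\t", dtype=None)

# Field names are stored as strings on the structured dtype.
column_names = data.dtype.names

# Select a whole column by its field name.
first_column = data[column_names[0]]

for name in column_names:
    print(name, data[name])
```

Note that the field names are strings, so a numeric header such as 1 is looked up as `data['1']`, not `data[1]`.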
For example, if I modify the data you posted to be truly tab-separated, then the following code works:
import numpy as np
import os
fname = os.path.expanduser('~/test/data')
data = np.genfromtxt(
    fname,
    names=True,       # if `names` is True, the field names are read from the first valid line
    comments='#',     # skip characters after #
    delimiter='\t',   # tab-separated values
    dtype=None)       # guess the dtype of each column
print(data)
# [(1.2, 5.3, 2.8, 9.5) (3.1, 4.5, 1.1, 6.7) (1.2, 5.3, 2.8, 9.5)
# (3.1, 4.5, 1.1, 6.7) (1.2, 5.3, 2.8, 9.5) (3.1, 4.5, 1.1, 6.7)]
print(data['1'])
# [ 1.2 3.1 1.2 3.1 1.2 3.1]
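Since the headers in the question are themselves numbers, one alternative is to keep them numeric rather than turning them into string field names. The sketch below is only an illustration, not part of the approach above: it reads the whole file as a plain 2-D float array, splits off the first row, and builds a dictionary keyed by header value. The `sample` buffer and the name `columns_by_header` are hypothetical.

```python
import io

import numpy as np

# Hypothetical in-memory stand-in for the tab-separated file.
sample = io.StringIO(
    "1\t2\t3\t4\n"
    "1.2\t5.3\t2.8\t9.5\n"
    "3.1\t4.5\t1.1\t6.7\n"
)

# Read everything, header row included, as a plain 2-D float array.
table = np.genfromtxt(sample, delimiter="\t")
header_values = table[0]    # first row holds the numeric headers
table_values = table[1:]    # remaining rows hold the data

# Map each numeric header to the column of data below it.
columns_by_header = {
    float(header): table_values[:, i]
    for i, header in enumerate(header_values)
}

second_column = columns_by_header[2.0]
print(second_column)
```

This relies on exact floating-point equality for lookups, and duplicate header values would overwrite one another, so it suits files whose headers are distinct and typed exactly as they will be queried.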