Why do I keep getting a set of nan values when coding?

Question

I am trying to convert my csv file into a numpy array so I can manipulate the numbers and then graph them. I printed my csv file and got:

               ra              dec
0       15:09:11.8     -34:13:44.9
1       09:19:46.8   +33:44:58.452
2     05:15:43.488   +19:21:46.692
3     04:19:12.096    +55:52:43.32

... there are more rows (101 rows x 2 columns), but they are just numbers. I wanted to convert the ra and dec values from their current units to degrees, and I thought I could do this by making each column into a numpy array. But when I coded it:

import numpy as np
np_array = np.genfromtxt(r'C:\Users\nstev\Downloads\S190930t.csv',delimiter=".", skip_header=1, usecols=(4))
print(np_array)

I get:

[nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
nan nan nan nan nan nan nan nan nan nan]

I keep changing my delimiter. With a colon I got the same result; with a semicolon or a plus sign I got an error saying that it got 2 columns instead of 1. I do not know how to change it so that I stop getting this set of nan values. Someone help please!

Answer 1

With a copy-and-paste of your file sample:

In [208]: data = np.genfromtxt('stack59761369.csv',encoding=None,dtype=None,names=True)          
In [209]: data                                                                                   
Out[209]: 
array([('15:09:11.8', '-34:13:44.9'), ('09:19:46.8', '+33:44:58.452'),
       ('05:15:43.488', '+19:21:46.692'),
       ('04:19:12.096', '+55:52:43.32')],
      dtype=[('ra', '<U12'), ('dec', '<U13')])

With this dtype and names I get a 1d structured array with 2 fields.

In [210]: data['ra']                                                                             
Out[210]: 
array(['15:09:11.8', '09:19:46.8', '05:15:43.488', '04:19:12.096'],
      dtype='<U12')
In [211]: np.char.split(data['ra'],':')                                                          
Out[211]: 
array([list(['15', '09', '11.8']), list(['09', '19', '46.8']),
       list(['05', '15', '43.488']), list(['04', '19', '12.096'])],
      dtype=object)

This split gives an object dtype array containing lists. They can be joined into one 2d array with vstack:

In [212]: np.vstack(np.char.split(data['ra'],':'))                                               
Out[212]: 
array([['15', '09', '11.8'],
       ['09', '19', '46.8'],
       ['05', '15', '43.488'],
       ['04', '19', '12.096']], dtype='<U6')

and converted to floats with:

In [213]: np.vstack(np.char.split(data['ra'],':')).astype(float)                                 
Out[213]: 
array([[15.   ,  9.   , 11.8  ],
       [ 9.   , 19.   , 46.8  ],
       [ 5.   , 15.   , 43.488],
       [ 4.   , 19.   , 12.096]])
In [214]: np.vstack(np.char.split(data['dec'],':')).astype(float)                                
Out[214]: 
array([[-34.   ,  13.   ,  44.9  ],
       [ 33.   ,  44.   ,  58.452],
       [ 19.   ,  21.   ,  46.692],
       [ 55.   ,  52.   ,  43.32 ]])

pandas

In [256]: df = pd.read_csv('stack59761369.csv', sep=r'\s+')
In [257]: df                                                                                     
Out[257]: 
             ra            dec
0    15:09:11.8    -34:13:44.9
1    09:19:46.8  +33:44:58.452
2  05:15:43.488  +19:21:46.692
3  04:19:12.096   +55:52:43.32
In [258]: df['ra'].str.split(':',expand=True).astype(float)                                      
Out[258]: 
      0     1       2
0  15.0   9.0  11.800
1   9.0  19.0  46.800
2   5.0  15.0  43.488
3   4.0  19.0  12.096
In [259]: df['dec'].str.split(':',expand=True).astype(float)                                     
Out[259]: 
      0     1       2
0 -34.0  13.0  44.900
1  33.0  44.0  58.452
2  19.0  21.0  46.692
3  55.0  52.0  43.320
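
Carrying the pandas route through to decimal values could look like the sketch below. It is only an illustration: the inline sample text, the helper name to_decimal, and reading the sign from the leading "-" of the string are assumptions, not part of the original answer. It weights the three parts as 1, 1/60 and 1/3600.

```python
import io

import numpy as np
import pandas as pd

# Inline stand-in for the csv file (assumed whitespace-separated).
sample = """ra dec
15:09:11.8 -34:13:44.9
09:19:46.8 +33:44:58.452
"""
df = pd.read_csv(io.StringIO(sample), sep=r"\s+")


def to_decimal(column):
    """Convert a column of 'a:b:c' strings to signed decimal values."""
    parts = column.str.split(":", expand=True).astype(float)
    # Take the sign from the text so that values such as '-00:30:00' stay negative.
    sign = np.where(column.str.startswith("-"), -1.0, 1.0)
    return sign * (parts[0].abs() + parts[1] / 60 + parts[2] / 3600)


dec_deg = to_decimal(df["dec"])
print(dec_deg)
```

The same helper can be applied to df["ra"]; if ra is given in hours rather than degrees, the result would still need to be multiplied by 15.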

direct line read

In [279]: lines = []                                                                             
In [280]: with open('stack59761369.csv') as f: 
     ...:     header=f.readline() 
     ...:     for row in f: 
     ...:         alist = row.split() 
     ...:         alist = [[float(i) for i in astr.split(':')] for astr in alist] 
     ...:         lines.append(alist) 
     ...:                                                                                        
In [281]: lines                                                                                  
Out[281]: 
[[[15.0, 9.0, 11.8], [-34.0, 13.0, 44.9]],
 [[9.0, 19.0, 46.8], [33.0, 44.0, 58.452]],
 [[5.0, 15.0, 43.488], [19.0, 21.0, 46.692]],
 [[4.0, 19.0, 12.096], [55.0, 52.0, 43.32]]]
In [282]: np.array(lines)                                                                        
Out[282]: 
array([[[ 15.   ,   9.   ,  11.8  ],
        [-34.   ,  13.   ,  44.9  ]],

       [[  9.   ,  19.   ,  46.8  ],
        [ 33.   ,  44.   ,  58.452]],

       [[  5.   ,  15.   ,  43.488],
        [ 19.   ,  21.   ,  46.692]],

       [[  4.   ,  19.   ,  12.096],
        [ 55.   ,  52.   ,  43.32 ]]])
In [283]: _.shape                                                                                
Out[283]: (4, 2, 3)

The first dimension is the number of rows; the second is the 2 columns (ra and dec); the third is the 3 colon-separated values within each entry.

conversion to degrees

In [285]: _282@[1,1/60,1/3600]                                                                   
Out[285]: 
array([[ 15.15327778, -33.77086111],
       [  9.32966667,  33.74957   ],
       [  5.26208   ,  19.36297   ],
       [  4.32002667,  55.8787    ]])

Oops, that -34 deg value is wrong; all terms of an element have to have the same sign.

correction

Identify the elements with a negative degree:

In [296]: mask = np.sign(_282[:,:,0])                                                            
In [297]: mask                                                                                   
Out[297]: 
array([[ 1., -1.],
       [ 1.,  1.],
       [ 1.,  1.],
       [ 1.,  1.]])

adjust all 3 terms accordingly:

In [298]: x = np.abs(_282)*mask[:,:,None]                                                        
In [299]: x                                                                                      
Out[299]: 
array([[[ 15.   ,   9.   ,  11.8  ],
        [-34.   , -13.   , -44.9  ]],

       [[  9.   ,  19.   ,  46.8  ],
        [ 33.   ,  44.   ,  58.452]],

       [[  5.   ,  15.   ,  43.488],
        [ 19.   ,  21.   ,  46.692]],

       [[  4.   ,  19.   ,  12.096],
        [ 55.   ,  52.   ,  43.32 ]]])
In [300]: x@[1, 1/60, 1/3600]                                                                    
Out[300]: 
array([[ 15.15327778, -34.22913889],
       [  9.32966667,  33.74957   ],
       [  5.26208   ,  19.36297   ],
       [  4.32002667,  55.8787    ]])
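
The steps above can be gathered into one small helper. This is a sketch under stated assumptions rather than the answer's own code: the function name sexagesimal_to_decimal is made up for illustration, the sign is read from the string instead of from np.sign (so an entry like '-00:30:00' is still treated as negative), and ra is assumed to be in hours:minutes:seconds, which is the usual convention and would mean a further factor of 15 to reach degrees.

```python
import numpy as np


def sexagesimal_to_decimal(strings):
    """Convert strings like '-34:13:44.9' to signed decimal values."""
    # Magnitudes of the three colon-separated parts, shape (n, 3).
    parts = np.abs(np.array([s.strip().split(":") for s in strings], dtype=float))
    # Sign comes from the text, so a leading '-' applies to all three parts.
    sign = np.array([-1.0 if s.strip().startswith("-") else 1.0 for s in strings])
    # Weighted sum: whole units, minutes (1/60), seconds (1/3600).
    return sign * (parts @ np.array([1, 1 / 60, 1 / 3600]))


ra = ["15:09:11.8", "09:19:46.8"]
dec = ["-34:13:44.9", "+33:44:58.452"]

# Assumption: ra is in hours, and 1 hour of right ascension is 15 degrees.
ra_deg = sexagesimal_to_decimal(ra) * 15
dec_deg = sexagesimal_to_decimal(dec)
print(ra_deg, dec_deg)
```

If the ra column is already in degrees, drop the factor of 15.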

Answer 2

The nan is probably NaN (Not a Number). Try setting the data type to None (dtype=None).

Also, try omitting delimiter. By default, any run of consecutive whitespace acts as the delimiter.
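
To see why the values come back as nan, here is a minimal sketch using an inline two-row sample in place of the real file (the sample text and variable names are assumptions for illustration). With the default float dtype, a field such as '15:09:11.8' cannot be parsed as a number, so genfromtxt fills it with nan; with dtype=None the fields are kept as text.

```python
import io

import numpy as np

# Inline stand-in for the csv file (assumed whitespace-separated).
sample = """ra dec
15:09:11.8 -34:13:44.9
09:19:46.8 +33:44:58.452
"""

# Default dtype is float: the colon-separated fields are not valid floats.
as_float = np.genfromtxt(io.StringIO(sample), skip_header=1)

# dtype=None asks genfromtxt to infer the type, which here is text.
as_text = np.genfromtxt(io.StringIO(sample), skip_header=1, dtype=None, encoding="utf-8")

print(as_float)
print(as_text)
```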

Not sure what you're expecting, but maybe this will be a better starting point...

import numpy as np

np_array = np.genfromtxt(r"C:\Users\nstev\Downloads\S190930t.csv", skip_header=1, dtype=None, encoding="utf-8", usecols=(1, 2))
print(np_array)

printed output...

[['15:09:11.8' '-34:13:44.9']
 ['09:19:46.8' '+33:44:58.452']
 ['05:15:43.488' '+19:21:46.692']
 ['04:19:12.096' '+55:52:43.32']]

Disclaimer: I don't use numpy. I based my answer on https://docs.scipy.org/doc/numpy/reference/generated/numpy.genfromtxt.html

Answer 1
1 2020-01-16 01:20:32

Answer 2
0 2020-01-16 00:34:20