Reading a file with mixed types (floats and strings)

Question

I have a data file composed of N columns and M lines, and I need to read it, storing each column in an array/list. The file is usually filled with numbers (floats), and in those cases I can just do:

import numpy as np
f_data = np.loadtxt('file.dat', unpack=True)

and the result is the columns stored in f_data as sublists whose elements are floats, as expected.

Other times the file can have random strings scattered around (see an example of such a file here). In those cases I need to read it in the same way (i.e. unpacked, with each column stored in a list/array and all of its elements stored as floats), with every string converted to a default float (for example 99.999).

In the example data file above, column 5 would look like this after reading it:

f_data[5]
[2.049, 0.946, 0.942, 0.889, 99.999, 0.879, 0.989, 1.142, 1.062, 0.551, 1.233, 0.503]

Notice that all elements are of type float, and that the string that was found was converted to 99.999 and also stored as a float.

np.genfromtxt is able to read a file with mixed types, but the result is that all the floats are stored as strings, which is not what I need.

How can I do this?

Answer 1

np.genfromtxt is the answer, but it's a little bit tricky to get it working just right.

Try:

np.genfromtxt("file.txt", dtype=float, filling_values=99.99)

This forces the type to float in every case. When numpy finds a value that it cannot convert to a float, it declares that value invalid, and thus missing. filling_values sets the default to substitute whenever data are missing, here 99.99 (use 99.999 or any other default as needed).
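To make the effect of filling_values concrete, here is a minimal sketch. The sample text, the stray token "abc", and the variable names are made up for illustration; an in-memory io.StringIO stands in for the real file.

```python
import io
import numpy as np

# Made-up sample: three columns, with a stray string in the second row.
sample = "1.0 2.0 3.0\n4.0 abc 6.0\n7.0 8.0 9.0\n"

# Without filling_values, the token that cannot be parsed becomes nan.
default_fill = np.genfromtxt(io.StringIO(sample), dtype=float)

# With filling_values, the same token becomes the chosen default instead.
custom_fill = np.genfromtxt(io.StringIO(sample), dtype=float,
                            filling_values=99.999)

print(default_fill[1])  # second row, middle entry is nan
print(custom_fill[1])   # second row, middle entry is 99.999
```

Either way the resulting array has a float dtype, so no element is left as a string.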

To store the data column-wise, as requested in the question, add unpack=True, which makes the complete answer:

np.genfromtxt("file.txt", dtype=float, filling_values=99.99, unpack=True)
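As a rough illustration of what unpack=True changes, the sketch below reads the same made-up text twice. The sample values and the placeholder string "INDEF" are assumptions, not taken from the original data file, and io.StringIO again stands in for the file on disk.

```python
import io
import numpy as np

# Made-up sample: 4 lines, 3 columns, one non-numeric token in column 1.
sample = "0.1 2.049 10\n0.2 INDEF 20\n0.3 0.942 30\n0.4 0.889 40\n"

# Default layout: one row of the array per line of the file.
rows = np.genfromtxt(io.StringIO(sample), dtype=float,
                     filling_values=99.999)

# unpack=True transposes the result: one row of the array per column.
cols = np.genfromtxt(io.StringIO(sample), dtype=float,
                     filling_values=99.999, unpack=True)

print(rows.shape)  # (4, 3)
print(cols.shape)  # (3, 4)

# Indexing now selects a whole column, with the string replaced by the default.
second_column = cols[1].tolist()
print(second_column)  # [2.049, 99.999, 0.942, 0.889]
```

With the unpacked layout, f_data[5] in the question would select column 5 directly, as np.loadtxt(..., unpack=True) does for purely numeric files.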

Answer 1
3 votes, accepted, 2014-02-24 15:22:47
