
Read file with mixed types (floats and strings)

I have a data file with N columns and M lines, and I need to read it so that each column is stored in its own array/list. The file usually contains only numbers (floats), and in those cases I can just do:

import numpy as np
f_data = np.loadtxt('file.dat', unpack=True)

The result is that the columns are stored in f_data as sublists whose elements are floats, as expected.
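As a small illustration of what unpack=True does in the all-numeric case, here is a minimal sketch. It reads an in-memory stand-in for file.dat; the sample values are made up for the example.

```python
import io
import numpy as np

# Hypothetical stand-in for 'file.dat': 3 lines, 2 columns, all numeric.
sample = io.StringIO("1.0 10.0\n2.0 20.0\n3.0 30.0\n")

# unpack=True transposes the result, so f_data[i] is column i of the file.
f_data = np.loadtxt(sample, unpack=True)

print(f_data.shape)  # one row per file column
print(f_data[1])     # the second column of the file
```

With this input, f_data has one entry per column, each holding M floats.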

Other times the file can have random strings scattered around (see an example of such a file here). In those cases I need to read it the same way (i.e., unpacked, with each column stored in a list/array and every element stored as a float), with all the strings converted to a default float (for example 99.999).

In the example data file above, column 5 would look like this after reading:

f_data[5]
[2.049, 0.946, 0.942, 0.889, 99.999, 0.879, 0.989, 1.142, 1.062, 0.551, 1.233, 0.503]

Notice that all elements are of type float; the string that was found was converted to 99.999 and is also stored as a float.

np.genfromtxt can read a file with mixed types, but the result is that all the floats are stored as strings, which is not what I need.

How can I do this?

np.genfromtxt is the answer, but it's a little tricky to get it working just right.

Try:

np.genfromtxt("file.txt", dtype=float, filling_values=99.99)

This forces the type to float in every case. When numpy finds a value that can't be parsed as a float, it treats that value as invalid, and thus missing. filling_values supplies the default to use when data are missing; in your case, 99.99.
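To see the replacement happen without needing a file on disk, here is a minimal sketch that feeds np.genfromtxt an in-memory buffer. The sample data and the stray strings "abc" and "N/A" are made up for illustration.

```python
import io
import numpy as np

# Hypothetical sample data: two stray strings among the floats.
sample = io.StringIO("1.0 2.0 3.0\n4.0 abc 6.0\n7.0 8.0 N/A\n")

# Entries that do not parse as floats are replaced by filling_values.
data = np.genfromtxt(sample, dtype=float, filling_values=99.99)

print(data.dtype)
print(data)
```

One caveat: strings such as "nan" or "inf" are valid float literals in Python, so they would be parsed as floats rather than replaced by the fill value.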

And, to edit as requested, to store the data column-wise, add unpack=True, making the full answer:

np.genfromtxt("file.txt", dtype=float, filling_values=99.99, unpack=True)
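For repeated use, the call could be wrapped in a small helper. The sketch below is only an illustration: read_columns, the temporary file, and its contents (including the stray string "INDEF") are hypothetical, and the default fill is set to 99.999, the placeholder mentioned in the question.

```python
import os
import tempfile
import numpy as np

def read_columns(path, fill=99.999):
    """Hypothetical helper: read a whitespace-delimited file column-wise,
    replacing any entry that does not parse as a float with `fill`."""
    return np.genfromtxt(path, dtype=float, filling_values=fill, unpack=True)

# Write a small made-up data file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".dat", delete=False) as tmp:
    tmp.write("2.049 1.0\n0.946 2.0\nINDEF 3.0\n0.879 4.0\n")
    path = tmp.name

cols = read_columns(path)
os.remove(path)  # clean up the temporary file

print(cols[0])  # first column, with the string replaced by the fill value
```

Each cols[i] is then a float array for column i, so downstream code can treat files with and without stray strings the same way.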

