
Python equivalent of Matlab textscan

I'm porting some Matlab code to Python. I'm relatively new to Python and am unsure what the Python equivalent of Matlab's textscan function is. Any help would be greatly appreciated.

If you're translating Matlab to Python, I'll assume you're already using NumPy.

In that case, you can use np.loadtxt (if no values are missing) or np.genfromtxt (if some values are missing; I'm not sure whether Matlab's textscan handles that case).
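As a minimal sketch of the missing-value case, the snippet below reads a small comma-delimited sample with np.genfromtxt. The sample data is made up for illustration, and io.StringIO stands in for a real file so the example is self-contained.

```python
import io
import numpy as np

# Hypothetical sample data: the second row is missing its middle value.
text = io.StringIO("1.0,2.0,3.0\n4.0,,6.0\n7.0,8.0,9.0\n")

# With the default float dtype, genfromtxt stores an empty field as nan
# instead of raising an error.
data = np.genfromtxt(text, delimiter=',')
print(data)
```

np.loadtxt is stricter about fields it cannot convert, which is why it is the better fit only when the file is complete.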

Give us a few more details if you need more help!

Here is an example of converting a MATLAB textscan call to Python with NumPy's np.loadtxt:

Let our data file results.csv contain:

0.6236,sym2,1,5,10,10
0.6044,sym2,2,5,10,10
0.548,sym2,3,5,10,10
0.6238,sym2,4,5,10,10
0.6411,sym2,5,5,10,10
0.7105,sym2,6,5,10,10
0.6942,sym2,7,5,10,10
0.6625,sym2,8,5,10,10
0.6531,sym2,9,5,10,10

Matlab code:

fileID = fopen('results.csv');
d = textscan(fileID,'%f %s %d %d %d %d', 'delimiter',',');
fclose(fileID);

Python + NumPy code:

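import numpy as np

# Read the same file as the Matlab example; each column becomes a named field.
# 'S4' stores the text column as 4-byte strings ('U4' would give str instead).
with open('results.csv', 'r') as fd:
    d = np.loadtxt(fd,
                   delimiter=',',
                   dtype={'names': ('col1', 'col2', 'col3', 'col4', 'col5', 'col6'),
                          'formats': ('float', 'S4', 'i4', 'i4', 'i4', 'i4')})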
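The result is a structured array, so the columns are retrieved by field name, roughly where textscan would give you one cell per column. The sketch below shows this on the first three rows of the sample data. It uses io.StringIO in place of the file and 'U4' rather than 'S4' so the text column comes back as str; both are choices made for this illustration.

```python
import io
import numpy as np

# First three rows of the sample data, held in memory instead of a file.
sample = io.StringIO(
    "0.6236,sym2,1,5,10,10\n"
    "0.6044,sym2,2,5,10,10\n"
    "0.548,sym2,3,5,10,10\n"
)

d = np.loadtxt(sample,
               delimiter=',',
               dtype={'names': ('col1', 'col2', 'col3', 'col4', 'col5', 'col6'),
                      'formats': ('float', 'U4', 'i4', 'i4', 'i4', 'i4')})

col1 = d['col1']    # float column, one value per row
col2 = d['col2']    # text column
col3 = d['col3']    # integer column
first_row = d[0]    # a single record with all six fields

print(col1, col2, col3)
print(first_row)
```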

For more information on types, see "Data type objects (dtype)" in the NumPy documentation.

You could look into NumPy and py2mat. If my understanding of textscan() is correct, you could also just use open() and parse the lines yourself.
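For the plain open() route, a minimal sketch without NumPy might look like the following. The helper name read_columns and the use of the csv module are assumptions made for illustration; the converters mirror the '%f %s %d %d %d %d' format from the textscan example, and the demo writes a two-row temporary file so it can run on its own.

```python
import csv
import os
import tempfile


def read_columns(path):
    """Read a comma-delimited file into one list per column."""
    # One converter per column, mirroring '%f %s %d %d %d %d'.
    converters = (float, str, int, int, int, int)
    columns = [[] for _ in converters]
    with open(path, newline='') as f:
        for row in csv.reader(f):
            for column, convert, field in zip(columns, converters, row):
                column.append(convert(field))
    return columns


# Demo on a temporary two-row copy of the sample data.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'results.csv')
    with open(path, 'w') as f:
        f.write("0.6236,sym2,1,5,10,10\n0.6044,sym2,2,5,10,10\n")
    cols = read_columns(path)

print(cols)
```

This returns plain Python lists rather than arrays, which is often enough for small files.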

If your data is more complicated than simple delimited text, for example if other, useless bits of text are mixed in, then you can use NumPy's fromregex function to replace textscan. fromregex lets you read data based on a regular expression, with the groups (the parts surrounded by parentheses) used as the values.

For example, say you have lines like this:

field1 is 1, field 2 is 5 to 6.6
field1 is 2, field 2 is 7 to 0.1

And you want to get the numeric values (not the field names):

[[1, 5, 6.6],
 [2, 7, 0.1]]

You can do:

data = np.fromregex('temp.txt', r'field1 is ([\d\.]+), field 2 is ([\d\.]+) to ([\d\.]+)', dtype='float')

The [\d\.]+ matches a run of digits and dots, that is, an unsigned number with optional decimal places, and the () tells NumPy to use that result as a value. You can also specify more complicated dtypes, such as giving different columns different types, and you can specify column names to get a structured array. That is covered in the documentation.
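A sketch of the structured-array variant is shown below. The field names field1, low and high are made up for illustration, and io.StringIO stands in for temp.txt. Recent NumPy documentation describes the dtype argument of fromregex as a structured datatype, so spelling out the fields like this may be required on newer versions.

```python
import io
import numpy as np

# The two example lines, held in memory instead of temp.txt.
text = io.StringIO(
    "field1 is 1, field 2 is 5 to 6.6\n"
    "field1 is 2, field 2 is 7 to 0.1\n"
)

pattern = r'field1 is ([\d\.]+), field 2 is ([\d\.]+) to ([\d\.]+)'

# One named, typed field per regex group (names are illustrative).
dtype = [('field1', 'i4'), ('low', 'f8'), ('high', 'f8')]

data = np.fromregex(text, pattern, dtype)

# Optional: stack the fields into a plain 2-D float array.
plain = np.column_stack([data[name] for name in data.dtype.names])

print(data)
print(plain)
```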

However, fromregex is more complicated than the other approaches when dealing with simple delimited or fixed-width data.
