Octave - dlmread and csvread convert the first value to zero

When I read a csv file in Octave, the very first value in it is converted to zero. I tried both csvread and dlmread, and neither reports an error. When I open the file in a plain text editor, I can see the correct value there. As far as I can tell, the file contains no hidden characters, stray spacing, or the like, and it contains only numbers. The only thing that I feel might be important is that I have five columns/groups that each hold a different number of values.

I went through the documentation for both commands on Octave Forge and cannot tell what may be causing this. Does anyone have an idea what I could try in order to troubleshoot it?

To illustrate the issue, if I load a file with the contents:

1.1,2.1,3.1,4.1,5.1 
,2.2,3.2,4.2,5.2 
,2.3,3.3,4.3, 
,,3.4,4.4 
,,3.5,

the command window returns:

0.0,2.1,3.1,4.1,5.1 
,2.2,3.2,4.2,5.2 
,2.3,3.3,4.3, 
,,3.4,4.4 
,,3.5,

(with additional trailing zeros after the decimal point).

The commands I'm using are:

dt = csvread("FileName.csv")

and

dt = dlmread("FileName.csv",",")

and they both return the same result.

Your csv file contains a byte order mark (BOM) right before the first number. You can confirm this by opening the file in a hex editor: you will see the byte sequence EF BB BF before the numbers start.
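If a hex editor is not at hand, the same check can be sketched in Octave itself by reading the first three bytes of the file. This is only an illustration: it builds its own small demo file in a temporary location (the file name and contents are made up for the example), so point fname at your own csv instead when trying it for real.

```octave
% Build a small demo file that starts with the UTF-8 BOM (EF BB BF).
% The temporary name and the contents are illustrative assumptions.
fname = [tempname() ".csv"];
fid = fopen(fname, "w");
fwrite(fid, uint8([239 187 191]));   % 0xEF 0xBB 0xBF
fputs(fid, "1.1,2.1,3.1\n");
fclose(fid);

% Read the first three bytes and compare them with the BOM.
fid = fopen(fname, "r");
head = fread(fid, 3, "uint8")';      % row vector of byte values
fclose(fid);
has_bom = isequal(head, [239 187 191]);
printf("BOM present: %d\n", has_bom);

delete(fname);                       % remove the demo file again
```

For a file without the mark, head holds the first bytes of the actual data and has_bom comes out false.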

This causes the first entry to be interpreted as a 'string', and since strings are parsed based on whether there are numbers at the 'front' of the string sequence, the entry is parsed as the number zero (see also this answer for more details on how csv entries are parsed).

In my text editor, if I start at the top left of the file and press the right arrow key once, the cursor does not visibly move (meaning it has just stepped over the invisible byte order mark, which takes up no visible space). Pressing backspace at this point deletes the byte order mark, after which the csv is read properly. Alternatively, you may have to fix your file in a hex editor, or find some other way to convert it to a proper ASCII file (or UTF-8 without the byte order mark).
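The clean-up can also be scripted rather than done by hand. The following is a minimal sketch, not a definitive fix: it assumes the mark is the three-byte UTF-8 BOM described above, writes the cleaned data to a second file rather than overwriting the original, and uses made-up temporary file names and demo contents.

```octave
% Demo input: a csv that starts with the UTF-8 BOM (illustrative contents).
fname = [tempname() ".csv"];
fid = fopen(fname, "w");
fwrite(fid, uint8([239 187 191]));
fputs(fid, "1.1,2.1,3.1\n4.1,5.1,6.1\n");
fclose(fid);

% Read the whole file as raw bytes.
fid = fopen(fname, "r");
bytes = fread(fid, Inf, "uint8=>uint8")';
fclose(fid);

% Drop the first three bytes only if they really are the BOM.
if numel(bytes) >= 3 && isequal(bytes(1:3), uint8([239 187 191]))
  bytes = bytes(4:end);
end

% Write the cleaned bytes to a new file and read that one instead.
clean = [tempname() ".csv"];
fid = fopen(clean, "w");
fwrite(fid, bytes);
fclose(fid);

dt = dlmread(clean, ",");
disp(dt);

delete(fname);
delete(clean);
```

Writing to a separate file keeps the original untouched in case the leading bytes turn out to be something other than a BOM.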

Also, it may be worth checking how this file was produced; if you have any control over that process, perhaps you can find out why this mark was added in the first place and prevent it. E.g., if the file was exported from Excel, you can choose the plain 'csv' format instead of 'utf-8 csv'.

UPDATE

In fact, this issue seems to have already been submitted as a bug and fixed in the development branch of Octave. See #58813 :)
