read_table pandas python numeric error
I am doing a basic pd.read_table of a .txt file. The first column is a list of cusips. The cusip "65248E10" is being read as the number 65248E10 = 652480000000000 (E10 interpreted as scientific notation).
I have been going through the pandas documentation but I can't figure out how to force the column to stay as a string. http://pandas.pydata.org/pandas-docs/dev/generated/pandas.io.parsers.read_table.html#pandas.io.parsers.read_table
Also, even if I put header = 0, it seems to be using the first row as the headers, so that row 0 is the second row and so on. If my text file has no column names, how can I get the headers to default to NULL (or 1, 2, 3, etc.)?
Thanks for the help. I am new to pandas/python.
If we have a data file which looks like
65248E10 11
55555E55 22
then we can read it in with something like
>>> pd.read_table("cusip.txt", header=None, delimiter=" ", converters={0: str})
          0   1
0  65248E10  11
1  55555E55  22
where we use header=None to tell it that there aren't any headers, delimiter=" " to tell it there's a space delimiter (adjust to match your data format), and converters={0: str} to tell it that after reading the first column in as a string, we want to turn it into a string (i.e. in this case do nothing to it) rather than process it further.
Instead of converters={0: str}, passing a dtype for each column (in current pandas, a dict such as dtype={0: str, 1: int}) would have worked too, but this way we can still let pandas figure out what the other columns are.
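To see the difference the converter makes, here is a minimal sketch that reads the same two rows from an in-memory string instead of a file. The use of io.StringIO and the variable names are assumptions for illustration only; with a real file you would pass its path as in the call above.

```python
import io
import pandas as pd

# Same two rows as the sample file above (space-delimited, no header row).
data = "65248E10 11\n55555E55 22\n"

# Default parsing: the first column looks like scientific notation,
# so pandas parses it as floating point.
default_df = pd.read_table(io.StringIO(data), header=None, delimiter=" ")

# With a converter on column 0, the original text is kept unchanged,
# while column 1 is still inferred by pandas.
fixed_df = pd.read_table(
    io.StringIO(data), header=None, delimiter=" ", converters={0: str}
)

print(default_df.dtypes)
print(fixed_df)
```

Comparing the two dtypes listings is a quick way to confirm which columns pandas has converted to numbers.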
The problem with using header=0 is that 0 here doesn't mean "no header"; it means use row number 0 (the first row) as the headers.
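The effect of the two header settings can be checked side by side. This is a small sketch on the same sample rows, again using io.StringIO as a stand-in for the file (an assumption for illustration):

```python
import io
import pandas as pd

# Two data rows, no header row in the text itself.
data = "65248E10 11\n55555E55 22\n"

# header=0: the first line is consumed as column names, leaving one data row.
with_header0 = pd.read_table(
    io.StringIO(data), header=0, delimiter=" ", converters={0: str}
)

# header=None: no line is treated as a header; columns are numbered 0, 1, ...
no_header = pd.read_table(
    io.StringIO(data), header=None, delimiter=" ", converters={0: str}
)

print(list(with_header0.columns), with_header0.shape)
print(list(no_header.columns), no_header.shape)
```

With header=0 the first cusip ends up as a column label rather than as data, which matches the behaviour described in the question.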
To stop your column from being read as a number, use the converters parameter and specify str as the converter for the column containing your "cusips".
For the header, as documented on the page you linked to, header is the number of the row which is to be considered the header; it is not a boolean saying "do I have a header or not". Setting it to zero means to use row zero (i.e., the first row) as the header. The documentation explicitly says:
Specify None if there is no header row.
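If you would rather have your own labels than the default 0, 1, 2, ..., the names parameter can be combined with header=None. A minimal sketch follows; the column names "cusip" and "qty" are made up for illustration, and io.StringIO stands in for the real file:

```python
import io
import pandas as pd

# Two data rows, no header row in the text itself.
data = "65248E10 11\n55555E55 22\n"

# header=None says the file has no header row; names supplies the labels.
# "cusip" and "qty" are hypothetical names chosen for this example.
named_df = pd.read_table(
    io.StringIO(data),
    header=None,
    names=["cusip", "qty"],
    delimiter=" ",
    converters={0: str},  # keep the first column as text
)

print(named_df)
```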