pandas read_table in Python: string column parsed as a number

Question

I am doing a basic pd.read_table of a .txt file. The first column is a list of cusips. The cusip "65248E10" is being read as the number 65248E10 = 652480000000000 (the E10 is treated as scientific notation).

I have been going through the pandas documentation, but I can't figure out how to force the column to stay as a string. http://pandas.pydata.org/pandas-docs/dev/generated/pandas.io.parsers.read_table.html#pandas.io.parsers.read_table

Also, even if I put header = 0, it seems to be using the first row as the headers, so that row 0 is the second row, and so on. If my text file has no column names, how can I get the headers to default to NULL (or 1, 2, 3, etc.)?

Thanks for the help. I am new to pandas/python.

Answer 1

If we have a data file which looks like

65248E10 11
55555E55 22

then we can read it in with something like

>>> pd.read_table("cusip.txt", header=None, delimiter=" ", converters={0: str})
          0   1
0  65248E10  11
1  55555E55  22

where we use header=None to tell it that there aren't any headers, delimiter=" " to tell it there's a space delimiter (adjust to match your data format), and converters={0: str} to tell it that after reading the first column in as a string, we want to turn it into a string (i.e. in this case do nothing to it) rather than process it further. Instead of converters={0: str}, a dtype mapping such as dtype={0: str} should work too in current pandas versions; in both cases pandas still figures out what the other columns are.

The problem with using header=0 is that 0 here doesn't mean "no header"; it means use row number 0 (the first row) as the headers.
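As a rough sketch of the difference, the snippet below reads the same two lines three ways. The in-memory string `data` is a stand-in for cusip.txt (an assumption made so the example is self-contained), and the `dtype={0: str}` variant reflects how recent pandas versions accept a column-to-type mapping rather than anything stated in the original answer.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for the cusip.txt file shown above.
data = "65248E10 11\n55555E55 22\n"

# Default parsing: pandas infers column 0 as a float, because
# "65248E10" is a valid number in scientific notation.
default = pd.read_table(io.StringIO(data), header=None, sep=" ")

# converters={0: str} keeps column 0 as the raw text that was read.
with_conv = pd.read_table(
    io.StringIO(data), header=None, sep=" ", converters={0: str}
)

# A dtype mapping for column 0 only; column 1 is still inferred.
with_dtype = pd.read_table(
    io.StringIO(data), header=None, sep=" ", dtype={0: str}
)

print(default.dtypes)
print(with_conv)
print(with_dtype)
```

Only column 0 is pinned to a string in the last two calls, so the second column is still parsed as an integer.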

Answer 2

To stop your column from being read as a number, use the converters parameter and specify str as the converter for the column containing your "cusips".

For the header, as documented on the page you linked to, header is the number of the row which is to be considered the header; it is not a boolean saying "do I have a header or not". Setting it to zero means to use row zero (i.e., the first row) as the header. The documentation explicitly says:

Specify None if there is no header row.
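To make the header behaviour concrete, here is a small sketch comparing header=0 with header=None on the same two-line sample. The `data` string is a hypothetical stand-in for the asker's file, not something taken from the question.

```python
import io

import pandas as pd

# Hypothetical two-row file with no header line.
data = "65248E10 11\n55555E55 22\n"

# header=0: the first data row is consumed as the column names,
# so only one row of data remains.
with_header0 = pd.read_table(
    io.StringIO(data), header=0, sep=" ", converters={0: str}
)

# header=None: no row is treated as a header, and the columns
# get default integer labels starting at 0.
no_header = pd.read_table(
    io.StringIO(data), header=None, sep=" ", converters={0: str}
)

print(list(with_header0.columns), len(with_header0))
print(list(no_header.columns), len(no_header))
```

With header=None both rows survive as data, which matches the "default to 1, 2, 3" behaviour asked for in the question, except that the labels start at 0.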

Answer 1: 2, accepted, 2012-12-27 19:47:41

Answer 2: 1, 2012-12-27 19:46:59
