How do I convert a column in a pandas DataFrame from str (scientific notation) to numpy.float64?

Question

I'm trying to read this tab-delimited file into pandas with one caveat: the last column (mean) must be converted from a string representing a value in scientific notation to a numpy.float64.

So far, I've tried:

df = pd.DataFrame(pd.io.parsers.read_table(fle, converters={'mean': lambda x: np.float64(x)}))

but all I get in df['mean'] is 0 and -0.

I've also tried importing without the converters kwarg and then casting the column with df['mean'].astype(np.float64), with similar results.

What gives?

Answer 1

They are not zero. pandas probably applies some formatting when printing a DataFrame or Series, so the values only look like zero.

By the way, you don't need converters. read_table correctly identifies them as float64:

In [117]: df = pandas.read_table('gradStat_mmn.tdf')

In [118]: df.ix[0:10]
Out[118]:
    Subject Group Local Global  Attn  mean
0         1  DSub     S      S  Attn     0
1         1  DSub     S      S  Dist     0
2         1  DSub     D      S  Attn     0
3         1  DSub     D      S  Dist     0
4         1  DSub     S      D  Attn     0
5         1  DSub     S      D  Dist     0
6         1  DSub     D      D  Attn     0
7         1  DSub     D      D  Dist     0
8         2  ASub     S      S  Attn     0
9         2  ASub     S      S  Dist     0
10        2  ASub     D      S  Attn     0

In [119]: df['mean'].dtype
Out[119]: dtype('float64')

In [120]: df['mean'][0]
Out[120]: 3.2529000000000002e-22
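
If you want to check this without the original file, here is a minimal sketch. The inline sample data is hypothetical (two made-up rows in the same tab-delimited layout, with one negative value invented for illustration), and it assumes a reasonably recent pandas where the display.float_format option is available. It shows that the parsed values are non-zero float64 numbers, that a column of strings in scientific notation can be cast with astype, and that the printed form depends only on display formatting.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical sample in the same tab-delimited layout as the file.
sample = "Subject\tGroup\tmean\n1\tDSub\t3.2529e-22\n1\tDSub\t-6.0101e-22\n"

# The parser recognises scientific notation on its own; no converters needed.
df = pd.read_csv(io.StringIO(sample), sep="\t")
print(df["mean"].dtype)
print(df["mean"].iloc[0])

# If a column really does hold strings, astype converts them to float64.
as_text = pd.Series(["3.2529e-22", "-6.0101e-22"])
converted = as_text.astype(np.float64)
print(converted.dtype)

# Force scientific notation when rendering, so tiny values are not shown as 0.
with pd.option_context("display.float_format", "{:.4e}".format):
    rendered = df.to_string()
print(rendered)
```

The option_context block only changes how the values are displayed inside it; the stored data is the same before and after.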

Answer 2

This has been fixed in version 0.9 of pandas:

In [4]: df = pandas.read_table('http://dl.dropbox.com/u/6160029/gradStat_mmn.tdf')

In [5]: df.head()
Out[5]: 
   Subject Group Local Global  Attn          mean
0        1  DSub     S      S  Attn  3.252900e-22
1        1  DSub     S      S  Dist  6.010100e-22
2        1  DSub     D      S  Attn  4.215700e-22
3        1  DSub     D      S  Dist  8.308100e-22
4        1  DSub     S      D  Attn  2.983500e-22
