How do I convert a column from a pandas DataFrame from str (scientific notation) to numpy.float64?

I'm trying to read this tab-delimited file into pandas with one caveat: the last column (mean) must be converted from a string representing a value in scientific notation to a numpy.float64.

So far, I've tried:

df = pd.DataFrame(pd.io.parsers.read_table(fle, converters={'mean': lambda x: np.float64(x)}))

but all I get in df['mean'] is 0 and -0.

I've also tried importing without the converters kwarg and then casting the column with df['mean'].astype(np.float64), with similar results.

What gives?

They are not zero. pandas probably applies some formatting when printing a DataFrame/Series, so the values only look like zero.
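One way to confirm this is to test the stored values directly instead of relying on the printed table. A minimal sketch, assuming a few made-up strings in scientific notation in place of the real column:

```python
import numpy as np
import pandas as pd

# Hypothetical sample values (not taken from the real file).
raw = ["3.2529e-22", "-6.0101e-22", "4.2157e-22"]

# Casting strings in scientific notation to float64 works as expected.
s = pd.Series(raw).astype(np.float64)

# Compare against zero rather than trusting the printed output.
all_nonzero = bool((s != 0).all())

# The smallest absolute value is tiny, but it is not zero.
smallest_magnitude = s.abs().min()

print(all_nonzero, smallest_magnitude)
```

If `all_nonzero` is true, the zeros are purely a display effect.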

By the way, you don't need converters. read_table correctly identifies them as float64:

In [117]: df = pandas.read_table('gradStat_mmn.tdf')

In [118]: df.ix[0:10]
Out[118]:
    Subject Group Local Global  Attn  mean
0         1  DSub     S      S  Attn     0
1         1  DSub     S      S  Dist     0
2         1  DSub     D      S  Attn     0
3         1  DSub     D      S  Dist     0
4         1  DSub     S      D  Attn     0
5         1  DSub     S      D  Dist     0
6         1  DSub     D      D  Attn     0
7         1  DSub     D      D  Dist     0
8         2  ASub     S      S  Attn     0
9         2  ASub     S      S  Dist     0
10        2  ASub     D      S  Attn     0

In [119]: df['mean'].dtype
Out[119]: dtype('float64')

In [120]: df['mean'][0]
Out[120]: 3.2529000000000002e-22
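The same check can be reproduced without the original file. The sketch below assumes a small invented tab-delimited sample (the rows are illustrative, not the real data) and uses pd.read_csv with sep="\t", which reads tab-delimited text. It also shows pd.to_numeric as one option for a column that did end up as strings:

```python
import io

import pandas as pd

# Made-up tab-delimited data standing in for gradStat_mmn.tdf.
text = (
    "Subject\tGroup\tmean\n"
    "1\tDSub\t3.2529e-22\n"
    "1\tDSub\t-6.0101e-22\n"
)

# No converters needed: the parser infers float64 for the 'mean' column.
df = pd.read_csv(io.StringIO(text), sep="\t")
mean_dtype = df["mean"].dtype

# If a column was read as strings (dtype=str forces that here),
# pd.to_numeric converts it afterwards.
df_str = pd.read_csv(io.StringIO(text), sep="\t", dtype=str)
converted = pd.to_numeric(df_str["mean"])

print(mean_dtype, converted.dtype)
```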

This has been fixed in version 0.9 of pandas:

In [4]: df = pandas.read_table('http://dl.dropbox.com/u/6160029/gradStat_mmn.tdf')

In [5]: df.head()
Out[5]: 
   Subject Group Local Global  Attn          mean
0        1  DSub     S      S  Attn  3.252900e-22
1        1  DSub     S      S  Dist  6.010100e-22
2        1  DSub     D      S  Attn  4.215700e-22
3        1  DSub     D      S  Dist  8.308100e-22
4        1  DSub     S      D  Attn  2.983500e-22
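If the default display on a given version still hides very small values, an explicit float format can be passed when rendering. A minimal sketch, assuming a two-row frame with made-up values and an arbitrary choice of four decimal places:

```python
import pandas as pd

# Illustrative frame; the values are invented for this example.
df = pd.DataFrame(
    {"Attn": ["Attn", "Dist"], "mean": [3.2529e-22, 6.0101e-22]}
)

# Render every float in scientific notation with four decimal places.
rendered = df.to_string(float_format="{:.4e}".format)
print(rendered)
```

pd.set_option("display.float_format", ...) accepts the same kind of formatter and applies it to all subsequent printing.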
