Why does pd.read_csv('file.csv') add .999999 to the end of some values?

Question

I have a CSV file with 4 rows and 1 column. When I open it with Sublime it looks like this:

2.291433301000000000e+09
3.601532401000000000e+09
3.061400502000000000e+09
3.195901470100000000e+10

When I read it using:

df = pd.read_csv('file.csv', names=['Column 1'])

The value in the last row in Python is 31959014700.999996

How can I solve the issue? I tried adding a data type when reading the file:

df = pd.read_csv('file.csv', names=['Column 1'], dtype=np.int64)

But it didn't work. I also tried:

df = pd.read_csv('file.csv', names=['Column 1'])
df = df.apply(pd.to_numeric, errors='coerce')

But it says it cannot convert to int!

Thanks for your help.

Answer 1

That is a floating-point error: the last value is parsed as a float that lands just below the intended whole number, and pandas refuses to convert it because it won't round floats automatically.
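
To see why a plain integer cast is not enough, here is a small standard-library sketch using the value reported in the question. It only illustrates truncation versus rounding and does not involve pandas; the variable names are made up for illustration.

```python
# The float the asker reported seeing for the last row.
parsed = 31959014700.999996

# int() truncates toward zero, so the cast lands one below the intended value.
truncated = int(parsed)

# round() picks the nearest whole number, which recovers the intended value.
rounded = round(parsed)

print(truncated)  # 31959014700
print(rounded)    # 31959014701
```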

Try this:

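import pandas as pd

# Parse as float, round to the nearest whole number, then cast to int64 (a 64-bit integer on every platform).
df = pd.read_csv(
    'file.csv', names=['Column 1']
).round(0).astype('int64')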
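
As an alternative sketch, `read_csv` also accepts a `float_precision` argument for its C parser, and `'round_trip'` asks for an exact string-to-float conversion, so a whole number written in scientific notation should come back exactly. The in-memory `io.StringIO` below stands in for `file.csv` and is an assumption for illustration only; the rounding step is kept as a safety net. As far as I know, pandas 1.2 and later already default to a more accurate parser, so on those versions the argument mainly documents intent.

```python
import io

import pandas as pd

# Stand-in for file.csv (assumed contents, copied from the question).
csv_text = """2.291433301000000000e+09
3.601532401000000000e+09
3.061400502000000000e+09
3.195901470100000000e+10
"""

# 'round_trip' asks the C parser for exact string-to-float conversion.
df = pd.read_csv(
    io.StringIO(csv_text),
    names=['Column 1'],
    float_precision='round_trip',
)

# Round first as a safety net, then cast to a 64-bit integer.
df = df.round(0).astype('int64')

print(df['Column 1'].tolist())
```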

If you are also writing the data to the CSV and only plan to store integers, you might not want to use scientific notation. Numbers in scientific notation will be interpreted as floating point, so you will have to round and cast as above to get integers back without occasional failures.
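
A minimal sketch of that advice, assuming the data starts out as integers in a DataFrame: writing it with `to_csv` produces plain digits, and `read_csv` can then infer an integer column with no float step in between. The DataFrame contents and the in-memory buffer are placeholders for illustration, not part of the original question.

```python
import io

import pandas as pd

# Hypothetical integer data matching the values in the question.
df = pd.DataFrame({'Column 1': [2291433301, 3601532401, 3061400502, 31959014701]})

# Write plain integer digits (no scientific notation) to an in-memory buffer.
buffer = io.StringIO()
df.to_csv(buffer, index=False, header=False)

# Read it back: the parser sees integer literals and infers an integer dtype.
buffer.seek(0)
restored = pd.read_csv(buffer, names=['Column 1'])

print(restored.dtypes)
print(restored['Column 1'].tolist())
```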

Solution 1
1 2021-04-09 03:16:39
