
Can I use DataFrame.to_csv and pandas.read_csv to consistently write and read type float?

I'd like to write float values to a CSV file using DataFrame.to_csv and ensure that, upon reading it back with pandas.read_csv, I get the exact same in-memory value. The text representation doesn't have to make sense to a person reading it.

Are there common textual representations of Python float values? Or a reliable way to serialize float to text and deserialize it back?

float_format doesn't guarantee read-write reliability.

Yes and no. If your floats are float64, the dtype makes no difference, because float64 is the default float type in pandas and is what read_csv gives you back. If you save any other float type (such as float32 or float16), you risk losing that dtype on reading unless you know it in advance and can pass it to read_csv.

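import numpy as np
import pandas as pd

# Build a float16 frame and write it to CSV.
df = pd.DataFrame(np.random.randn(5, 2), dtype=np.float16)
df.to_csv('data.csv', index=False)

# Without a dtype, read_csv falls back to float64.
pd.read_csv('data.csv').dtypes
# 0    float64   <- this should be float16, right?
# 1    float64
# dtype: object

# Passing dtype=... restores the original type.
pd.read_csv('data.csv', dtype=np.float16).dtypes
# 0    float16
# 1    float16
# dtype: object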
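When the columns do not all share one float type, the same idea can be extended by recording the dtypes next to the CSV and handing them back to read_csv. The following is only a minimal sketch under that assumption: the column names `x` and `y`, the JSON side record, and the in-memory buffers are illustrative choices, not part of the original answer.

```python
import io
import json

import numpy as np
import pandas as pd

# Illustrative frame with two different non-default float dtypes.
df = pd.DataFrame({
    "x": np.array([0.5, 1.25, -3.0], dtype=np.float16),
    "y": np.array([0.1, 0.2, 0.3], dtype=np.float32),
})

# Serialize the data and, separately, a {column: dtype name} record.
csv_text = df.to_csv(index=False)
dtype_text = json.dumps({col: str(dt) for col, dt in df.dtypes.items()})

# Feed the recorded dtypes back to read_csv as a per-column mapping.
restored = pd.read_csv(io.StringIO(csv_text), dtype=json.loads(dtype_text))
print(restored.dtypes)
```

In practice the two strings would be written to two files (for example a CSV and a small JSON file beside it), which is the "know the type in advance" requirement made explicit.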

On the other hand, if you intend to preserve the data exactly, pickling it is a much better option. It keeps the dtypes, is more compact, and should be a bit faster (I have not timed it).

df.to_pickle('data.pkl')

# The dtypes survive the round trip without any extra arguments.
pd.read_pickle('data.pkl').dtypes
# 0    float16
# 1    float16
# dtype: object
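For the float64 case, the "exact same in-memory value" requirement from the question can be checked directly by comparing bit patterns after a CSV round trip. This is a sketch rather than a guarantee: `csv_round_trip` is a hypothetical helper, the random test data is arbitrary, and it assumes the default C parser, whose `float_precision="round_trip"` option asks read_csv for its round-trip converter.

```python
import io

import numpy as np
import pandas as pd


def csv_round_trip(df: pd.DataFrame, **read_kwargs) -> pd.DataFrame:
    """Write df to an in-memory CSV and read it back (hypothetical helper)."""
    buffer = io.StringIO()
    df.to_csv(buffer, index=False)  # no float_format: default float formatting
    buffer.seek(0)
    return pd.read_csv(buffer, **read_kwargs)


rng = np.random.default_rng(0)
original = pd.DataFrame({
    "a": rng.standard_normal(1000),
    "b": rng.standard_normal(1000) * 1e-300,  # very small magnitudes
})
restored = csv_round_trip(original, float_precision="round_trip")

# Compare raw 64-bit patterns so that even a one-bit difference would show up.
same_bits = np.array_equal(
    original.to_numpy().view(np.uint64),
    restored.to_numpy().view(np.uint64),
)
print(same_bits)
```

Comparing the integer view instead of using `==` on floats avoids any tolerance and makes the check strict. If a `float_format` is passed to to_csv, the written text may carry fewer digits and this check can fail.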


 