pandas or numpy array data elements formatting
Environment: Python 3.7.6 with numpy==1.18.2 and pandas==1.0.3
import numpy as np
import pandas as pd
np.set_printoptions(suppress=True)
pd.set_option('display.float_format', lambda x: '%.2f' % x)
# does not work ?
data = pd.read_csv("test.csv")
"""
# here is test.csv sample data
at,price
1587690840,15.25
1587690900,15.24
1587690960,15.23
---
"""
x = np.asarray(data)
print(x)
"""
# result:
[[1.58769084e+09 1.52500000e+01]
[1.58769090e+09 1.52400000e+01]
[1.58769096e+09 1.52300000e+01]]
"""
I want the 1st element of each row cast to int32 and printed without scientific notation, and the 2nd element cast to float32 and printed as %.2f.
How can I modify the code so that printing x gives a result like the one below?
[[1587690840 15.25]
[1587690900 15.24]
[1587690960 15.23]]
I don't think that is possible with the formatter option of the set_printoptions method. Couldn't you do this afterwards with apply_over_axes?
A traditional numpy array cannot store multiple types. If you need multiple dtypes in one array, refer to structured arrays:
array_f = np.zeros(3, dtype={'names':('integers', 'floats'),
'formats':(np.int32, np.float32)})
array_f['integers'] = x[:,0]
array_f['floats'] = x[:,1]
array_f
# array([(1587690840, 15.25), (1587690900, 15.24), (1587690960, 15.23)],
# dtype=[('integers', '<i4'), ('floats', '<f4')])
But to be honest, I think pandas is more capable in these situations.
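A minimal sketch of the pandas route, assuming the CSV has the same two columns as the sample in the question. The inline `csv_text` string stands in for test.csv, and the name `text` is only illustrative. Each column keeps its own dtype, and `to_string` formats the float column without touching the integer one.

```python
import io

import numpy as np
import pandas as pd

# Inline copy of the sample data (stands in for test.csv).
csv_text = "at,price\n1587690840,15.25\n1587690900,15.24\n1587690960,15.23\n"

df = pd.read_csv(io.StringIO(csv_text))

# Cast each column separately; a DataFrame can hold one dtype per column.
df = df.astype({"at": np.int32, "price": np.float32})

# float_format only applies to float columns, so "at" stays a plain integer.
text = df.to_string(index=False, header=False,
                    float_format=lambda v: "%.2f" % v)
print(df.dtypes)
print(text)
```

The values fit in int32 here because they are below 2**31 - 1; larger timestamps would need int64.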
Your data as a structured dtype:
In [166]: txt = """at,price
...: 1587690840,15.25
...: 1587690900,15.24
...: 1587690960,15.23"""
In [167]: data = np.genfromtxt(txt.splitlines(), delimiter=',', names=True, dtype=None, encoding=None)
In [168]: data
Out[168]:
array([(1587690840, 15.25), (1587690900, 15.24), (1587690960, 15.23)],
dtype=[('at', '<i8'), ('price', '<f8')])
It has one int field and one float field.
The same thing loaded as floats:
In [170]: data = np.genfromtxt(txt.splitlines(), delimiter=',', skip_header=1, encoding=None)
In [171]: data
Out[171]:
array([[1.58769084e+09, 1.52500000e+01],
[1.58769090e+09, 1.52400000e+01],
[1.58769096e+09, 1.52300000e+01]])
I haven't worked with set_printoptions much, but it looks as though suppress=True has no effect when a float is this large (1.58e9).
The two columns, displayed separately:
In [176]: data[:,0]
Out[176]: array([1.58769084e+09, 1.58769090e+09, 1.58769096e+09])
In [177]: data[:,1]
Out[177]: array([15.25, 15.24, 15.23])
and the large floats converted to int:
In [178]: data[:,0].astype(int)
Out[178]: array([1587690840, 1587690900, 1587690960])
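If only the printed form matters, one option is to format each row by hand rather than rely on print options. This is a sketch, assuming a 2-column float array like the one in the question; `rows` and `text` are illustrative names, and the array itself is left unchanged.

```python
import numpy as np

# Same values as the question's array, stored as float64.
x = np.array([[1587690840, 15.25],
              [1587690900, 15.24],
              [1587690960, 15.23]], dtype=float)

# Format column 0 as an integer and column 1 with two decimals, row by row.
rows = ["[%d %.2f]" % (int(at), price) for at, price in x]
text = "[" + "\n ".join(rows) + "]"
print(text)
```

This only changes how the data is displayed; the underlying array still holds a single float dtype.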
What does your pd.read_csv produce?
In [189]: pd.DataFrame(data, dtype=None)
Out[189]:
0 1
0 1.587691e+09 15.25
1 1.587691e+09 15.24
2 1.587691e+09 15.23
In [190]: pd.DataFrame(Out[168], dtype=None)
Out[190]:
at price
0 1587690840 15.25
1 1587690900 15.24
2 1587690960 15.23
Converting the dataframe back to an array:
In [191]: Out[190].to_numpy()
Out[191]:
array([[1.58769084e+09, 1.52500000e+01],
[1.58769090e+09, 1.52400000e+01],
[1.58769096e+09, 1.52300000e+01]])
In [193]: Out[190].to_records(index=False)
Out[193]:
rec.array([(1587690840, 15.25), (1587690900, 15.24), (1587690960, 15.23)],
dtype=[('at', '<i8'), ('price', '<f8')])
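To get the int32/float32 fields asked for in the question, the record array can be cast to a narrower structured dtype. A minimal sketch, assuming the DataFrame has the `at` and `price` columns shown above; `rec` and `narrow` are illustrative names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"at": [1587690840, 1587690900, 1587690960],
                   "price": [15.25, 15.24, 15.23]})

# to_records keeps one dtype per column (int64 and float64 here).
rec = df.to_records(index=False)

# Cast field by field to the smaller types requested in the question.
narrow = rec.astype([("at", np.int32), ("price", np.float32)])
print(narrow.dtype)
print(narrow["at"])
```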
suppress does have an effect if the largest numbers are smaller:
In [201]: with np.printoptions(suppress=True):
...: print(data/[100,1])
...:
[[15876908.4 15.25]
[15876909. 15.24]
[15876909.6 15.23]]
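The same comparison can be made side by side with `np.array2string`, which takes a `suppress_small` argument. This is a sketch of the observed behaviour rather than a documented rule: NumPy's float formatter appears to switch to scientific notation once the largest absolute value reaches about 1e8, regardless of the suppress setting.

```python
import numpy as np

large = np.array([1587690840.0, 15.25])   # largest value is about 1.58e9
small = large / [100, 1]                  # largest value is about 1.58e7

# Same option for both; only the magnitude of the data differs.
large_text = np.array2string(large, suppress_small=True)
small_text = np.array2string(small, suppress_small=True)

print(large_text)   # still in scientific notation
print(small_text)   # fixed-point notation
```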