How to create a DataFrame instance from an array of arrays?
I have created an array whose shape attribute is (6, 20), like this:
import numpy as np
data = np.random.logistic(10, 1, 120)
data = data.reshape(6, 20)
Then I instantiate a pandas.DataFrame from the array data:
import pandas as pd
data = pd.DataFrame(data)
Now data is a DataFrame built from values drawn with numpy's logistic distribution function, and it displays like this:
0 1 2 3 4 5
0 9.602117 9.507674 9.848685 9.215080 11.061676 9.627753
1 11.702407 9.804924 7.375905 10.784320 8.485818 10.938005
2 9.628927 9.713187 10.027626 10.653311 11.301493 8.756792
3 11.229905 12.013172 10.023200 9.211614 7.139757 9.687851
6 7 8 9 10 11 12
0 9.356069 11.483162 8.993130 8.015089 9.808234 9.435853 9.773375
1 13.422060 10.027434 9.694008 9.677682 10.806266 12.393364 9.479257
2 10.821846 10.690378 8.321566 9.595122 11.753948 10.021815 10.412572
3 8.499120 7.352394 9.288662 9.178306 10.073842 9.246110 9.075350
13 14 15 16 17 18 19
0 9.809366 8.502451 11.624395 12.824338 9.729167 8.945258 10.464157
1 6.698941 9.416421 11.477242 9.622115 6.374589 9.459355 10.435674
2 11.068721 9.775433 9.447799 8.972052 10.692942 10.978305 10.047067
3 10.381596 10.968330 11.892766 12.241880 9.980124 7.321942 9.241030
When I try to set columns=list("abcdef"), I get this error:
ValueError: Shape of passed values is (6, 20), indices imply (6, 6)
My expected output is similar to what the numpy array itself displays. It should contain each column as a pandas.Series of lists (or a list of lists).
a.
0 [ 6.98467276 9.16242742 6.99065177 11.50834399 9.29697138 7.93926441
9.05857668 7.13652948 11.01724792 13.31658877 8.63137079 9.5564405
7.37161153 11.19414704 9.45957466 9.19826796 10.13506672 9.74830158
9.97456348 8.35217153]
b.
[10.48249082 11.94030324 12.59080011 10.55695088 12.43071037 11.49568774
10.03540181 11.08708832 10.24655111 8.17904856 11.04791142 7.30069964
8.34783674 9.93743588 8.1537666 9.92773204 10.3416315 9.51624921
9.60124236 11.37511301]
c.
[ 8.21851024 12.71641524 9.7748047 9.51267978 7.92793378 12.1646706
9.67236267 10.22201002 9.67197374 9.70551429 7.79209516 9.20295594
9.26231527 8.04560836 11.0409066 8.63660332 9.18397671 8.17510874
9.61619671 8.42704322]
d.
[14.54825819 16.97573893 7.70643136 12.06334323 14.64054726 9.54619595
10.30686621 12.20487566 10.78492189 12.01011666 10.12405213 8.57057999
10.41665479 7.85921253 10.15572125 9.20554292 10.03832545 9.43720211
11.06605713 9.60298514]
I have found this thread, which looks like my problem, but it has not helped me much, and I want to use the data in a different way.
Could I assign the lengths of the columns, or perhaps set the dimensions of this pandas.DataFrame?
Your data has 6 rows and 20 columns. If you want to pass each "row" of the numpy array as a "column" to the DataFrame, you can simply transpose:
df = pd.DataFrame(data=np.random.logistic(10, 1, 120).reshape(6,20).transpose(),
columns=list("abcdef"))
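As a quick check of why the original call fails and why transposing helps, here is a minimal sketch. The variable names arr and mismatch_raised are only illustrative and do not come from the question.

```python
import numpy as np
import pandas as pd

arr = np.random.logistic(10, 1, 120).reshape(6, 20)  # 6 rows, 20 columns

# Six column labels cannot describe a 20-column array, so pandas raises ValueError.
try:
    pd.DataFrame(arr, columns=list("abcdef"))
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# After transposing, the array has shape (20, 6), so six labels fit.
df = pd.DataFrame(arr.T, columns=list("abcdef"))
print(mismatch_raised)
print(df.shape)
```

Column "a" of df then holds the 20 values that were in the first row of arr.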
To get the data in a single row, try:
arr = np.random.logistic(10, 1, 120).reshape(6, 20)
# One row; each cell holds the 20 values of one row of arr as a list
df = pd.DataFrame({name: [row.tolist()] for name, row in zip("abcdef", arr)})
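If what you want is closer to the "Series of lists" layout shown in the question, one possible alternative (a sketch, not necessarily what the asker had in mind) is to build a single pandas.Series whose six entries are lists. The names arr and series_of_lists are made up for this illustration.

```python
import numpy as np
import pandas as pd

arr = np.random.logistic(10, 1, 120).reshape(6, 20)

# arr.tolist() gives 6 Python lists of 20 floats each;
# each list becomes one object-typed entry labelled a..f.
series_of_lists = pd.Series(arr.tolist(), index=list("abcdef"))

print(len(series_of_lists))       # number of labelled entries
print(len(series_of_lists["a"]))  # number of values inside entry "a"
```

Note that list-valued cells are stored with object dtype, so vectorized numeric operations will not work on them directly; the transposed DataFrame is usually easier to compute with.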