I am trying to convert a list of arrays of varying shapes to a dataframe.
import numpy as np
import pandas as pd
data = [np.array([[1, 2], [1, 3], [1, 1]]),
np.array([[1, 2, 3], [3, 1, 2], [3, 2, 1]])]
names = ['A', 'B']
df = pd.DataFrame(data=data, columns=names)
df
However, this raises the following error:
ValueError: Shape of passed values is (2, 1), indices imply (2, 2)
I then tried:
df = pd.DataFrame(np.array([None, *data], dtype=object)[1:]).T
df
0 1
0 [[1, 2], [1, 3], [1, 1]] [[1, 2, 3], [3, 1, 2], [3, 2, 1]]
This is not my desired output.
I want each inner list in a separate row, like the following:
A B
0 [1, 2] [1, 2, 3]
1 [1, 3] [3, 1, 2]
2 [1, 1] [3, 2, 1]
Not sure how to proceed.
Try:
pd.DataFrame(dict((k,list(v)) for k,v in zip(names, data)))
Output:
A B
0 [1, 2] [1, 2, 3]
1 [1, 3] [3, 1, 2]
2 [1, 1] [3, 2, 1]
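A minimal sketch of why this works, using the data from the question: calling list() on a 2-D array splits it into a list of 1-D row arrays, and pandas stores each of those as a single cell. The variable name rows_a is only for illustration.

```python
import numpy as np
import pandas as pd

data = [np.array([[1, 2], [1, 3], [1, 1]]),
        np.array([[1, 2, 3], [3, 1, 2], [3, 2, 1]])]
names = ['A', 'B']

# list() iterates over the first axis, giving one 1-D array per row
rows_a = list(data[0])

# Each column receives a list of 3 row arrays, so the frame has 3 rows
df = pd.DataFrame(dict((k, list(v)) for k, v in zip(names, data)))

# Every cell holds a NumPy array rather than a Python list
cell = df.loc[0, 'A']
```

Note that with this approach the cells stay NumPy arrays, which may or may not be what you want downstream.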
This is what worked for me: instead of passing the data as nested lists, I passed a dictionary that maps each column name to its values. This way pandas did not expand the data into 3 columns:
data = [np.array([[1, 2],
                  [1, 3],
                  [1, 1]]),
        np.array([[1, 2, 3],
                  [3, 1, 2],
                  [3, 2, 1]])]
names = ['A', 'B']
pd.DataFrame({name:l.tolist() for name,l in zip(names,data)})
Out[5]:
A B
0 [1, 2] [1, 2, 3]
1 [1, 3] [3, 1, 2]
2 [1, 1] [3, 2, 1]
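As a small, self-contained check of the dictionary approach (a sketch, with the imports spelled out): tolist() converts each array into nested Python lists, so every cell of the resulting frame is a plain list.

```python
import numpy as np
import pandas as pd

data = [np.array([[1, 2], [1, 3], [1, 1]]),
        np.array([[1, 2, 3], [3, 1, 2], [3, 2, 1]])]
names = ['A', 'B']

# tolist() turns each 2-D array into a list of Python lists (one per row)
df = pd.DataFrame({name: l.tolist() for name, l in zip(names, data)})

# Cells are ordinary Python lists, so they compare with ==
cell_type = type(df.loc[0, 'A'])
last_b = df.loc[2, 'B']
```

Plain list cells tend to be easier to compare or serialize than array cells, at the cost of losing the NumPy dtype.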
The wrong way:
pd.DataFrame(data)
>>>
0
0 [[1, 2], [1, 3], [1, 1]]
1 [[1, 2, 3], [3, 1, 2], [3, 2, 1]]
# or
pd.DataFrame([l.tolist() for l in data])
>>>
0 1 2
0 [1, 2] [1, 3] [1, 1]
1 [1, 2, 3] [3, 1, 2] [3, 2, 1]
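The second attempt above is close: it puts each array on its own row instead of in its own column. As a sketch, assuming every array has the same number of rows, labelling the rows with names and transposing gives the desired layout.

```python
import numpy as np
import pandas as pd

data = [np.array([[1, 2], [1, 3], [1, 1]]),
        np.array([[1, 2, 3], [3, 1, 2], [3, 2, 1]])]
names = ['A', 'B']

# Each array becomes one row labelled by its name; .T then turns rows into columns
df = pd.DataFrame([l.tolist() for l in data], index=names).T
```

The dictionary form is more direct, so this is mainly useful for seeing how the two layouts relate.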
Let us try concat, which handles each sub-array one by one:
out = pd.concat([pd.Series(list(x)) for x in data], keys=names, axis=1)
A B
0 [1, 2] [1, 2, 3]
1 [1, 3] [3, 1, 2]
2 [1, 1] [3, 2, 1]
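One practical difference shows up when the arrays do not have the same number of rows. The input below (uneven) is a hypothetical variation on the question's data, not something from the original post. In this sketch, concat aligns the Series on their index and fills the gap with NaN, while the dict-of-lists constructor raises a ValueError.

```python
import numpy as np
import pandas as pd

# Hypothetical input: the second array has one row fewer than the first
uneven = [np.array([[1, 2], [1, 3], [1, 1]]),
          np.array([[1, 2, 3], [3, 1, 2]])]
names = ['A', 'B']

# concat aligns on the index, so the missing cell in column B becomes NaN
out = pd.concat([pd.Series(list(x)) for x in uneven], keys=names, axis=1)

# The dict-of-lists constructor rejects columns of unequal length
try:
    pd.DataFrame({k: list(v) for k, v in zip(names, uneven)})
    raised = False
except ValueError:
    raised = True
```

If all arrays share the same number of rows, as in the question, both approaches give the same table.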