How can I insert this list of data into a pandas DataFrame?
orgdata = ['somestring', data[2], data[3], data[4], data[8], data[9], data[10], data[14], data[15], data[16], data[20], data[21], data[22], data[26], data[27], data[28], data[32], data[33], data[34], data[38], data[39], data[40], data[44], data[45], data[46] ]
where 'data' is another list from which I pick specific items.
I have a list of column names which is also derived from the 'data' list
colnames = ['USN', data[0], data[6], data[12], data[18], data[24], data[30], data[36], data[42]]
Now I need to have three subcolumns under each column, so I do this
cols = pd.MultiIndex.from_product([colnames, ['IA', 'EX', 'Total']])
But when I try to insert this list of data into a DataFrame like this
df = pd.DataFrame(orgdata, columns=cols)
I get the following error
ValueError: Wrong number of items passed 1, placement implies 27
I also get this error
ValueError: Shape of passed values is (1, 25), indices imply (27, 25)
What am I doing wrong? The online documentation doesn't give much insight into this topic.
Are there other ways to work around this? Any help is appreciated.
Edit:
First I build the 'data' list from the response to a request I made. Here's an instance of the data I received from the response.
data = ['15EC41', 'LIC', '40', '60', 'P']
This is the sort of data I'm working with.
You need to wrap orgdata in brackets and ensure that its length is equal to the number of columns that you have, like so:
df = pd.DataFrame([orgdata], columns=cols)
When you create your DataFrame, you are passing orgdata as a flat list of 25 values (hence Shape of passed values is (1, 25): one column, 25 rows). You define colnames as (I'm assuming) a list of 9 strings, then build your MultiIndex using from_product() with another list of 3 values, which gives 9 x 3 = 27 column labels (hence indices imply (27, 25)). The 25 stems from the fact that you are passing orgdata to the DataFrame constructor as a single flat list, so it attempts to treat each individual value as its own row. You need to wrap it in brackets so that each value is assigned to a column (each inner list in the constructor is interpreted as a single row). Finally, you either need to have 25 columns to match your orgdata, or pass 27 values inside orgdata.
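To make the row-versus-column behaviour described above concrete, here is a small sketch using a made-up three-item list (the name values and its contents are placeholders, not from the question). It only compares the shapes pandas produces for a flat list and for the same list wrapped in brackets.

```python
import pandas as pd

# Placeholder list, standing in for orgdata
values = ["a", "b", "c"]

# A flat list: each element becomes its own row in a single column
flat = pd.DataFrame(values)

# The same list wrapped in brackets: the whole list becomes one row
wrapped = pd.DataFrame([values])

print(flat.shape)     # (3, 1)
print(wrapped.shape)  # (1, 3)
```

Only the wrapped form can line up with a list of column labels of the same length.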
Using your sample data, here is a minimal example:
import pandas as pd
data = ['15EC41', 'LIC', '40', '60', 'P']
orgdata = ['somestring', data[0], data[1], data[2], data[3], data[4]]
colnames = ['USN', data[2]]
cols = pd.MultiIndex.from_product([colnames, ['IA', 'EX', 'Total']])
df = pd.DataFrame([orgdata], columns=cols)
Yields:
          USN                40
           IA      EX Total  IA  EX Total
0  somestring  15EC41   LIC  40  60     P
A slightly more complex example that also sets the index:
import pandas as pd
data1 = ['15EC41', 'LIC', '40', '60', 'P']
data2 = ['62F793', 'DUH', '52', '85', 'O']
data3 = ['9734HJ', 'IAS', '34', '94', 'D']
orgdata = [['somestring', i[0], i[1], i[2], i[3], i[4]] for i in [data1, data2, data3]]
colnames = [data1[0], data1[2]]
cols = pd.MultiIndex.from_product([colnames, ['IA', 'EX', 'Total']])
df = pd.DataFrame(orgdata, columns=cols)
USN = [0, 1, 2]
df.index = USN
df.index.name = 'USN'
Yields:
         15EC41                40
             IA      EX Total  IA  EX Total
USN
0    somestring  15EC41   LIC  40  60     P
1    somestring  62F793   DUH  52  85     O
2    somestring  9734HJ   IAS  34  94     D
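Once the frame has two-level columns like the one above, values can be read back either by top-level label or by a full (top, sub) tuple. The following is a minimal sketch that rebuilds a two-row version of that frame; the variable names group and totals are illustrative only.

```python
import pandas as pd

# Same column layout as the example above, with two of its rows
cols = pd.MultiIndex.from_product([['15EC41', '40'], ['IA', 'EX', 'Total']])
rows = [
    ['somestring', '15EC41', 'LIC', '40', '60', 'P'],
    ['somestring', '62F793', 'DUH', '52', '85', 'O'],
]
df = pd.DataFrame(rows, columns=cols)

# Selecting a top-level label returns a DataFrame of its sub-columns
group = df['40']

# Selecting a (top, sub) tuple returns a single column as a Series
totals = df[('40', 'Total')]

print(list(group.columns))  # ['IA', 'EX', 'Total']
print(totals.tolist())      # ['P', 'O']
```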
You call DataFrame with orgdata, which is a flat list of 25 items, so pandas builds 25 rows in a single column. The columns argument only specifies the labels for the data; it does not reshape it. Hence the mismatch: columns contains 27 labels, while the data has only one column.
Can you clarify how you would like to 'insert' data (and not only labels)?
Minimal example I used:
import pandas as pd
data = range(50)
# 25 items
orgdata = ['somestring', data[2], data[3], data[4], data[8], data[9], data[10], data[14], data[15], data[16], data[20], data[21], data[22], data[26], data[27], data[28], data[32], data[33], data[34], data[38], data[39], data[40], data[44], data[45], data[46] ]
# 9 items
colnames = ['USN', data[0], data[6], data[12], data[18], data[24], data[30], data[36], data[42]]
# 27 items (9 names x 3 sub-columns)
cols = pd.MultiIndex.from_product([colnames, ['IA', 'EX', 'Total']])
# raises ValueError: one column of data, 27 column labels
df = pd.DataFrame(orgdata, columns=cols)
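One possible way to reconcile the 25 values with the labels, assuming the 'USN' column is meant to hold a single value rather than three sub-columns (the question does not confirm this), is to build the MultiIndex from explicit tuples. In this sketch the placeholder data is a list of strings, the empty string used as the second level under 'USN' is an arbitrary choice, and the list comprehensions simply reproduce the index pattern from the snippet above.

```python
import pandas as pd

# Placeholder data: 50 strings standing in for the parsed response
data = [str(i) for i in range(50)]

# 'somestring' plus data[2:5], data[8:11], ..., data[44:47] -> 25 values
orgdata = ['somestring'] + [data[i + k] for i in range(2, 47, 6) for k in range(3)]

# data[0], data[6], ..., data[42] -> 8 top-level names
subjects = [data[i] for i in range(0, 43, 6)]

# One lone 'USN' column, then three sub-columns per name: 1 + 8 * 3 = 25 labels
cols = pd.MultiIndex.from_tuples(
    [('USN', '')] + [(s, sub) for s in subjects for sub in ['IA', 'EX', 'Total']]
)

# Wrap orgdata in brackets so it is treated as a single row
df = pd.DataFrame([orgdata], columns=cols)
print(df.shape)  # (1, 25)
```

If 'USN' really should have three sub-columns, the alternative is to keep from_product() and supply 27 values per row instead.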