
How to Insert a List of Data into Pandas Multi-Index Dataframe

How can I insert this list of data into a pandas DataFrame?

orgdata = ['somestring', data[2], data[3], data[4], data[8], data[9], data[10], data[14], data[15], data[16], data[20], data[21], data[22], data[26], data[27], data[28], data[32], data[33], data[34], data[38], data[39], data[40], data[44], data[45], data[46] ]

where 'data' is another list from which I parse specific values.

I have a list of column names, which is also derived from the 'data' list:

colnames = ['USN', data[0], data[6], data[12], data[18], data[24], data[30], data[36], data[42]]

Now I need three subcolumns under each column, so I do this:

cols = pd.MultiIndex.from_product([colnames, ['IA', 'EX', 'Total']])

But when I try to build a DataFrame from this list like this:

df = pd.DataFrame(orgdata, columns=cols)

I get the following error

ValueError: Wrong number of items passed 1, placement implies 27

I also get this error:

ValueError: Shape of passed values is (1, 25), indices imply (27, 25)

What am I doing wrong? The online documentation doesn't give much insight into this topic.

Are there any other ways to work around this? Any help is appreciated.

Edit:

First I build the 'data' list from the response to a request I made. Here's an instance of the data I received in the response:

data = ['15EC41', 'LIC', '40', '60', 'P']

This is the sort of data I'm working with.

You need to wrap orgdata in brackets and ensure that its length is equal to the number of columns you have, like so:

df = pd.DataFrame([orgdata], columns=cols)

When you create your DataFrame, you pass orgdata as a flat list of 25 values, so pandas treats each value as its own row and the data has only one column (this is where Shape of passed values is (1, 25) comes from). You define colnames as a list of (I'm assuming) strings with length 9, then build your MultiIndex using from_product() with another list of 3 values, which gives 9 x 3 = 27 column labels, hence indices imply (27, 25). Wrapping orgdata in brackets makes each value a column, since each inner list passed to the constructor is interpreted as a single row. Finally, you need either 25 columns to match orgdata, or 27 values inside orgdata.
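To make the row/column point concrete, here is a small sketch comparing the two ways of passing the same list. The name values and its contents are just stand-ins for illustration; they are not from your code.

```python
import pandas as pd

# Stand-in values for illustration only.
values = ['somestring', '15EC41', 'LIC', '40', '60', 'P']

# A flat list is read as a single column: each value becomes its own row.
as_rows = pd.DataFrame(values)
print(as_rows.shape)     # (6, 1)

# A list holding one inner list is read as a single row with six columns.
as_one_row = pd.DataFrame([values])
print(as_one_row.shape)  # (1, 6)
```

Only the second form has as many columns as there are values, which is what the columns argument is matched against.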

Using your sample data, here is a minimal example:

import pandas as pd

data = ['15EC41', 'LIC', '40', '60', 'P']

orgdata = ['somestring', data[0], data[1], data[2], data[3], data[4]]

colnames = ['USN', data[2]]

cols = pd.MultiIndex.from_product([colnames, ['IA', 'EX', 'Total']])

df = pd.DataFrame([orgdata], columns=cols)

Yields:

          USN                40          
           IA      EX Total  IA  EX Total
0  somestring  15EC41   LIC  40  60     P

A slightly more complex example that also sets the index:

import pandas as pd

data1 = ['15EC41', 'LIC', '40', '60', 'P']
data2 = ['62F793', 'DUH', '52', '85', 'O']
data3 = ['9734HJ', 'IAS', '34', '94', 'D']

orgdata = [['somestring', i[0], i[1], i[2], i[3], i[4]] for i in [data1, data2, data3]]

colnames = [data1[0], data1[2]]

cols = pd.MultiIndex.from_product([colnames, ['IA', 'EX', 'Total']])

df = pd.DataFrame(orgdata, columns=cols)

USN = [0, 1, 2]

df.index = USN
df.index.name = 'USN'

Yields:

         15EC41                40          
             IA      EX Total  IA  EX Total
USN                                        
0    somestring  15EC41   LIC  40  60     P
1    somestring  62F793   DUH  52  85     O
2    somestring  9734HJ   IAS  34  94     D

You call DataFrame with orgdata, which has 25 items, so there are at most 25 values to place. The columns argument only specifies labels for the data; it does not reshape it. Since cols has 27 labels, the two do not match.

Can you clarify how you would like to 'insert' the data (and not only the labels)?

Minimal example I used:

import pandas as pd
data = range(50)
# 25 items
orgdata = ['somestring', data[2], data[3], data[4], data[8], data[9], data[10], data[14], data[15], data[16], data[20], data[21], data[22], data[26], data[27], data[28], data[32], data[33], data[34], data[38], data[39], data[40], data[44], data[45], data[46] ]
# 9 items
colnames = ['USN', data[0], data[6], data[12], data[18], data[24], data[30], data[36], data[42]]
# 27 labels (9 names x 3 subcolumns)
cols = pd.MultiIndex.from_product([colnames, ['IA', 'EX', 'Total']])
# raises ValueError: 25 values passed, 27 column labels given
df = pd.DataFrame(orgdata, columns=cols)
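One possible way to reconcile the 25 values with the labels is sketched below. It assumes (this is a guess about the intent, not something stated in the question) that 'USN' should be a single column with no subcolumns, while each of the other eight names gets 'IA', 'EX' and 'Total'. The stand-in strings in data and the names subjects, row and labels are hypothetical.

```python
import pandas as pd

# Stand-in strings; the real 'data' would come from the parsed response.
data = [str(i) for i in range(50)]

# data[0], data[6], ..., data[42]: the eight names that get subcolumns.
subjects = [data[i] for i in range(0, 48, 6)]

# 'somestring' plus data[start+2 : start+5] for each name: 1 + 8*3 = 25 values.
row = ['somestring']
for start in range(0, 48, 6):
    row.extend(data[start + 2:start + 5])

# Assumption: 'USN' is a lone column (empty second level), so there are
# 1 + 8*3 = 25 labels, matching the 25 values.
labels = [('USN', '')] + [(s, sub) for s in subjects
                          for sub in ('IA', 'EX', 'Total')]
cols = pd.MultiIndex.from_tuples(labels)

# Wrap the row in a list so it is read as one row of 25 columns.
df = pd.DataFrame([row], columns=cols)
print(df.shape)  # (1, 25)
```

If 'USN' really does need three subcolumns, the alternative is to supply three values for it so that orgdata has 27 items.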
