Convert a set of pandas dataframes into a list

I am trying to combine a set of pandas dataframes into a single list.

Here's what I got so far:

import pandas as pd

df1 = pd.DataFrame(data={'col1': [1, 2, 5], 'col2': [3, 4, 4]})

df2 = pd.DataFrame(data={'col3':[1,2,3,4,5], 'col4':[1,2,'NA', 'NA', 'NA'], 'col5':['John', 'Mary', 'Gordon', 'Cynthia', 'Marianne']})

df3 = pd.DataFrame(data={'col6':[19, 25,20, 23]})

#### attempt to convert into a list ####
df_list = list(df1, df2, df3)

Error:

TypeError: list expected at most 1 arguments, got 3

The expected output should let me look up each dataframe by name as an element of the list; for example, print(df_list['df1']) would print the columns and rows of df1.

Is there any way to accomplish this task?

Calling list() is incorrect here because it accepts a single iterable rather than grouping several arguments into a list. You can just use [] instead:

df_list = [df1, df2, df3]

But a list cannot be indexed by name, so you may want a dict:

df_dict = {'df1':df1, 'df2':df2, 'df3':df3}

Then you can do df_dict['df1'].

Note that you cannot programmatically turn the variable names (df1, df2, df3) into the strings used as keys ('df1', 'df2', 'df3'); the keys have to be written out explicitly.
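Because the key strings have to be supplied by hand, one way to keep them next to the frames is to pair a list of names with a list of dataframes. This is only a minimal sketch: the `names` and `frames` variables are illustrative, not part of the question, and it uses two of the question's dataframes for brevity.

```python
import pandas as pd

df1 = pd.DataFrame(data={'col1': [1, 2, 5], 'col2': [3, 4, 4]})
df3 = pd.DataFrame(data={'col6': [19, 25, 20, 23]})

# The names are typed out explicitly; Python does not derive them
# from the variables df1 and df3.
names = ['df1', 'df3']
frames = [df1, df3]

# zip() pairs each name with its dataframe; dict() consumes the pairs.
df_dict = dict(zip(names, frames))

print(list(df_dict))         # keys, in insertion order
print(df_dict['df1'].shape)  # (rows, columns) of df1
```

The two lists must stay in the same order, since zip() matches them position by position.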

It's not possible to use string indices with a list in Python. Lists have integer indices running from 0 up to len(my_list)-1.
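As a quick illustration of that indexing rule, here is a small sketch using two of the question's dataframes (the `first`, `last`, and `message` names are only for the example):

```python
import pandas as pd

df1 = pd.DataFrame(data={'col1': [1, 2, 5], 'col2': [3, 4, 4]})
df3 = pd.DataFrame(data={'col6': [19, 25, 20, 23]})

df_list = [df1, df3]

# Valid positions run from 0 to len(df_list) - 1.
first = df_list[0]
last = df_list[len(df_list) - 1]  # equivalent to df_list[-1]

# A string index raises TypeError rather than looking anything up.
try:
    df_list['df1']
except TypeError as exc:
    message = str(exc)
    print(message)
```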

If you do want to call list() itself, it takes a single iterable argument:

>>> help(list)

class list(object)                                                 
 |  list() -> new empty list                                       
 |  list(iterable) -> new list initialized from iterable's items   

So you could construct a tuple and pass that to list(), like:

>>> my_list = list((df1, df2, df3))
>>> type(my_list) 
<class 'list'>
>>> my_list[0]
... df1 outputs here ... 

But a simpler and cleaner way to do it is to use square-bracket notation:

>>> my_list = [df1, df2, df3]
>>> type(my_list)
<class 'list'>

However, if you want to use string indices, consider using a dictionary, i.e. the dict class:

>>> help(dict)

class dict(object)                                                             
 |  dict() -> new empty dictionary                                             
 |  dict(mapping) -> new dictionary initialized from a mapping object's        
 |      (key, value) pairs                                                     
 |  dict(iterable) -> new dictionary initialized as if via:                    
 |      d = {}                                                                 
 |      for k, v in iterable:                                                  
 |          d[k] = v                                                           
 |  dict(**kwargs) -> new dictionary initialized with the name=value pairs     
 |      in the keyword argument list.  For example:  dict(one=1, two=2)        
 |                                                                             
 |  Methods defined here:                                                      
 |          

Calling the dict() class directly, you'd pass a single iterable of (key, value) pairs, something like this:

>>> all_dataframes = dict([("df1", df1), ("df2", df2), ("df3", df3)])
>>> type(all_dataframes)
<class 'dict'>
>>> all_dataframes["df1"]
... df1 output prints here ...

But the simpler and clearer method would be:

>>> all_dataframes = {"df1": df1, "df2": df2, "df3": df3}
>>> type(all_dataframes)
<class 'dict'>
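To tie this back to the dict() signatures in the help text, the following sketch builds the same mapping three ways and then loops over it. The variable names `from_pairs`, `from_kwargs`, and `from_literal` are made up for the example, and the keyword form only works here because 'df1' and 'df3' happen to be valid Python identifiers.

```python
import pandas as pd

df1 = pd.DataFrame(data={'col1': [1, 2, 5], 'col2': [3, 4, 4]})
df3 = pd.DataFrame(data={'col6': [19, 25, 20, 23]})

# dict(iterable): one iterable of (key, value) pairs.
from_pairs = dict([("df1", df1), ("df3", df3)])

# dict(**kwargs): keys are given as keyword names.
from_kwargs = dict(df1=df1, df3=df3)

# Dictionary literal: usually the most readable form.
from_literal = {"df1": df1, "df3": df3}

# Each key maps to the same dataframe object in all three dicts.
for name, frame in from_literal.items():
    print(name, frame.shape)
```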

