
Slicing concatenated dataframe in pandas

I created a DataFrame by importing multiple text files and concatenating them into a single DataFrame, using the following code:

import pandas as pd

frames = []
for filename in allfiles:
    df = pd.read_csv(filename, index_col=None, header=0,
                     delim_whitespace=True, skipfooter=1, engine='python')
    frames.append(df)

# keys=range(len(allfiles)) labels each file's rows with its position, creating a MultiIndex
dat = pd.concat(frames, axis=0, keys=range(len(allfiles)))
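As a side note, allfiles above is assumed to already hold the paths of the text files; a common way to build such a list (the pattern here is only an illustrative assumption) is with glob:

import glob

# hypothetical location and extension; adjust to wherever your text files live
allfiles = sorted(glob.glob('data/*.txt'))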

I now want to create n arrays, where the n-th array collects the n-th element of the second column from each DataFrame inside the larger one. In other words, I want more or less a transpose of that column, so that row n holds all the values found at position n in the second column across the different data files.

I tried slicing the DataFrame dat with .loc and .iloc in the following way:

dat.iloc[:,2,n]

but this raises an indexing error, because a DataFrame has only two axes and .iloc cannot take a third indexer.
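If the goal is just the n-th row of column 'b' from every file, one way (a sketch under the assumption that dat is built as above, with the file number in level 0 of the MultiIndex) is to take a cross-section on level 1 instead of passing a third indexer:

n = 1
# all rows whose level-1 label is n, keeping only column 'b'
dat.xs(n, level=1)['b']
# the same selection with .loc, slicing over level 0
dat.loc[(slice(None), n), 'b']

With the example data below and n = 1, this returns 2.5, 6.3 and 0.3, i.e. the values of l_2.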

Here is a short example of dat:

|   |   | a   | b   | c   |  
|---|---|-----|-----|-----|  
| 0 | 0 | 0.1 | 5.3 | 7.2 |  
|   | 1 | 3.2 | 2.5 | 5.4 |  
|   | 2 | 0.3 | 0.5 | 6.2 |  
| 1 | 0 | 6.7 | 4.5 | 7.2 |  
|   | 1 | 9.4 | 6.3 | 5.7 |  
|   | 2 | 6.4 | 4.5 | 6.7 |  
| 2 | 0 | 3.4 | 5.6 | 0.5 |  
|   | 1 | 1.9 | 0.3 | 1.2 |  
|   | 2 | 0.4 | 0.7 | 2.6 |

In the end I would like to obtain arrays of the form:
l_1 = [5.3, 4.5, 5.6], l_2 = [2.5, 6.3, 0.3], l_3 = [0.5, 4.5, 0.7]

dat.groupby(level=1)['b'].apply(list)

0    [5.3, 4.5, 5.6]
1    [2.5, 6.3, 0.3]
2    [0.5, 4.5, 0.7]

You can group by level 1 of the index, select column 'b', and turn each group into a list.
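If you need proper NumPy arrays (one per row position) rather than a Series of lists, a minimal sketch of an alternative is to unstack level 0 of the index so each row gathers the values from all files; the names wide and arrays below are just illustrative:

# columns become the file number (level 0), the index becomes the row position (level 1)
wide = dat['b'].unstack(level=0)

# one array per row position, matching l_1, l_2, l_3 from the question
arrays = [row.to_numpy() for _, row in wide.iterrows()]
# arrays[0] -> array([5.3, 4.5, 5.6])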
