
Splitting pandas table based on the first row in Python

I have a pandas table:

Data   Years  Y
A      2001   3
A      2007   5
A      2002   8
A      2009   1
B      2001   8
B      2004   5
C      2004   4
C      2006   6
C      2005   9

How can I analyze the data for A, B and C separately? For example, how can I plot a histogram of each Data value per year in one plot? Does this call for a pivot table or not?

You can try pivot:

print(df)
  Data  Years  Y
0    A   2001  3
1    A   2007  5
2    A   2002  8
3    A   2009  1
4    B   2001  8
5    B   2004  5
6    C   2004  4
7    C   2006  6
8    C   2005  9

df1 = df.pivot(index='Data', columns='Years', values='Y')
print(df1)
Years  2001  2002  2004  2005  2006  2007  2009
Data                                           
A       3.0   8.0   NaN   NaN   NaN   5.0   1.0
B       8.0   NaN   5.0   NaN   NaN   NaN   NaN
C       NaN   NaN   4.0   9.0   6.0   NaN   NaN
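
To reproduce the output above, the frame can be built by hand. This is a minimal sketch, assuming the column names and values are exactly those in the printed table. The groupby loop at the end is not part of the answer above; it is one alternative way to look at each Data group on its own.

```python
import pandas as pd

# Rebuild the example frame from the printed table (values assumed from it).
df = pd.DataFrame({
    "Data": ["A", "A", "A", "A", "B", "B", "C", "C", "C"],
    "Years": [2001, 2007, 2002, 2009, 2001, 2004, 2004, 2006, 2005],
    "Y": [3, 5, 8, 1, 8, 5, 4, 6, 9],
})

# One row per Data value, one column per year; missing pairs become NaN.
df1 = df.pivot(index="Data", columns="Years", values="Y")
print(df1)

# Alternative: iterate over the groups to handle A, B and C separately.
groups = {name: group["Y"].tolist() for name, group in df.groupby("Data")}
print(groups)
```

Because the pivoted frame has one row per Data value, it is a convenient starting shape for per-group plots or summaries.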

If you need to count the non-NaN values, use notnull to mark them and convert the resulting boolean DataFrame to int with astype:

print(df1.notnull().astype(int))
Years  2001  2002  2004  2005  2006  2007  2009
Data                                           
A         1     1     0     0     0     1     1
B         1     0     1     0     0     0     0
C         0     0     1     1     1     0     0
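
The 0/1 table can then be summed to get actual counts. The sketch below assumes the same example frame as in the question; the variable names present, per_data and per_year are only illustrative.

```python
import pandas as pd

# Same example frame as in the question (values assumed from the printed table).
df = pd.DataFrame({
    "Data": ["A", "A", "A", "A", "B", "B", "C", "C", "C"],
    "Years": [2001, 2007, 2002, 2009, 2001, 2004, 2004, 2006, 2005],
    "Y": [3, 5, 8, 1, 8, 5, 4, 6, 9],
})
df1 = df.pivot(index="Data", columns="Years", values="Y")

# 1 where a value exists, 0 where the pivot produced NaN.
present = df1.notnull().astype(int)

# Sum across columns: number of years with data for each Data value.
per_data = present.sum(axis=1)
# Sum down rows: number of Data values with data in each year.
per_year = present.sum(axis=0)

print(per_data)
print(per_year)
```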

If the same Data and Years combination occurs more than once, pivot fails, so use pivot_table with an aggfunc, e.g. sum. In the frame below, rows 2 and 3 are duplicates:

print(df)
  Data  Years   Y
0    A   2001   3
1    A   2007   5
2    A   2002   8
3    A   2002  10
4    A   2009   1
5    B   2001   8
6    B   2004   5
7    C   2004   4
8    C   2006   6
9    C   2005   9

print(df.pivot_table(index='Data', columns='Years', values='Y', aggfunc='sum'))
Years  2001  2002  2004  2005  2006  2007  2009
Data                                           
A       3.0  18.0   NaN   NaN   NaN   5.0   1.0
B       8.0   NaN   5.0   NaN   NaN   NaN   NaN
C       NaN   NaN   4.0   9.0   6.0   NaN   NaN
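
The difference between pivot and pivot_table on duplicated entries can be checked directly. This is a sketch under the assumption that the frame matches the printed table with the duplicated A/2002 rows. The flag pivot_failed is a hypothetical name used only for illustration.

```python
import pandas as pd

# Example frame with a duplicated (Data, Years) pair: A/2002 appears twice.
df = pd.DataFrame({
    "Data": ["A", "A", "A", "A", "A", "B", "B", "C", "C", "C"],
    "Years": [2001, 2007, 2002, 2002, 2009, 2001, 2004, 2004, 2006, 2005],
    "Y": [3, 5, 8, 10, 1, 8, 5, 4, 6, 9],
})

# pivot cannot decide which of the two A/2002 values to keep, so it raises.
try:
    df.pivot(index="Data", columns="Years", values="Y")
    pivot_failed = False
except ValueError as exc:
    pivot_failed = True
    print("pivot failed:", exc)

# pivot_table aggregates the duplicates instead; here they are summed.
table = df.pivot_table(index="Data", columns="Years", values="Y", aggfunc="sum")
print(table)
```

Other aggregations such as "mean" or "max" can be passed as aggfunc in the same way, depending on how duplicates should be combined.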

